From: Jeff King <peff@peff.net>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Andrei Rybak <rybak.a.v@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] mailinfo: support Unicode scissors
Date: Mon, 1 Apr 2019 06:11:57 -0400
Message-ID: <20190401101156.GA1131@sigill.intra.peff.net>
In-Reply-To: <20190331230947.GI32732@szeder.dev>
On Mon, Apr 01, 2019 at 01:09:47AM +0200, SZEDER Gábor wrote:
> On Mon, Apr 01, 2019 at 12:01:04AM +0200, Andrei Rybak wrote:
> > diff --git a/mailinfo.c b/mailinfo.c
> > index b395adbdf2..4ef6cdee85 100644
> > --- a/mailinfo.c
> > +++ b/mailinfo.c
> > @@ -701,6 +701,13 @@ static int is_scissors_line(const char *line)
> >  			c++;
> >  			continue;
> >  		}
> > +		if (!memcmp(c, "✂", 3)) {
>
> This character is tiny. Please add a comment that it's supposed to be
> a Unicode scissors character.
I think it might also be the first raw UTF-8 character in our source,
which is otherwise ASCII. Usually we'd spell out the binary (with a
comment).
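
To illustrate what spelling out the bytes could look like: U+2702 (BLACK SCISSORS) encodes in UTF-8 as the three bytes 0xE2 0x9C 0x82, so the source file can stay pure ASCII. This is only a sketch; the macro and helper names are hypothetical and not part of the patch.

```c
#include <string.h>

/* U+2702 BLACK SCISSORS as its three UTF-8 bytes (hypothetical macro name) */
#define UTF8_SCISSORS "\xE2\x9C\x82"

/* Hypothetical helper: does the NUL-terminated string s begin with the scissors? */
static int has_scissors_prefix(const char *s)
{
	/* strncmp() stops at the first NUL, so a short string is not over-read */
	return !strncmp(s, UTF8_SCISSORS, strlen(UTF8_SCISSORS));
}
```

Keeping the escaped bytes behind a commented name also addresses the readability concern about the glyph being hard to see in the source.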
I think I agree with Junio's response, though, that this is probably not
a road we want to go down, unless this micro-format is being actively
used in the wild (I have no idea, but I have never seen it).
> Should we worry about this memcmp() potentially reading past the end
> of the string when 'c' points to the last character?
I also wondered if the existing memcmps for ">8", etc, would have this
problem. They don't, but it's somewhat subtle. They are only 2
characters long, and the outer loop guarantees we have at least 1
character. So at most we will look at the NUL. But obviously a 3-byte
sequence like this may invoke undefined behavior, and the existing
memcmps encourage anybody adding code to do it wrong.
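
To make the bound concrete: when c points at the last character before the NUL, exactly two bytes (that character and the terminator) are known to belong to the string. A small sketch of that arithmetic follows; both helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: number of bytes that may be read starting at c,
 * counting the terminating NUL.
 */
static size_t readable_bytes(const char *c)
{
	return strlen(c) + 1;
}

/* Is an n-byte memcmp() starting at c guaranteed to stay inside the string? */
static int memcmp_in_bounds(const char *c, size_t n)
{
	return n <= readable_bytes(c);
}
```

With c pointing at a final "8", memcmp_in_bounds(c, 2) holds but memcmp_in_bounds(c, 3) does not, which is the difference between the existing 2-byte checks and a 3-byte one.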
I wonder if it's worth re-writing it like:
diff --git a/mailinfo.c b/mailinfo.c
index b395adbdf2..46b1b2a4a8 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -693,8 +693,8 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
-		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
+		if ((starts_with(c, ">8") || starts_with(c, "8<") ||
+		     starts_with(c, ">%") || starts_with(c, "%<"))) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
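
For context, a NUL-aware prefix check in the spirit of starts_with() can be sketched as below. The name prefix_matches is hypothetical and this is an illustration, not a copy of the helper in git's tree. The point is that the loop stops at the first mismatch or at the end of the prefix, so it never reads past the NUL of str, whatever the prefix length.

```c
/* Hypothetical stand-in for a starts_with()-style helper */
static int prefix_matches(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++) {
		/* A NUL in str differs from any non-NUL prefix byte, so we stop here */
		if (*str != *prefix)
			return 0;
	}
	return 1;
}
```

Unlike a fixed-length memcmp(), this stays safe for a 3-byte pattern even when str holds a single character.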
-Peff
Thread overview: 16+ messages
2019-03-31 22:01 [PATCH] mailinfo: support Unicode scissors Andrei Rybak
2019-03-31 23:09 ` SZEDER Gábor
2019-04-01 9:07 ` Junio C Hamano
2019-04-01 10:11 ` Jeff King [this message]
2019-04-01 9:27 ` Duy Nguyen
2019-04-01 21:54 ` Andrei Rybak
2019-04-01 21:53 ` [PATCH v2 1/2] mailinfo: use starts_with() for clarity Andrei Rybak
2019-04-01 21:53 ` [PATCH v2 2/2] mailinfo: support Unicode scissors Andrei Rybak
2019-04-02 14:36 ` Jeff King
2019-04-03 6:47 ` Junio C Hamano
2019-04-02 14:28 ` [PATCH v2 1/2] mailinfo: use starts_with() for clarity Jeff King
2021-06-08 20:48 ` [PATCH] mailinfo: use starts_with() when checking scissors Andrei Rybak
2021-06-08 21:57 ` Jeff King
2021-06-09 2:22 ` Junio C Hamano
2021-06-09 3:59 ` Jeff King
2021-06-09 5:11 ` Junio C Hamano