git@vger.kernel.org mailing list mirror (one of many)
* RE: grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f)
From: D. Ben Knoble @ 2023-02-01 15:18 UTC
  To: git

I recently updated to git 2.39.1 and noticed today that `git diff
--word-diff` fails for files with `diff=scheme`. I was able to narrow
the failure down to the inclusion of the raw non-ASCII bytes \xc0, \xff,
\x80, \xbf by https://github.com/git/git/blob/2fc9e9ca3c7505bc60069f11e7ef09b1aeeee473/userdiff.c#L17
in the definition of the scheme diff pattern (really, all patterns).
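
For illustration, here is a minimal Python sketch of what a byte-range
expression of this shape matches when it is applied byte by byte: a UTF-8
lead byte followed by one or more continuation bytes. It uses Python's
`re` module on bytes, not the macOS regex library that git calls, so it
only shows the intent of the pattern, not the failure. The names
`UTF8_SEQ` and `multibyte_sequences` are hypothetical.

```python
import re

# Same byte ranges as in the reproduction command further down,
# compiled as a *bytes* pattern so each range endpoint is a single byte.
UTF8_SEQ = re.compile(rb"[\xc0-\xff][\x80-\xbf]+")


def multibyte_sequences(text: str) -> list[bytes]:
    """Return every lead-byte + continuation-bytes run in the UTF-8 encoding of text."""
    return UTF8_SEQ.findall(text.encode("utf-8"))


if __name__ == "__main__":
    # Each non-ASCII character comes back as one whole multibyte sequence.
    print(multibyte_sequences("café λ"))
    # Pure ASCII input contains no such sequences.
    print(multibyte_sequences("(define x 1)"))
```

Treating the pattern as bytes is the key design choice here: the range
endpoints are not characters on their own in UTF-8, so a regex engine that
interprets the pattern as UTF-8 text may behave differently.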

I suspect the commit referenced in the subject, given that it messes
with regex handling on macOS.

Relevant environment that I can think of:
```
# locale
LANG="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_CTYPE="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_ALL="fr_FR.UTF-8"
```

I'm on macOS 11.7.

Failure (using Zsh to produce the characters; I think there's a Bash
equivalent):
```
# git diff --word-diff --word-diff-regex=$'[\xc0-\xff][\x80-\xbf]+'
fatal¬†: invalid regular expression: [¿-ˇ][Ä-ø]+
```
(Looks like the output is a bit scrambled; here's the hexdump)
```
# !! |& xxd
00000000: 6661 7461 6cc2 a03a 2069 6e76 616c 6964  fatal..: invalid
00000010: 2072 6567 756c 6172 2065 7870 7265 7373   regular express
00000020: 696f 6e3a 205b c02d ff5d 5b80 2dbf 5d2b  ion: [.-.][.-.]+
00000030: 0a                                       .
```
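
The hexdump can be checked with a short Python sketch. It rebuilds the
bytes from the hex columns above, extracts the pattern that git echoed
back, and confirms that it is not valid UTF-8 on its own. It also decodes
the whole message as MacRoman, which reproduces the scrambled glyphs shown
earlier; that the terminal or mail client used MacRoman is an assumption on
my part, not something the report states. The helper name `is_valid_utf8`
is hypothetical.

```python
# Hex columns copied verbatim from the xxd output above.
HEX_DUMP = """
6661 7461 6cc2 a03a 2069 6e76 616c 6964
2072 6567 756c 6172 2065 7870 7265 7373
696f 6e3a 205b c02d ff5d 5b80 2dbf 5d2b
0a
"""

# Strip all whitespace before converting so this works on any Python 3.
raw = bytes.fromhex("".join(HEX_DUMP.split()))

# The pattern is everything after the last ": " in the error message.
pattern = raw.rstrip(b"\n").rsplit(b": ", 1)[1]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumption: the scrambled display came from reading the bytes as MacRoman.
as_mac_roman = raw.decode("mac_roman")

if __name__ == "__main__":
    print(pattern)                 # the raw bytes git reported as invalid
    print(is_valid_utf8(pattern))  # whether those bytes form valid UTF-8
    print(as_mac_roman, end="")    # how the message looks under MacRoman
```

The `c2 a0` pair after "fatal" is a UTF-8 no-break space, which is
consistent with French typography placing one before the colon.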

-- 
D. Ben Knoble

Thread overview: 22+ messages
2023-02-01 15:18 grep: fix multibyte regex handling under macOS (1819ad327b7a1f19540a819813b70a0e8a7f798f) D. Ben Knoble
2023-02-01 16:09 ` demerphq
2023-02-01 16:21   ` D. Ben Knoble
2023-02-01 18:23     ` demerphq
2023-02-01 18:54       ` Junio C Hamano
2023-02-01 21:33         ` D. Ben Knoble
2023-02-01 21:34           ` D. Ben Knoble
2023-02-01 22:15           ` Junio C Hamano
2023-02-01 23:03   ` Jeff King
2023-02-02 16:22     ` demerphq
2023-02-02 20:49       ` D. Ben Knoble
2023-02-03 17:01       ` Jeff King
2023-02-03 21:56         ` Ævar Arnfjörð Bjarmason
2023-02-04 11:17           ` Jeff King
2023-02-04 11:32         ` demerphq
2023-02-05 19:51           ` D. Ben Knoble
2023-02-07 18:23             ` Jeff King
2023-02-07 22:27               ` D. Ben Knoble
2023-02-07 18:19           ` Jeff King
2023-02-02 20:47     ` D. Ben Knoble
2023-02-03 16:55       ` Jeff King
2023-02-03 17:06         ` D. Ben Knoble
