git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "A. Wilcox" <awilfox@adelielinux.org>
Cc: git@vger.kernel.org, Andreas Schwab <schwab@linux-m68k.org>
Subject: [PATCH] t7810: avoid assumption about invalid regex syntax
Date: Wed, 11 Jan 2017 06:10:55 -0500	[thread overview]
Message-ID: <20170111111055.j3hgijpaabvy6kyg@sigill.intra.peff.net> (raw)
In-Reply-To: <20170111100400.vhd5ytarqpujigbn@sigill.intra.peff.net>

A few of the tests want to check that "git grep -P -E" will
override -P with -E, and vice versa. To do so, we use a
regex with "\x{..}", which is valid in PCRE but not defined
by POSIX (for basic or extended regular expressions).

However, POSIX declares quite a lot of syntax, including
"\x", as "undefined". That leaves implementations free to
extend the standard if they choose. At least one, musl libc,
implements "\x" in the same way as PCRE.  Our tests check
that "-E" complains about "\x", which fails with musl.

We can fix this by finding some construct which behaves
reliably on both PCRE and POSIX, but differently in each
system.

One such construct is the use of backslash inside brackets.
In PCRE, "[\d]" interprets "\d" as it would outside the
brackets, matching a digit. Whereas in POSIX, the backslash
must be treated literally, and we match either it or a
literal "d".  Moreover, implementations are not free to
change this according to POSIX, so we should be able to rely
on it.

Signed-off-by: Jeff King <peff@peff.net>
---
I've tested this with glibc, but I wasn't able to do so with musl. The
two complications are:

  1. Recent versions of git won't build with musl's regex at all,
     because it doesn't support the non-standard REG_STARTEND that we
     rely on since b7d36ffca (regex: use regexec_buf(), 2016-09-21).

     So if applied on an older git, this patch should help, but newer
     versions need NO_REGEX (to use the fallback glibc regex code)
     either way, which would also make the problem go away.

     Still, I think it's the right thing to do, since we are relying on
     something that POSIX clearly leaves up to the implementation. It
     may also help on other systems, or if musl ends up supporting
     REG_STARTEND in the future.

  2. I tried to cherry-pick to v2.7.x and test it with musl. Debian
     ships with a "musl-gcc" wrapper, but it doesn't work out of the
     box. Both zlib and pcre are compiled against glibc, so I'd have to
     rebuild those, too. At which point I gave up and decided to just
     let you test it on your musl-based system. :)

 t/t7810-grep.sh | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index de2405ccb..19f0108f8 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -39,6 +39,10 @@ test_expect_success setup '
 		echo "a+bc"
 		echo "abc"
 	} >ab &&
+	{
+		echo d &&
+		echo 0
+	} >d0 &&
 	echo vvv >v &&
 	echo ww w >w &&
 	echo x x xx x >x &&
@@ -1105,36 +1109,36 @@ test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =extended
 '
 
 test_expect_success 'grep -G -F -P -E pattern' '
-	>empty &&
-	test_must_fail git grep -G -F -P -E "a\x{2b}b\x{2a}c" ab >actual &&
-	test_cmp empty actual
+	echo "d0:d" >expected &&
+	git grep -G -F -P -E "[\d]" d0 >actual &&
+	test_cmp expected actual
 '
 
 test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =perl, =extended' '
-	>empty &&
-	test_must_fail git \
+	echo "d0:d" >expected &&
+	git \
 		-c grep.patterntype=fixed \
 		-c grep.patterntype=basic \
 		-c grep.patterntype=perl \
 		-c grep.patterntype=extended \
-		grep "a\x{2b}b\x{2a}c" ab >actual &&
-	test_cmp empty actual
+		grep "[\d]" d0 >actual &&
+	test_cmp expected actual
 '
 
 test_expect_success LIBPCRE 'grep -G -F -E -P pattern' '
-	echo "ab:a+b*c" >expected &&
-	git grep -G -F -E -P "a\x{2b}b\x{2a}c" ab >actual &&
+	echo "d0:0" >expected &&
+	git grep -G -F -E -P "[\d]" d0 >actual &&
 	test_cmp expected actual
 '
 
 test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
-	echo "ab:a+b*c" >expected &&
+	echo "d0:0" >expected &&
 	git \
 		-c grep.patterntype=fixed \
 		-c grep.patterntype=basic \
 		-c grep.patterntype=extended \
 		-c grep.patterntype=perl \
-		grep "a\x{2b}b\x{2a}c" ab >actual &&
+		grep "[\d]" d0 >actual &&
 	test_cmp expected actual
 '
 
-- 
2.11.0.627.gfa6151259

  reply	other threads:[~2017-01-11 11:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-01  4:59 Test failures when Git is built with libpcre and grep is built without it A. Wilcox
2017-01-01 10:19 ` Torsten Bögershausen
2017-01-02  6:53 ` Jeff King
2017-01-09 10:51   ` A. Wilcox
2017-01-09 11:27     ` Jeff King
2017-01-09 13:05     ` Andreas Schwab
2017-01-09 21:33       ` Jeff King
2017-01-10 10:36         ` A. Wilcox
2017-01-10 11:40           ` [musl] " Szabolcs Nagy
2017-01-11 10:04             ` Jeff King
2017-01-11 11:10               ` Jeff King [this message]
2017-01-11 20:49               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170111111055.j3hgijpaabvy6kyg@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=awilfox@adelielinux.org \
    --cc=git@vger.kernel.org \
    --cc=schwab@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).