git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support
@ 2017-04-25 21:05 Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 01/19] grep: amend submodule recursion test in preparation for rx engine testing Ævar Arnfjörð Bjarmason
                   ` (18 more replies)
  0 siblings, 19 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Trivial changes since last time. Just sending this because I'd like
the copy in pu updated. Changes noted below:

Ævar Arnfjörð Bjarmason (19):
  grep: amend submodule recursion test in preparation for rx engine
    testing
  grep: add tests for grep pattern types being passed to submodules

A s/PCRE/LIBPCRE/ on the test_have_prereq, now makes sense with the
series in sequence (error added during rebasing).

  grep: submodule-related case statements should die if new fields are
    added
  grep: remove redundant regflags assignment under PCRE
  grep: remove redundant `regflags &= ~REG_EXTENDED` assignments

NEW: Similarly to how we didn't need to set regflags under PCRE, we
were negating REG_EXTENDED under POSIX basic, without ever setting it
in the first place.

This was just as confusing as the PCRE oddity, so remove it.

  Makefile & configure: reword outdated comment about PCRE
  grep: add a test for backreferences in PCRE patterns
  log: add exhaustive tests for pattern style options & config
  log: add -P as a synonym for --perl-regexp
  grep & rev-list doc: stop promising libpcre for --perl-regexp
  grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
  test-lib: rename the LIBPCRE prerequisite to PCRE

One corresponding s/LIBPCRE/PCRE/ for the earlier change, see above.

  grep: change the internal PCRE macro names to be PCRE1
  grep: change the internal PCRE code & header names to be PCRE1
  perf: add a performance comparison test of grep -E and -P
  grep: add support for the PCRE v1 JIT API
  grep: add support for PCRE v2

We now give proper error messages via pcre2_get_error_message() when
pcre2_match() fails with errors other than "didn't match", the common
case for this is that the engine gave up on a pathological pattern /
input combination.

  grep: remove support for concurrent use of both PCRE v1 & v2
  Makefile & configure: make PCRE v2 the default PCRE implementation

Added more details to the commit message about why switching to PCRE
v2 by default is a good idea. I hadn't noticed before that deep bugs
in PCRE v1 are being WONTFIX'd on the bugtracker saying "nope, never
fixing thath in v1, switch to v2".

 Documentation/git-grep.txt         |   7 +-
 Documentation/rev-list-options.txt |   9 +-
 Makefile                           |  39 +++++--
 builtin/grep.c                     |   4 +
 configure.ac                       |  81 ++++++++++++--
 grep.c                             | 222 +++++++++++++++++++++++++++++++------
 grep.h                             |  32 +++++-
 revision.c                         |   2 +-
 t/README                           |   4 +-
 t/perf/p7820-grep-engines.sh       |  25 +++++
 t/t4202-log.sh                     |  86 +++++++++++++-
 t/t7810-grep.sh                    |  41 ++++---
 t/t7812-grep-icase-non-ascii.sh    |   4 +-
 t/t7813-grep-icase-iso.sh          |  11 +-
 t/t7814-grep-recurse-submodules.sh | 215 +++++++++++++++++++++--------------
 t/test-lib.sh                      |   3 +-
 16 files changed, 613 insertions(+), 172 deletions(-)
 create mode 100755 t/perf/p7820-grep-engines.sh

-- 
2.11.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 01/19] grep: amend submodule recursion test in preparation for rx engine testing
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 02/19] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Amend the submodule recursion test added in commit 0281e487fd ("grep:
optionally recurse into submodules", 2016-12-16) to prepare it for
subsequent tests of whether it passes along the grep.patternType to
the submodule greps.

This is just the result of searching & replacing:

    foobar -> (1|2)d(3|4)
    foo    -> (1|2)
    bar    -> (3|4)

Currently there's no tests for whether e.g. -P or -E is correctly
passed along, tests for that will be added in a follow-up change, but
first add content to the tests which will match differently under
different regex engines.

Reuse the pattern established in an earlier commit of mine in this
series ("log: add exhaustive tests for pattern style options &
config", 2017-04-07). The pattern "(.|.)[\d]" will match this content
differently under fixed/basic/extended & perl.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7814-grep-recurse-submodules.sh | 166 ++++++++++++++++++-------------------
 1 file changed, 83 insertions(+), 83 deletions(-)

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 5b6eb3a65e..3c580b38ba 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -9,13 +9,13 @@ submodules.
 . ./test-lib.sh
 
 test_expect_success 'setup directory structure and submodule' '
-	echo "foobar" >a &&
+	echo "(1|2)d(3|4)" >a &&
 	mkdir b &&
-	echo "bar" >b/b &&
+	echo "(3|4)" >b/b &&
 	git add a b &&
 	git commit -m "add a and b" &&
 	git init submodule &&
-	echo "foobar" >submodule/a &&
+	echo "(1|2)d(3|4)" >submodule/a &&
 	git -C submodule add a &&
 	git -C submodule commit -m "add a" &&
 	git submodule add ./submodule &&
@@ -24,18 +24,18 @@ test_expect_success 'setup directory structure and submodule' '
 
 test_expect_success 'grep correctly finds patterns in a submodule' '
 	cat >expect <<-\EOF &&
-	a:foobar
-	b/b:bar
-	submodule/a:foobar
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and basic pathspecs' '
 	cat >expect <<-\EOF &&
-	submodule/a:foobar
+	submodule/a:(1|2)d(3|4)
 	EOF
 
 	git grep -e. --recurse-submodules -- submodule >actual &&
@@ -44,7 +44,7 @@ test_expect_success 'grep and basic pathspecs' '
 
 test_expect_success 'grep and nested submodules' '
 	git init submodule/sub &&
-	echo "foobar" >submodule/sub/a &&
+	echo "(1|2)d(3|4)" >submodule/sub/a &&
 	git -C submodule/sub add a &&
 	git -C submodule/sub commit -m "add a" &&
 	git -C submodule submodule add ./sub &&
@@ -54,117 +54,117 @@ test_expect_success 'grep and nested submodules' '
 	git commit -m "updated submodule" &&
 
 	cat >expect <<-\EOF &&
-	a:foobar
-	b/b:bar
-	submodule/a:foobar
-	submodule/sub/a:foobar
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
 	cat >expect <<-\EOF &&
-	a:foobar
-	submodule/a:foobar
-	submodule/sub/a:foobar
+	a:(1|2)d(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --and -e "foo" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --and -e "(1|2)d" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
 	cat >expect <<-\EOF &&
-	b/b:bar
+	b/b:(3|4)
 	EOF
 
-	git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --and --not -e "(1|2)d" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'basic grep tree' '
 	cat >expect <<-\EOF &&
-	HEAD:a:foobar
-	HEAD:b/b:bar
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:a:(1|2)d(3|4)
+	HEAD:b/b:(3|4)
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree HEAD^' '
 	cat >expect <<-\EOF &&
-	HEAD^:a:foobar
-	HEAD^:b/b:bar
-	HEAD^:submodule/a:foobar
+	HEAD^:a:(1|2)d(3|4)
+	HEAD^:b/b:(3|4)
+	HEAD^:submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD^ >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree HEAD^^' '
 	cat >expect <<-\EOF &&
-	HEAD^^:a:foobar
-	HEAD^^:b/b:bar
+	HEAD^^:a:(1|2)d(3|4)
+	HEAD^^:b/b:(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD^^ >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD^^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- submodule >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- submodule >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodule*a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodule*a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and more pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodul?/a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodul?/a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and more pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodul*/sub/a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodul*/sub/a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success !MINGW 'grep recurse submodule colon in name' '
 	git init parent &&
 	test_when_finished "rm -rf parent" &&
-	echo "foobar" >"parent/fi:le" &&
+	echo "(1|2)d(3|4)" >"parent/fi:le" &&
 	git -C parent add "fi:le" &&
 	git -C parent commit -m "add fi:le" &&
 
 	git init "su:b" &&
 	test_when_finished "rm -rf su:b" &&
-	echo "foobar" >"su:b/fi:le" &&
+	echo "(1|2)d(3|4)" >"su:b/fi:le" &&
 	git -C "su:b" add "fi:le" &&
 	git -C "su:b" commit -m "add fi:le" &&
 
@@ -172,30 +172,30 @@ test_expect_success !MINGW 'grep recurse submodule colon in name' '
 	git -C parent commit -m "add submodule" &&
 
 	cat >expect <<-\EOF &&
-	fi:le:foobar
-	su:b/fi:le:foobar
+	fi:le:(1|2)d(3|4)
+	su:b/fi:le:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	cat >expect <<-\EOF &&
-	HEAD:fi:le:foobar
-	HEAD:su:b/fi:le:foobar
+	HEAD:fi:le:(1|2)d(3|4)
+	HEAD:su:b/fi:le:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules HEAD >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules HEAD >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep history with moved submoules' '
 	git init parent &&
 	test_when_finished "rm -rf parent" &&
-	echo "foobar" >parent/file &&
+	echo "(1|2)d(3|4)" >parent/file &&
 	git -C parent add file &&
 	git -C parent commit -m "add file" &&
 
 	git init sub &&
 	test_when_finished "rm -rf sub" &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
@@ -203,82 +203,82 @@ test_expect_success 'grep history with moved submoules' '
 	git -C parent commit -m "add submodule" &&
 
 	cat >expect <<-\EOF &&
-	dir/sub/file:foobar
-	file:foobar
+	dir/sub/file:(1|2)d(3|4)
+	file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	git -C parent mv dir/sub sub-moved &&
 	git -C parent commit -m "moved submodule" &&
 
 	cat >expect <<-\EOF &&
-	file:foobar
-	sub-moved/file:foobar
+	file:(1|2)d(3|4)
+	sub-moved/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	cat >expect <<-\EOF &&
-	HEAD^:dir/sub/file:foobar
-	HEAD^:file:foobar
+	HEAD^:dir/sub/file:(1|2)d(3|4)
+	HEAD^:file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules HEAD^ >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules HEAD^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep using relative path' '
 	test_when_finished "rm -rf parent sub" &&
 	git init sub &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
 	git init parent &&
-	echo "foobar" >parent/file &&
+	echo "(1|2)d(3|4)" >parent/file &&
 	git -C parent add file &&
 	mkdir parent/src &&
-	echo "foobar" >parent/src/file2 &&
+	echo "(1|2)d(3|4)" >parent/src/file2 &&
 	git -C parent add src/file2 &&
 	git -C parent submodule add ../sub &&
 	git -C parent commit -m "add files and submodule" &&
 
 	# From top works
 	cat >expect <<-\EOF &&
-	file:foobar
-	src/file2:foobar
-	sub/file:foobar
+	file:(1|2)d(3|4)
+	src/file2:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual &&
 
 	# Relative path to top
 	cat >expect <<-\EOF &&
-	../file:foobar
-	file2:foobar
-	../sub/file:foobar
+	../file:(1|2)d(3|4)
+	file2:(1|2)d(3|4)
+	../sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" -- .. >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" -- .. >actual &&
 	test_cmp expect actual &&
 
 	# Relative path to submodule
 	cat >expect <<-\EOF &&
-	../sub/file:foobar
+	../sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" -- ../sub >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" -- ../sub >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep from a subdir' '
 	test_when_finished "rm -rf parent sub" &&
 	git init sub &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
 	git init parent &&
 	mkdir parent/src &&
-	echo "foobar" >parent/src/file &&
+	echo "(1|2)d(3|4)" >parent/src/file &&
 	git -C parent add src/file &&
 	git -C parent submodule add ../sub src/sub &&
 	git -C parent submodule add ../sub sub &&
@@ -286,19 +286,19 @@ test_expect_success 'grep from a subdir' '
 
 	# Verify grep from root works
 	cat >expect <<-\EOF &&
-	src/file:foobar
-	src/sub/file:foobar
-	sub/file:foobar
+	src/file:(1|2)d(3|4)
+	src/sub/file:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual &&
 
 	# Verify grep from a subdir works
 	cat >expect <<-\EOF &&
-	file:foobar
-	sub/file:foobar
+	file:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual
 '
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 02/19] grep: add tests for grep pattern types being passed to submodules
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 01/19] grep: amend submodule recursion test in preparation for rx engine testing Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 03/19] grep: submodule-related case statements should die if new fields are added Ævar Arnfjörð Bjarmason
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Add testing for grep pattern types being correctly passed to
submodules. The pattern "(.|.)[\d]" matches differently under
fixed (not at all), and then matches different lines under
basic/extended & perl regular expressions, so this change asserts that
the pattern type is passed along correctly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7814-grep-recurse-submodules.sh | 49 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 3c580b38ba..e1822feaf1 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -313,4 +313,53 @@ test_incompatible_with_recurse_submodules ()
 test_incompatible_with_recurse_submodules --untracked
 test_incompatible_with_recurse_submodules --no-index
 
+test_expect_success 'grep --recurse-submodules should pass the pattern type along' '
+	# Fixed
+	test_must_fail git grep -F --recurse-submodules -e "(.|.)[\d]" &&
+	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
+
+	# Basic
+	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
+	cat >expect <<-\EOF &&
+	a:(1|2)d(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
+	EOF
+	test_cmp expect actual &&
+	git -c grep.patternType=basic grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+
+	# Extended
+	git grep -E --recurse-submodules -e "(.|.)[\d]" >actual &&
+	cat >expect <<-\EOF &&
+	.gitmodules:[submodule "submodule"]
+	.gitmodules:	path = submodule
+	.gitmodules:	url = ./submodule
+	a:(1|2)d(3|4)
+	submodule/.gitmodules:[submodule "sub"]
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
+	EOF
+	test_cmp expect actual &&
+	git -c grep.patternType=extended grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+	git -c grep.extendedRegexp=true grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+
+	# Perl
+	if test_have_prereq LIBPCRE
+	then
+		git grep -P --recurse-submodules -e "(.|.)[\d]" >actual &&
+		cat >expect <<-\EOF &&
+		a:(1|2)d(3|4)
+		b/b:(3|4)
+		submodule/a:(1|2)d(3|4)
+		submodule/sub/a:(1|2)d(3|4)
+		EOF
+		test_cmp expect actual &&
+		git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+		test_cmp expect actual
+	fi
+'
+
 test_done
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 03/19] grep: submodule-related case statements should die if new fields are added
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 01/19] grep: amend submodule recursion test in preparation for rx engine testing Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 02/19] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 04/19] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Change two case statements added in commit 0281e487fd ("grep:
optionally recurse into submodules", 2016-12-16) so that they die if
new GREP_PATTERN_* enum fields are added without updating them.

These case statements currently check for an exhaustive list of
fields, but if a new field is added it's easy to introduce a bug here
where the code will start subtly doing the wrong thing, e.g. if a new
pattern type is added we'll fall through to
GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular
expressions.

This should arguably be done for the switch(opt->binary)
case-statement as well, but isn't trivial to add since that code isn't
currently working with an exhaustive list.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ffb5b4e81..be3dbd6957 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt *opt,
 		break;
 	case GREP_PATTERN_TYPE_UNSPECIFIED:
 		break;
+	default:
+		die("BUG: Added a new grep pattern type without updating switch statement");
 	}
 
 	for (pattern = opt->pattern_list; pattern != NULL;
@@ -515,6 +517,8 @@ static void compile_submodule_options(const struct grep_opt *opt,
 		case GREP_PATTERN_BODY:
 		case GREP_PATTERN_HEAD:
 			break;
+		default:
+			die("BUG: Added a new grep token type without updating case statement");
 		}
 	}
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 04/19] grep: remove redundant regflags assignment under PCRE
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 03/19] grep: submodule-related case statements should die if new fields are added Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Remove a redundant assignment to the "regflags" variable. This
variable is only used for POSIX regular expression matching, not when
the PCRE library is used.

This redundant assignment was added as a result of copy/paste
programming in commit 84befcd0a4 ("grep: add a grep.patternType
configuration setting", 2012-08-03). That commit modified already
working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P
when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to
regflags when under PCRE.

Revert back to that behavior, more to reduce "wait this is used under
PCRE how?" confusion when reading the code, than to to save ourselves
trivial CPU cycles by removing one assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.c b/grep.c
index 47cee45067..59ae7809f2 100644
--- a/grep.c
+++ b/grep.c
@@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_PCRE:
 		opt->fixed = 0;
 		opt->pcre = 1;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 	}
 }
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 04/19] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-26  4:50   ` Junio C Hamano
  2017-04-26  5:29   ` Junio C Hamano
  2017-04-25 21:05 ` [PATCH v4 06/19] Makefile & configure: reword outdated comment about PCRE Ævar Arnfjörð Bjarmason
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Remove redundant assignments to the "regflags" variable. There are no
code paths that have previously set the regflags to anything, and
certainly not to `|= REG_EXTENDED`.

This code gave the impression that it had to reset its environment,
but it doesn't. This dates back to the initial introduction of
git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/grep.c b/grep.c
index 59ae7809f2..6995f0989a 100644
--- a/grep.c
+++ b/grep.c
@@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
@@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_PCRE:
@@ -417,7 +415,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
 	int regflags;
 
 	basic_regex_quote_buf(&sb, p->pattern);
-	regflags = opt->regflags & ~REG_EXTENDED;
 	if (opt->ignore_case)
 		regflags |= REG_ICASE;
 	err = regcomp(&p->regexp, sb.buf, regflags);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 06/19] Makefile & configure: reword outdated comment about PCRE
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (4 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 07/19] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Reword an outdated comment which suggests that only git-grep can use
PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Copy/pasting this so much in configure.ac is lame, these Makefile-like
flags aren't even used by autoconf, just the corresponding
--with[out]-* options. But copy/pasting the comments that make sense
for the Makefile to configure.ac where they make less sence is the
pattern everything else follows in that file. I'm not going to war
against that as part of this change, just following the existing
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile     |  6 ++++--
 configure.ac | 12 ++++++++----
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index eb1a1a7cff..2e63b1cfcc 100644
--- a/Makefile
+++ b/Makefile
@@ -24,8 +24,10 @@ all::
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 # This also implies BLK_SHA1.
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
diff --git a/configure.ac b/configure.ac
index 128165529f..deeb968daa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library (default is YES)])
 AS_HELP_STRING([],              [ARG can be prefix for openssl library and headers]),
 GIT_PARSE_WITH([openssl]))
 
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 
 if test -n "$USE_LIBPCRE"; then
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 07/19] grep: add a test for backreferences in PCRE patterns
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (5 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 06/19] Makefile & configure: reword outdated comment about PCRE Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 08/19] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Add a test for backreferences such as (.)\1 in PCRE patterns. This
test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned
on. Before this change turning it on would break these sort of
patterns, but wouldn't break any tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index cee42097b0..ec8fe585a7 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' '
 	test_cmp expected actual
 '
 
+test_expect_success LIBPCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' '
+	git grep -P -h "(?P<one>.)(?P=one)" hello_world >actual &&
+	test_cmp hello_world actual &&
+	git grep -P -h "(.)\1" hello_world >actual &&
+	test_cmp hello_world actual
+'
+
 test_expect_success 'grep -G invalidpattern properly dies ' '
 	test_must_fail git grep -G "a["
 '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 08/19] log: add exhaustive tests for pattern style options & config
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (6 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 07/19] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 09/19] log: add -P as a synonym for --perl-regexp Ævar Arnfjörð Bjarmason
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Add exhaustive tests for how the different grep.patternType options &
the corresponding command-line options affect git-log.

Before this change it was possible to patch revision.c so that the
--basic-regexp option was synonymous with --extended-regexp, and
--perl-regexp wasn't recognized at all, and still have 100% of the
test suite pass.

This was because the first test being modified here, added in commit
34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as
"git grep"", 2012-10-03), didn't actually check whether we'd enabled
extended regular expressions as distinct from re-toggling non-fixed
string support.

Fix that by changing the pattern to a pattern that'll only match if
--extended-regexp option is provided, but won't match under the
default --basic-regexp option.

Other potential regressions were possible since there were no tests
for the rest of the combinations of grep.patternType configuration
toggles & corresponding git-log command-line options. Add exhaustive
tests for those.

The patterns being passed to fixed/basic/extended/PCRE are carefully
crafted to return the wrong thing if the grep engine were to pick any
other matching method than the one it's told to use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index f577990716..fc186f10ea 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' '
 
 test_expect_success 'log -F -E --grep=<ere> uses ere' '
 	echo second >expect &&
-	git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
+	git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep=<pcre> uses PCRE' '
+	test_when_finished "rm -rf num_commits" &&
+	git init num_commits &&
+	(
+		cd num_commits &&
+		test_commit 1d &&
+		test_commit 2e
+	) &&
+	echo 2e >expect &&
+	git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual &&
+	test_cmp expect actual &&
+	echo 1d >expect &&
+	git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual &&
 	test_cmp expect actual
 '
 
@@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType configuration and command line' '
 	test_cmp expect actual
 '
 
+test_expect_success 'log with various grep.patternType configurations & command-lines' '
+	git init pattern-type &&
+	(
+		cd pattern-type &&
+		test_commit 1 file A &&
+		test_commit "(1|2)" file B &&
+
+		echo "(1|2)" >expect.fixed &&
+		cp expect.fixed expect.basic &&
+		cp expect.fixed expect.extended &&
+		cp expect.fixed expect.perl &&
+
+		git -c grep.patternType=fixed log --pretty=tformat:%s \
+			--grep="(1|2)" >actual.fixed &&
+		git -c grep.patternType=basic log --pretty=tformat:%s \
+			--grep="(.|.)" >actual.basic &&
+		git -c grep.patternType=extended log --pretty=tformat:%s \
+			--grep="\|2" >actual.extended &&
+		if test_have_prereq LIBPCRE
+		then
+			git -c grep.patternType=perl log --pretty=tformat:%s \
+				--grep="[\d]\|" >actual.perl
+		fi &&
+		test_cmp expect.fixed actual.fixed &&
+		test_cmp expect.basic actual.basic &&
+		test_cmp expect.extended actual.extended &&
+		if test_have_prereq LIBPCRE
+		then
+			test_cmp expect.perl actual.perl
+		fi &&
+
+		git log --pretty=tformat:%s -F \
+			--grep="(1|2)" >actual.fixed.short-arg &&
+		git log --pretty=tformat:%s -E \
+			--grep="\|2" >actual.extended.short-arg &&
+		test_cmp expect.fixed actual.fixed.short-arg &&
+		test_cmp expect.extended actual.extended.short-arg &&
+
+		git log --pretty=tformat:%s --fixed-strings \
+			--grep="(1|2)" >actual.fixed.long-arg &&
+		git log --pretty=tformat:%s --basic-regexp \
+			--grep="(.|.)" >actual.basic.long-arg &&
+		git log --pretty=tformat:%s --extended-regexp \
+			--grep="\|2" >actual.extended.long-arg &&
+		if test_have_prereq LIBPCRE
+		then
+			git log --pretty=tformat:%s --perl-regexp \
+				--grep="[\d]\|" >actual.perl.long-arg
+		fi &&
+		test_cmp expect.fixed actual.fixed.long-arg &&
+		test_cmp expect.basic actual.basic.long-arg &&
+		test_cmp expect.extended actual.extended.long-arg &&
+		if test_have_prereq LIBPCRE
+		then
+			test_cmp expect.perl actual.perl.long-arg
+		fi
+	)
+'
+
 cat > expect <<EOF
 * Second
 * sixth
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 09/19] log: add -P as a synonym for --perl-regexp
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (7 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 08/19] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 10/19] grep & rev-list doc: stop promising libpcre " Ævar Arnfjörð Bjarmason
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Since nothing has come along in over 4 1/2 years that's wanted to use
it, it's more valuable to make it consistent with "grep" than to keep
it open for future use, and to avoid the confusion of -P meaning
different things for grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G<regex>.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/rev-list-options.txt | 1 +
 revision.c                         | 2 +-
 t/t4202-log.sh                     | 9 +++++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a02f7324c0..5b7fa49a7f 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -91,6 +91,7 @@ endif::git-rev-list[]
 	Consider the limiting patterns to be fixed strings (don't interpret
 	pattern as a regular expression).
 
+-P::
 --perl-regexp::
 	Consider the limiting patterns to be Perl-compatible regular expressions.
 	Requires libpcre to be compiled in.
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE);
 	} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
 		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
-	} else if (!strcmp(arg, "--perl-regexp")) {
+	} else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
 		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
 	} else if (!strcmp(arg, "--all-match")) {
 		revs->grep_filter.all_match = 1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index fc186f10ea..06acc848b8 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -331,8 +331,17 @@ test_expect_success 'log with various grep.patternType configurations & command-
 			--grep="(1|2)" >actual.fixed.short-arg &&
 		git log --pretty=tformat:%s -E \
 			--grep="\|2" >actual.extended.short-arg &&
+		if test_have_prereq PCRE
+		then
+			git log --pretty=tformat:%s -P \
+				--grep="[\d]\|" >actual.perl.short-arg
+		fi &&
 		test_cmp expect.fixed actual.fixed.short-arg &&
 		test_cmp expect.extended actual.extended.short-arg &&
+		if test_have_prereq PCRE
+		then
+			test_cmp expect.perl actual.perl.short-arg
+		fi &&
 
 		git log --pretty=tformat:%s --fixed-strings \
 			--grep="(1|2)" >actual.fixed.long-arg &&
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 10/19] grep & rev-list doc: stop promising libpcre for --perl-regexp
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (8 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 09/19] log: add -P as a synonym for --perl-regexp Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 11/19] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" Ævar Arnfjörð Bjarmason
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Stop promising in our grep & rev-list options documentation that we're
always going to be using libpcre when given the --perl-regexp option.

Instead talk about using "Perl-compatible regular expressions" and
using these types of patterns using "a compile-time dependency".

Saying "libpcre" strongly suggests that we might be talking about
libpcre.so, which is always going to be v1. This change is part of a
series to add support for libpcre2 which comes with v2 of PCRE.

In the future we might use some completely unrelated library to
provide perl-compatible regular expression support. By wording the
documentation differently and not promising any specific version of
PCRE or ever PCRE at all we have more wiggle room to change the
implementation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-grep.txt         | 7 +++++--
 Documentation/rev-list-options.txt | 8 ++++++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 71f32f3508..5033483db4 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -161,8 +161,11 @@ OPTIONS
 
 -P::
 --perl-regexp::
-	Use Perl-compatible regexp for patterns. Requires libpcre to be
-	compiled in.
+	Use Perl-compatible regular expressions for patterns.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 -F::
 --fixed-strings::
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 5b7fa49a7f..9c44eae55d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -93,8 +93,12 @@ endif::git-rev-list[]
 
 -P::
 --perl-regexp::
-	Consider the limiting patterns to be Perl-compatible regular expressions.
-	Requires libpcre to be compiled in.
+	Consider the limiting patterns to be Perl-compatible regular
+	expressions.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 --remove-empty::
 	Stop when a given path disappears from the tree.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 11/19] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (9 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 10/19] grep & rev-list doc: stop promising libpcre " Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 12/19] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Make the pattern types "pcre" & "pcre1" synonyms for the long-standing
"perl" grep.patternType.

This change is part of a longer patch series to add pcre2 support to
Git. It's nice to be able to performance test PCRE v1 v.s. v2 without
having to recompile git, and doing that via grep.patternType makes
sense.

However, just adding "pcre2" when we only have "perl" would be
confusing, so start by adding a "pcre" & "pcre1" synonym.

In the future "perl" and "pcre" might be changed to default to "pcre2"
instead of "pcre1", and depending on how Git is compiled the more
specific "pcre1" or "pcre2" pattern types might produce an error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt           |  9 +++++++++
 grep.c                             |  4 +++-
 t/t7810-grep.sh                    | 10 ++++++++++
 t/t7814-grep-recurse-submodules.sh |  4 ++++
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..5ef12d0694 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,6 +1624,15 @@ grep.patternType::
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
 	value 'default' will return to the default matching behavior.
++
+The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
+values starting with 'pcre' are reserved for future use, e.g. if we'd
+like to use 'pcre2' for the PCRE v2 library.
++
+In the future 'perl' and 'pcre' might become synonyms for some other
+implementation or PCRE version, such as 'pcre2', while the more
+specific 'pcre1' & 'pcre2' might throw errors depending on whether git
+is compiled to include those libraries.
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
diff --git a/grep.c b/grep.c
index 6995f0989a..62c3749d9f 100644
--- a/grep.c
+++ b/grep.c
@@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 		return GREP_PATTERN_TYPE_ERE;
 	else if (!strcmp(arg, "fixed"))
 		return GREP_PATTERN_TYPE_FIXED;
-	else if (!strcmp(arg, "perl"))
+	else if (!strcmp(arg, "perl") ||
+		 !strcmp(arg, "pcre") ||
+		 !strcmp(arg, "pcre1"))
 		return GREP_PATTERN_TYPE_PCRE;
 	die("bad %s argument: %s", opt, arg);
 }
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index ec8fe585a7..7199356d35 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1512,4 +1512,14 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	test_cmp expected actual
 '
 
+test_expect_success LIBPCRE "grep with grep.patternType synonyms perl/pcre/pcre1" '
+	echo "#include <stdio.h>" >expected &&
+	git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" >actual &&
+	test_cmp expected actual &&
+	git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" >actual &&
+	test_cmp expected actual &&
+	git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual &&
+	test_cmp expected actual
+'
+
 test_done
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index e1822feaf1..325fde53ef 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -358,6 +358,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 		EOF
 		test_cmp expect actual &&
 		git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+		test_cmp expect actual &&
+		git -c grep.patternType=pcre grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+		test_cmp expect actual &&
+		git -c grep.patternType=pcre1 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
 		test_cmp expect actual
 	fi
 '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 12/19] test-lib: rename the LIBPCRE prerequisite to PCRE
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (10 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 11/19] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 13/19] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 10696 bytes --]

Rename the LIBPCRE prerequisite to PCRE. This is for preparation for
libpcre2 support, where having just "LIBPCRE" would be confusing as it
implies v1 of the library.

None of these tests are incompatible between versions 1 & 2 of
libpcre, it's less confusing to give them a more general name to make
it clear that they work on both library versions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/README                           |  4 ++--
 t/t4202-log.sh                     | 10 +++++-----
 t/t7810-grep.sh                    | 32 ++++++++++++++++----------------
 t/t7812-grep-icase-non-ascii.sh    |  4 ++--
 t/t7813-grep-icase-iso.sh          |  2 +-
 t/t7814-grep-recurse-submodules.sh |  2 +-
 t/test-lib.sh                      |  2 +-
 7 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/t/README b/t/README
index ab386c3681..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own.
    Test is not run by root user, and an attempt to write to an
    unwritable file is expected to fail correctly.
 
- - LIBPCRE
+ - PCRE
 
-   Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
+   Git was compiled with support for PCRE. Wrap any tests
    that use git-grep --perl-regexp or git-grep -P in these.
 
  - CASE_INSENSITIVE_FS
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 06acc848b8..0b5f808ca3 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -266,7 +266,7 @@ test_expect_success 'log -F -E --grep=<ere> uses ere' '
 	test_cmp expect actual
 '
 
-test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep=<pcre> uses PCRE' '
+test_expect_success PCRE 'log -F -E --perl-regexp --grep=<pcre> uses PCRE' '
 	test_when_finished "rm -rf num_commits" &&
 	git init num_commits &&
 	(
@@ -314,7 +314,7 @@ test_expect_success 'log with various grep.patternType configurations & command-
 			--grep="(.|.)" >actual.basic &&
 		git -c grep.patternType=extended log --pretty=tformat:%s \
 			--grep="\|2" >actual.extended &&
-		if test_have_prereq LIBPCRE
+		if test_have_prereq PCRE
 		then
 			git -c grep.patternType=perl log --pretty=tformat:%s \
 				--grep="[\d]\|" >actual.perl
@@ -322,7 +322,7 @@ test_expect_success 'log with various grep.patternType configurations & command-
 		test_cmp expect.fixed actual.fixed &&
 		test_cmp expect.basic actual.basic &&
 		test_cmp expect.extended actual.extended &&
-		if test_have_prereq LIBPCRE
+		if test_have_prereq PCRE
 		then
 			test_cmp expect.perl actual.perl
 		fi &&
@@ -349,7 +349,7 @@ test_expect_success 'log with various grep.patternType configurations & command-
 			--grep="(.|.)" >actual.basic.long-arg &&
 		git log --pretty=tformat:%s --extended-regexp \
 			--grep="\|2" >actual.extended.long-arg &&
-		if test_have_prereq LIBPCRE
+		if test_have_prereq PCRE
 		then
 			git log --pretty=tformat:%s --perl-regexp \
 				--grep="[\d]\|" >actual.perl.long-arg
@@ -357,7 +357,7 @@ test_expect_success 'log with various grep.patternType configurations & command-
 		test_cmp expect.fixed actual.fixed.long-arg &&
 		test_cmp expect.basic actual.basic.long-arg &&
 		test_cmp expect.extended actual.extended.long-arg &&
-		if test_have_prereq LIBPCRE
+		if test_have_prereq PCRE
 		then
 			test_cmp expect.perl actual.perl.long-arg
 		fi
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 7199356d35..f929b447e9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -275,7 +275,7 @@ do
 		test_cmp expected actual
 	'
 
-	test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" '
+	test_expect_success PCRE "grep $L with grep.patterntype=perl" '
 		echo "${HC}ab:a+b*c" >expected &&
 		git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab >actual &&
 		test_cmp expected actual
@@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv)
 hello.c:	printf("Hello world.\n");
 EOF
 
-test_expect_success LIBPCRE 'grep --perl-regexp pattern' '
+test_expect_success PCRE 'grep --perl-regexp pattern' '
 	git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern' '
+test_expect_success PCRE 'grep -P pattern' '
 	git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
@@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with grep.extendedRegexp=true' '
 	test_cmp empty actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern with grep.extendedRegexp=true' '
+test_expect_success PCRE 'grep -P pattern with grep.extendedRegexp=true' '
 	git -c grep.extendedregexp=true \
 		grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -v pattern' '
+test_expect_success PCRE 'grep -P -v pattern' '
 	{
 		echo "ab:a+b*c"
 		echo "ab:a+bc"
@@ -1085,7 +1085,7 @@ test_expect_success LIBPCRE 'grep -P -v pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -i pattern' '
+test_expect_success PCRE 'grep -P -i pattern' '
 	cat >expected <<-EOF &&
 	hello.c:	printf("Hello world.\n");
 	EOF
@@ -1093,7 +1093,7 @@ test_expect_success LIBPCRE 'grep -P -i pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -w pattern' '
+test_expect_success PCRE 'grep -P -w pattern' '
 	{
 		echo "hello_world:Hello world"
 		echo "hello_world:HeLLo world"
@@ -1102,7 +1102,7 @@ test_expect_success LIBPCRE 'grep -P -w pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' '
+test_expect_success PCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' '
 	git grep -P -h "(?P<one>.)(?P=one)" hello_world >actual &&
 	test_cmp hello_world actual &&
 	git grep -P -h "(.)\1" hello_world >actual &&
@@ -1125,11 +1125,11 @@ test_expect_success 'grep invalidpattern properly dies with grep.patternType=ext
 	test_must_fail git -c grep.patterntype=extended grep "a["
 '
 
-test_expect_success LIBPCRE 'grep -P invalidpattern properly dies ' '
+test_expect_success PCRE 'grep -P invalidpattern properly dies ' '
 	test_must_fail git grep -P "a["
 '
 
-test_expect_success LIBPCRE 'grep invalidpattern properly dies with grep.patternType=perl' '
+test_expect_success PCRE 'grep invalidpattern properly dies with grep.patternType=perl' '
 	test_must_fail git -c grep.patterntype=perl grep "a["
 '
 
@@ -1198,13 +1198,13 @@ test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =perl, =e
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -G -F -E -P pattern' '
+test_expect_success PCRE 'grep -G -F -E -P pattern' '
 	echo "d0:0" >expected &&
 	git grep -G -F -E -P "[\d]" d0 >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
+test_expect_success PCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
 	echo "d0:0" >expected &&
 	git \
 		-c grep.patterntype=fixed \
@@ -1215,7 +1215,7 @@ test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern with grep.patternType=fixed' '
+test_expect_success PCRE 'grep -P pattern with grep.patternType=fixed' '
 	echo "ab:a+b*c" >expected &&
 	git \
 		-c grep.patterntype=fixed \
@@ -1350,12 +1350,12 @@ space: line with leading space2
 space: line with leading space3
 EOF
 
-test_expect_success LIBPCRE 'grep -E "^ "' '
+test_expect_success PCRE 'grep -E "^ "' '
 	git grep -E "^ " space >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P "^ "' '
+test_expect_success PCRE 'grep -P "^ "' '
 	git grep -P "^ " space >actual &&
 	test_cmp expected actual
 '
@@ -1512,7 +1512,7 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE "grep with grep.patternType synonyms perl/pcre/pcre1" '
+test_expect_success PCRE "grep with grep.patternType synonyms perl/pcre/pcre1" '
 	echo "#include <stdio.h>" >expected &&
 	git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" >actual &&
 	test_cmp expected actual &&
diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
index 169fd8d706..04a61cb8e0 100755
--- a/t/t7812-grep-icase-non-ascii.sh
+++ b/t/t7812-grep-icase-non-ascii.sh
@@ -20,13 +20,13 @@ test_expect_success REGEX_LOCALE 'grep literal string, no -F' '
 	git grep -i "TILRAUN: HALLÓ HEIMUR!"
 '
 
-test_expect_success GETTEXT_LOCALE,LIBPCRE 'grep pcre utf-8 icase' '
+test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 icase' '
 	git grep --perl-regexp    "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
 '
 
-test_expect_success GETTEXT_LOCALE,LIBPCRE 'grep pcre utf-8 string with "+"' '
+test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 string with "+"' '
 	test_write_lines "TILRAUN: Hallóó Heimur!" >file2 &&
 	git add file2 &&
 	git grep -l --perl-regexp "TILRAUN: H.lló+ Heimur!" >actual &&
diff --git a/t/t7813-grep-icase-iso.sh b/t/t7813-grep-icase-iso.sh
index efef7fb81f..701e08a8e5 100755
--- a/t/t7813-grep-icase-iso.sh
+++ b/t/t7813-grep-icase-iso.sh
@@ -11,7 +11,7 @@ test_expect_success GETTEXT_ISO_LOCALE 'setup' '
 	export LC_ALL
 '
 
-test_expect_success GETTEXT_ISO_LOCALE,LIBPCRE 'grep pcre string' '
+test_expect_success GETTEXT_ISO_LOCALE,PCRE 'grep pcre string' '
 	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
 '
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 325fde53ef..dc6cf771ec 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -347,7 +347,7 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 	test_cmp expect actual &&
 
 	# Perl
-	if test_have_prereq LIBPCRE
+	if test_have_prereq PCRE
 	then
 		git grep -P --recurse-submodules -e "(.|.)[\d]" >actual &&
 		cat >expect <<-\EOF &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822..6abf1d1918 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq LIBPCRE
+test -n "$USE_LIBPCRE" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 13/19] grep: change the internal PCRE macro names to be PCRE1
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (11 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 12/19] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 14/19] grep: change the internal PCRE code & header " Ævar Arnfjörð Bjarmason
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile      | 4 ++--
 grep.c        | 6 +++---
 grep.h        | 2 +-
 t/test-lib.sh | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 2e63b1cfcc..aecdfdcfe6 100644
--- a/Makefile
+++ b/Makefile
@@ -1086,7 +1086,7 @@ ifdef NO_LIBGEN_H
 endif
 
 ifdef USE_LIBPCRE
-	BASIC_CFLAGS += -DUSE_LIBPCRE
+	BASIC_CFLAGS += -DUSE_LIBPCRE1
 	ifdef LIBPCREDIR
 		BASIC_CFLAGS += -I$(LIBPCREDIR)/include
 		EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@@ -2238,7 +2238,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
 	@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
 	@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
-	@echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
+	@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
 	@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
 	@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/grep.c b/grep.c
index 62c3749d9f..a378b7edaa 100644
--- a/grep.c
+++ b/grep.c
@@ -323,7 +323,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p,
 	die("%s'%s': %s", where, p->pattern, error);
 }
 
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	const char *error;
@@ -375,7 +375,7 @@ static void free_pcre_regexp(struct grep_pat *p)
 	pcre_free(p->pcre_extra_info);
 	pcre_free((void *)p->pcre_tables);
 }
-#else /* !USE_LIBPCRE */
+#else /* !USE_LIBPCRE1 */
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE");
@@ -390,7 +390,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 static void free_pcre_regexp(struct grep_pat *p)
 {
 }
-#endif /* !USE_LIBPCRE */
+#endif /* !USE_LIBPCRE1 */
 
 static int is_fixed(const char *s, size_t len)
 {
diff --git a/grep.h b/grep.h
index 267534ca24..073b0e4c92 100644
--- a/grep.h
+++ b/grep.h
@@ -1,7 +1,7 @@
 #ifndef GREP_H
 #define GREP_H
 #include "color.h"
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 #include <pcre.h>
 #else
 typedef int pcre;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6abf1d1918..e5cfbcc36b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 14/19] grep: change the internal PCRE code & header names to be PCRE1
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (12 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 13/19] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 15/19] perf: add a performance comparison test of grep -E and -P Ævar Arnfjörð Bjarmason
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Change the internal PCRE variable & function names to have a "1"
suffix. This is for preparation for libpcre2 support, where having
non-versioned names would be confusing.

The earlier "grep: change the internal PCRE macro names to be PCRE1"
change elaborates on the motivations behind this commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 grep.c         | 56 ++++++++++++++++++++++++++++----------------------------
 grep.h         | 10 +++++-----
 revision.c     |  2 +-
 4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index be3dbd6957..28e0dd3236 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt *opt,
 	case GREP_PATTERN_TYPE_FIXED:
 		argv_array_push(&submodule_options, "-F");
 		break;
-	case GREP_PATTERN_TYPE_PCRE:
+	case GREP_PATTERN_TYPE_PCRE1:
 		argv_array_push(&submodule_options, "-P");
 		break;
 	case GREP_PATTERN_TYPE_UNSPECIFIED:
@@ -1022,7 +1022,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    GREP_PATTERN_TYPE_FIXED),
 		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
 			    N_("use Perl-compatible regular expressions"),
-			    GREP_PATTERN_TYPE_PCRE),
+			    GREP_PATTERN_TYPE_PCRE1),
 		OPT_GROUP(""),
 		OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")),
 		OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1),
diff --git a/grep.c b/grep.c
index a378b7edaa..c61e69deaa 100644
--- a/grep.c
+++ b/grep.c
@@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 	else if (!strcmp(arg, "perl") ||
 		 !strcmp(arg, "pcre") ||
 		 !strcmp(arg, "pcre1"))
-		return GREP_PATTERN_TYPE_PCRE;
+		return GREP_PATTERN_TYPE_PCRE1;
 	die("bad %s argument: %s", opt, arg);
 }
 
@@ -180,23 +180,23 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
 		opt->fixed = 0;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		opt->regflags |= REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		break;
 
-	case GREP_PATTERN_TYPE_PCRE:
+	case GREP_PATTERN_TYPE_PCRE1:
 		opt->fixed = 0;
-		opt->pcre = 1;
+		opt->pcre1 = 1;
 		break;
 	}
 }
@@ -324,7 +324,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p,
 }
 
 #ifdef USE_LIBPCRE1
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	const char *error;
 	int erroffset;
@@ -332,23 +332,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 
 	if (opt->ignore_case) {
 		if (has_non_ascii(p->pattern))
-			p->pcre_tables = pcre_maketables();
+			p->pcre1_tables = pcre_maketables();
 		options |= PCRE_CASELESS;
 	}
 	if (is_utf8_locale() && has_non_ascii(p->pattern))
 		options |= PCRE_UTF8;
 
-	p->pcre_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
-				      p->pcre_tables);
-	if (!p->pcre_regexp)
+	p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
+				      p->pcre1_tables);
+	if (!p->pcre1_regexp)
 		compile_regexp_failed(p, error);
 
-	p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, &error);
-	if (!p->pcre_extra_info && error)
+	p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error);
+	if (!p->pcre1_extra_info && error)
 		die("%s", error);
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 		regmatch_t *match, int eflags)
 {
 	int ovector[30], ret, flags = 0;
@@ -356,7 +356,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 	if (eflags & REG_NOTBOL)
 		flags |= PCRE_NOTBOL;
 
-	ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line,
+	ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
 			0, flags, ovector, ARRAY_SIZE(ovector));
 	if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
 		die("pcre_exec failed with error code %d", ret);
@@ -369,25 +369,25 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 	return ret;
 }
 
-static void free_pcre_regexp(struct grep_pat *p)
+static void free_pcre1_regexp(struct grep_pat *p)
 {
-	pcre_free(p->pcre_regexp);
-	pcre_free(p->pcre_extra_info);
-	pcre_free((void *)p->pcre_tables);
+	pcre_free(p->pcre1_regexp);
+	pcre_free(p->pcre1_extra_info);
+	pcre_free((void *)p->pcre1_tables);
 }
 #else /* !USE_LIBPCRE1 */
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE");
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 		regmatch_t *match, int eflags)
 {
 	return 1;
 }
 
-static void free_pcre_regexp(struct grep_pat *p)
+static void free_pcre1_regexp(struct grep_pat *p)
 {
 }
 #endif /* !USE_LIBPCRE1 */
@@ -473,8 +473,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre) {
-		compile_pcre_regexp(p, opt);
+	if (opt->pcre1) {
+		compile_pcre1_regexp(p, opt);
 		return;
 	}
 
@@ -830,8 +830,8 @@ void free_grep_patterns(struct grep_opt *opt)
 		case GREP_PATTERN_BODY:
 			if (p->kws)
 				kwsfree(p->kws);
-			else if (p->pcre_regexp)
-				free_pcre_regexp(p);
+			else if (p->pcre1_regexp)
+				free_pcre1_regexp(p);
 			else
 				regfree(&p->regexp);
 			free(p->pattern);
@@ -910,8 +910,8 @@ static int patmatch(struct grep_pat *p, char *line, char *eol,
 
 	if (p->fixed)
 		hit = !fixmatch(p, line, eol, match);
-	else if (p->pcre_regexp)
-		hit = !pcrematch(p, line, eol, match, eflags);
+	else if (p->pcre1_regexp)
+		hit = !pcre1match(p, line, eol, match, eflags);
 	else
 		hit = !regexec_buf(&p->regexp, line, eol - line, 1, match,
 				   eflags);
diff --git a/grep.h b/grep.h
index 073b0e4c92..fa2ab9485f 100644
--- a/grep.h
+++ b/grep.h
@@ -46,9 +46,9 @@ struct grep_pat {
 	size_t patternlen;
 	enum grep_header_field field;
 	regex_t regexp;
-	pcre *pcre_regexp;
-	pcre_extra *pcre_extra_info;
-	const unsigned char *pcre_tables;
+	pcre *pcre1_regexp;
+	pcre_extra *pcre1_extra_info;
+	const unsigned char *pcre1_tables;
 	kwset_t kws;
 	unsigned fixed:1;
 	unsigned ignore_case:1;
@@ -68,7 +68,7 @@ enum grep_pattern_type {
 	GREP_PATTERN_TYPE_BRE,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
-	GREP_PATTERN_TYPE_PCRE
+	GREP_PATTERN_TYPE_PCRE1
 };
 
 struct grep_expr {
@@ -111,7 +111,7 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre;
+	int pcre1;
 	int relative;
 	int pathname;
 	int null_following_name;
diff --git a/revision.c b/revision.c
index 03a3a012de..7a10a8570a 100644
--- a/revision.c
+++ b/revision.c
@@ -1996,7 +1996,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
 		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
 	} else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
-		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
+		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE1;
 	} else if (!strcmp(arg, "--all-match")) {
 		revs->grep_filter.all_match = 1;
 	} else if (!strcmp(arg, "--invert-grep")) {
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 15/19] perf: add a performance comparison test of grep -E and -P
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (13 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 14/19] grep: change the internal PCRE code & header " Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 16/19] grep: add support for the PCRE v1 JIT API Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Add a very basic performance comparison test comparing the POSIX
extended & pcre1 engines.

I'm skipping the "basic" POSIX engine because supporting its alternate
regex syntax is hard, although it would be interesting to test it, at
least under glibc it seems to be an entirely different engine, since
it can have very different performance even for patterns that mean the
same thing under extended and non-extended POSIX regular expression
syntax.

Running this on an i7 3.4GHz Linux 3.16.0-4 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3:

    $ GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
    [...]
    Test                                                       this tree
    -----------------------------------------------------------------------------
    7820.1: extended with how.to                               0.28(1.23+0.44)
    7820.2: extended with ^how to                              0.26(1.15+0.38)
    7820.3: extended with \w+our\w*                            6.06(38.44+0.35)
    7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?-----------$   0.37(1.57+0.38)
    7820.5: pcre1 with how.to                                  0.26(1.15+0.37)
    7820.6: pcre1 with ^how to                                 0.46(2.66+0.31)
    7820.7: pcre1 with \w+our\w*                               16.42(99.42+0.48)
    7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?-----------$      81.52(275.37+0.41)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/p7820-grep-engines.sh | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100755 t/perf/p7820-grep-engines.sh

diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
new file mode 100755
index 0000000000..5ae42ceccc
--- /dev/null
+++ b/t/perf/p7820-grep-engines.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+test_description="Comparison of git-grep's regex engines"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for engine in extended pcre1
+do
+	# Patterns stolen from http://sljit.sourceforge.net/pcre.html
+	for pattern in \
+		'how.to' \
+		'^how to' \
+		'\w+our\w*' \
+		'-?-?-?-?-?-?-?-?-?-?-?-----------$'
+	do
+		test_perf "$engine with $pattern" "
+			git -c grep.patternType=$engine grep -- '$pattern' || :
+		"
+	done
+done
+
+test_done
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 16/19] grep: add support for the PCRE v1 JIT API
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (14 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 15/19] perf: add a performance comparison test of grep -E and -P Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 17/19] grep: add support for PCRE v2 Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Change the grep PCRE v1 code to use JIT when available. When PCRE
support was initially added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into
8.20 released on 2011-10-21.

When JIT support is enabled the PCRE performance usually improves by
more than 50%. The pattern compilation times are relatively slower,
but those relative numbers are tiny, and are easily made back in all
but the most trivial cases of grep. Detailed benchhmarks are available
at: http://sljit.sourceforge.net/pcre.html

With this change the difference in a t/perf/p7820-grep-engines.sh run
is, shown with git --word-diff:

    7820.1: extended with how.to                               [-0.28(1.23+0.44)-]{+0.28(1.18+0.39)+}
    7820.2: extended with ^how to                              [-0.26(1.15+0.38)-]{+0.27(1.13+0.40)+}
    7820.3: extended with \w+our\w*                            [-6.06(38.44+0.35)-]{+6.11(38.66+0.32)+}
    7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?-----------$   [-0.37(1.57+0.38)-]{+0.37(1.56+0.42)+}
    7820.5: pcre1 with how.to                                  [-0.26(1.15+0.37)-]{+0.19(0.39+0.55)+}
    7820.6: pcre1 with ^how to                                 [-0.46(2.66+0.31)-]{+0.22(0.67+0.44)+}
    7820.7: pcre1 with \w+our\w*                               [-16.42(99.42+0.48)-]{+0.51(3.05+0.24)+}
    7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?-----------$      [-81.52(275.37+0.41)-]{+5.16(19.31+0.33)+}

The conditional support for JIT is implemented as suggested in the
pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's
not present.

There's no graceful fallback if pcre_jit_stack_alloc() fails under
PCRE_CONFIG_JIT, instead the program will abort. I don't think this is
worth handling, it'll only fail in cases where malloc() doesn't work,
in which case we're screwed anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 27 ++++++++++++++++++++++++++-
 grep.h |  5 +++++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index c61e69deaa..8e3ba164b5 100644
--- a/grep.c
+++ b/grep.c
@@ -329,6 +329,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 	const char *error;
 	int erroffset;
 	int options = PCRE_MULTILINE;
+#ifdef PCRE_CONFIG_JIT
+	int canjit;
+#endif
 
 	if (opt->ignore_case) {
 		if (has_non_ascii(p->pattern))
@@ -343,9 +346,19 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 	if (!p->pcre1_regexp)
 		compile_regexp_failed(p, error);
 
-	p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error);
+	p->pcre1_extra_info = pcre_study(p->pcre1_regexp, PCRE_STUDY_JIT_COMPILE, &error);
 	if (!p->pcre1_extra_info && error)
 		die("%s", error);
+
+#ifdef PCRE_CONFIG_JIT
+	pcre_config(PCRE_CONFIG_JIT, &canjit);
+	if (canjit == 1) {
+		p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
+		if (!p->pcre1_jit_stack)
+			die("BUG: Couldn't allocate PCRE JIT stack");
+		pcre_assign_jit_stack(p->pcre1_extra_info, NULL, p->pcre1_jit_stack);
+	}
+#endif
 }
 
 static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
@@ -356,8 +369,15 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 	if (eflags & REG_NOTBOL)
 		flags |= PCRE_NOTBOL;
 
+#ifdef PCRE_CONFIG_JIT
+	ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
+			    0, flags, ovector, ARRAY_SIZE(ovector),
+			    p->pcre1_jit_stack);
+#else
 	ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
 			0, flags, ovector, ARRAY_SIZE(ovector));
+#endif
+
 	if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
 		die("pcre_exec failed with error code %d", ret);
 	if (ret > 0) {
@@ -372,7 +392,12 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 static void free_pcre1_regexp(struct grep_pat *p)
 {
 	pcre_free(p->pcre1_regexp);
+#ifdef PCRE_CONFIG_JIT
+	pcre_free_study(p->pcre1_extra_info);
+	pcre_jit_stack_free(p->pcre1_jit_stack);
+#else
 	pcre_free(p->pcre1_extra_info);
+#endif
 	pcre_free((void *)p->pcre1_tables);
 }
 #else /* !USE_LIBPCRE1 */
diff --git a/grep.h b/grep.h
index fa2ab9485f..29e20bf837 100644
--- a/grep.h
+++ b/grep.h
@@ -3,9 +3,13 @@
 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include <pcre.h>
+#ifndef PCRE_STUDY_JIT_COMPILE
+#define PCRE_STUDY_JIT_COMPILE 0
+#endif
 #else
 typedef int pcre;
 typedef int pcre_extra;
+typedef int pcre_jit_stack;
 #endif
 #include "kwset.h"
 #include "thread-utils.h"
@@ -48,6 +52,7 @@ struct grep_pat {
 	regex_t regexp;
 	pcre *pcre1_regexp;
 	pcre_extra *pcre1_extra_info;
+	pcre_jit_stack *pcre1_jit_stack;
 	const unsigned char *pcre1_tables;
 	kwset_t kws;
 	unsigned fixed:1;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 17/19] grep: add support for PCRE v2
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (15 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 16/19] grep: add support for the PCRE v1 JIT API Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 18/19] grep: remove support for concurrent use of both PCRE v1 & v2 Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 19/19] Makefile & configure: make PCRE v2 the default PCRE implementation Ævar Arnfjörð Bjarmason
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 24855 bytes --]

Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is
similar-ish, pretty much every function is either renamed or takes
different arguments. Thus using it via entirely new functions makes
sense, as opposed to trying to e.g. have one compile_pcre_pattern()
that would call either PCRE v1 or v2 functions.

Git can now be compiled with any combination of
USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are
provided the version of the PCRE library can be selected at runtime
with grep.PatternType, but the default (for now) is v1.

This table shows what the various combinations do, depending on what
libraries Git is compiled against:

    |------------------+-----+-----+----------|
    | grep.PatternType | v1  | v2  | v1 && v2 |
    |------------------+-----+-----+----------|
    | perl             | v1  | v2  | v1       |
    | pcre             | v1  | v2  | v1       |
    | pcre1            | v1  | ERR | v1       |
    | pcre2            | ERR | v2  | v2       |
    |------------------+-----+-----+----------|

When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp
& -P will use v2. All tests pass with this new PCRE version. When Git
is compiled with both v1 & v2 most of the tests will only test v1, but
there are some v2-specific tests that will be run.

With earlier patches to enable JIT for PCRE v1 the performance of the
release versions of both libraries have almost exactly the same
performance, with PCRE v2 being around 1% slower.

However after I reported this to the pcre-dev mailing list[2] I got a
lot of help with the API use from Zoltán Herczeg, he
subsequently optimized some of the JIT functionality in v2 of the
library.

Running the p7820-grep-engines.sh performance test against the latest
Subversion trunk of both, with both them and git compiled as -O3, and
the test run against linux.git, gives the following results:

    7820.1: extended with how.to                               0.27(1.22+0.34)
    7820.2: extended with ^how to                              0.27(1.18+0.36)
    7820.3: extended with \w+our\w*                            6.09(38.64+0.32)
    7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?-----------$   0.38(1.69+0.28)
    7820.5: pcre1 with how.to                                  0.19(0.42+0.53)
    7820.6: pcre1 with ^how to                                 0.23(0.58+0.50)
    7820.7: pcre1 with \w+our\w*                               0.50(2.93+0.34)
    7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?-----------$      5.12(19.32+0.38)
    7820.9: pcre2 with how.to                                  0.19(0.34+0.57)
    7820.10: pcre2 with ^how to                                0.19(0.29+0.60)
    7820.11: pcre2 with \w+our\w*                              0.51(2.85+0.41)
    7820.12: pcre2 with -?-?-?-?-?-?-?-?-?-?-?-----------$     5.04(19.27+0.31)

I.e. the two are neck-to-neck, but PCRE v2 usually pulls ahead, when
it does it's up to 20% faster.

A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
the compiled pattern can be shared between threads, but not some of
the JIT context, however the grep threading support does all pattern &
JIT compilation in separate threads, so this code doesn't need to
concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the
initial addition of PCRE v1. This change follows some of the same
patterns it did (and which were discussed on list at the time),
e.g. mocking up types with typedef instead of ifdef-ing them out when
USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt           |  14 ++--
 Makefile                           |  18 +++++
 builtin/grep.c                     |   7 +-
 configure.ac                       |  49 +++++++++++-
 grep.c                             | 154 ++++++++++++++++++++++++++++++++++++-
 grep.h                             |  21 ++++-
 revision.c                         |   2 +-
 t/README                           |  12 +++
 t/perf/p7820-grep-engines.sh       |   2 +-
 t/t7810-grep.sh                    |  30 +++++++-
 t/t7813-grep-icase-iso.sh          |  11 ++-
 t/t7814-grep-recurse-submodules.sh |  12 ++-
 t/test-lib.sh                      |   4 +-
 13 files changed, 309 insertions(+), 27 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5ef12d0694..a5fc482495 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1625,14 +1625,12 @@ grep.patternType::
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
 	value 'default' will return to the default matching behavior.
 +
-The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
-values starting with 'pcre' are reserved for future use, e.g. if we'd
-like to use 'pcre2' for the PCRE v2 library.
-+
-In the future 'perl' and 'pcre' might become synonyms for some other
-implementation or PCRE version, such as 'pcre2', while the more
-specific 'pcre1' & 'pcre2' might throw errors depending on whether git
-is compiled to include those libraries.
+The 'perl' and 'pcre' values are synonyms. Depending on which PCRE
+library Git was compiled with either or both of 'pcre1' and 'pcre2'
+might also be available.
++
+If both are available Git currently defaults to 'pcre1', but this
+might change in future versions.
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
diff --git a/Makefile b/Makefile
index aecdfdcfe6..afdde49cda 100644
--- a/Makefile
+++ b/Makefile
@@ -32,6 +32,14 @@ all::
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
 #
+# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
+#
+# Define LIBPCRE2DIR=/foo/bar if your libpcre2 header and library
+# files are in /foo/bar/include and /foo/bar/lib directories.
+#
 # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
 #
 # Define NO_CURL if you do not have libcurl installed.  git-http-fetch and
@@ -1094,6 +1102,15 @@ ifdef USE_LIBPCRE
 	EXTLIBS += -lpcre
 endif
 
+ifdef USE_LIBPCRE2
+	BASIC_CFLAGS += -DUSE_LIBPCRE2
+	ifdef LIBPCRE2DIR
+		BASIC_CFLAGS += -I$(LIBPCRE2DIR)/include
+		EXTLIBS += -L$(LIBPCRE2DIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCR2EDIR)/$(lib)
+	endif
+	EXTLIBS += -lpcre2-8
+endif
+
 ifdef HAVE_ALLOCA_H
 	BASIC_CFLAGS += -DHAVE_ALLOCA_H
 endif
@@ -2239,6 +2256,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
 	@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
 	@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
+	@echo USE_LIBPCRE2=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE2)))'\' >>$@+
 	@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
 	@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/builtin/grep.c b/builtin/grep.c
index 28e0dd3236..178b10aa6f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -490,11 +490,14 @@ static void compile_submodule_options(const struct grep_opt *opt,
 	case GREP_PATTERN_TYPE_FIXED:
 		argv_array_push(&submodule_options, "-F");
 		break;
-	case GREP_PATTERN_TYPE_PCRE1:
+	case GREP_PATTERN_TYPE_PCRE:
 		argv_array_push(&submodule_options, "-P");
 		break;
 	case GREP_PATTERN_TYPE_UNSPECIFIED:
 		break;
+	case GREP_PATTERN_TYPE_PCRE1:
+	case GREP_PATTERN_TYPE_PCRE2:
+		die("BUG: Command-line option for pcre1 or pcre2 added without updating switch statement");
 	default:
 		die("BUG: Added a new grep pattern type without updating switch statement");
 	}
@@ -1022,7 +1025,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    GREP_PATTERN_TYPE_FIXED),
 		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
 			    N_("use Perl-compatible regular expressions"),
-			    GREP_PATTERN_TYPE_PCRE1),
+			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
 		OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")),
 		OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1),
diff --git a/configure.ac b/configure.ac
index deeb968daa..7ceb22ed03 100644
--- a/configure.ac
+++ b/configure.ac
@@ -259,8 +259,8 @@ GIT_PARSE_WITH([openssl]))
 # /foo/bar/include and /foo/bar/lib directories.
 #
 AC_ARG_WITH(libpcre,
-AS_HELP_STRING([--with-libpcre],[support Perl-compatible regexes (default is NO)])
-AS_HELP_STRING([],           [ARG can be also prefix for libpcre library and headers]),
+AS_HELP_STRING([--with-libpcre],[support Perl-compatible regexes via libpcre1 (default is NO)])
+AS_HELP_STRING([],           [ARG can be also prefix for libpcre1 library and headers]),
     if test "$withval" = "no"; then
 	USE_LIBPCRE=
     elif test "$withval" = "yes"; then
@@ -273,6 +273,30 @@ AS_HELP_STRING([],           [ARG can be also prefix for libpcre library and hea
         dnl it yet.
 	GIT_CONF_SUBST([LIBPCREDIR])
     fi)
+
+# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
+#
+# Define LIBPCR2EDIR=/foo/bar if your libpcre2 header and library
+# files are in /foo/bar/include and /foo/bar/lib directories.
+#
+AC_ARG_WITH(libpcre2,
+AS_HELP_STRING([--with-libpcre2],[support Perl-compatible regexes via libpcre2 (default is NO)])
+AS_HELP_STRING([],           [ARG can be also prefix for libpcre library and headers]),
+    if test "$withval" = "no"; then
+	USE_LIBPCRE2=
+    elif test "$withval" = "yes"; then
+	USE_LIBPCRE2=YesPlease
+    else
+	USE_LIBPCRE2=YesPlease
+	LIBPCRE2DIR=$withval
+	AC_MSG_NOTICE([Setting LIBPCRE2DIR to $LIBPCRE2DIR])
+        dnl USE_LIBPCRE2 can still be modified below, so don't substitute
+        dnl it yet.
+	GIT_CONF_SUBST([LIBPCRE2DIR])
+    fi)
 #
 # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
 AC_FUNC_ALLOCA
@@ -522,6 +546,27 @@ GIT_CONF_SUBST([USE_LIBPCRE])
 fi
 
 #
+# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
+#
+
+if test -n "$USE_LIBPCRE2"; then
+
+GIT_STASH_FLAGS($LIBPCRE2DIR)
+
+AC_CHECK_LIB([pcre2-8], [pcre2_config_8],
+[USE_LIBPCRE2=YesPlease],
+[USE_LIBPCRE2=])
+
+GIT_UNSTASH_FLAGS($LIBPCRE2DIR)
+
+GIT_CONF_SUBST([USE_LIBPCRE2])
+
+fi
+
+#
 # Define NO_CURL if you do not have libcurl installed.  git-http-pull and
 # git-http-push are not built, and you cannot use http:// and https://
 # transports.
diff --git a/grep.c b/grep.c
index 8e3ba164b5..dd5dff08f8 100644
--- a/grep.c
+++ b/grep.c
@@ -61,9 +61,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 	else if (!strcmp(arg, "fixed"))
 		return GREP_PATTERN_TYPE_FIXED;
 	else if (!strcmp(arg, "perl") ||
-		 !strcmp(arg, "pcre") ||
-		 !strcmp(arg, "pcre1"))
+		 !strcmp(arg, "pcre"))
+		return GREP_PATTERN_TYPE_PCRE;
+	else if (!strcmp(arg, "pcre1"))
 		return GREP_PATTERN_TYPE_PCRE1;
+	else if (!strcmp(arg, "pcre2"))
+		return GREP_PATTERN_TYPE_PCRE2;
 	die("bad %s argument: %s", opt, arg);
 }
 
@@ -181,22 +184,46 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
 		opt->pcre1 = 0;
+		opt->pcre2 = 0;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
 		opt->fixed = 0;
 		opt->pcre1 = 0;
+		opt->pcre2 = 0;
 		opt->regflags |= REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
 		opt->pcre1 = 0;
+		opt->pcre2 = 0;
+		break;
+
+	case GREP_PATTERN_TYPE_PCRE:
+		opt->fixed = 0;
+		/* We default to pcre1 if we have both versions. This
+		 * may change in future versions.
+		 */
+#ifdef USE_LIBPCRE1
+		opt->pcre1 = 1;
+		opt->pcre2 = 0;
+#elif USE_LIBPCRE2
+		opt->pcre1 = 0;
+		opt->pcre2 = 1;
+#endif
 		break;
 
 	case GREP_PATTERN_TYPE_PCRE1:
 		opt->fixed = 0;
 		opt->pcre1 = 1;
+		opt->pcre2 = 0;
+		break;
+
+	case GREP_PATTERN_TYPE_PCRE2:
+		opt->fixed = 0;
+		opt->pcre1 = 0;
+		opt->pcre2 = 1;
 		break;
 	}
 }
@@ -417,6 +444,120 @@ static void free_pcre1_regexp(struct grep_pat *p)
 }
 #endif /* !USE_LIBPCRE1 */
 
+#ifdef USE_LIBPCRE2
+static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
+{
+	int error;
+	PCRE2_UCHAR errbuf[256];
+	PCRE2_SIZE erroffset;
+	int options = PCRE2_MULTILINE;
+	const uint8_t *character_tables = NULL;
+	uint32_t canjit;
+	int jitret;
+
+	p->pcre2_ccontext = NULL;
+
+	if (opt->ignore_case) {
+		if (has_non_ascii(p->pattern)) {
+			character_tables = pcre2_maketables(NULL);
+			p->pcre2_ccontext = pcre2_compile_context_create(NULL);
+			pcre2_set_character_tables(p->pcre2_ccontext, character_tables);
+		}
+		options |= PCRE2_CASELESS;
+	}
+	if (is_utf8_locale() && has_non_ascii(p->pattern))
+		options |= PCRE2_UTF;
+
+	p->pcre2_pattern = pcre2_compile((PCRE2_SPTR)p->pattern,
+					 p->patternlen, options, &error, &erroffset,
+					 p->pcre2_ccontext);
+
+	if (p->pcre2_pattern) {
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
+		if (!p->pcre2_match_data)
+			die("BUG: Couldn't allocate PCRE2 match data");
+	} else {
+		pcre2_get_error_message(error, errbuf, sizeof(errbuf));
+		compile_regexp_failed(p, (const char *)&errbuf);
+	}
+
+	pcre2_config(PCRE2_CONFIG_JIT, &canjit);
+	if (canjit == 1) {
+		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
+		if (!jitret)
+			p->pcre2_jit_on = 1;
+		else
+			die("BUG: Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+		p->pcre2_jit_stack = pcre2_jit_stack_create(1, 1024 * 1024, NULL);
+		if (!p->pcre2_jit_stack)
+			die("BUG: Couldn't allocate PCRE2 JIT stack");
+		p->pcre2_match_context = pcre2_match_context_create(NULL);
+		if (!p->pcre2_jit_stack)
+			die("BUG: Couldn't allocate PCRE2 match context");
+		pcre2_jit_stack_assign(p->pcre2_match_context, NULL, p->pcre2_jit_stack);
+	}
+}
+
+static int pcre2match(struct grep_pat *p, const char *line, const char *eol,
+		regmatch_t *match, int eflags)
+{
+	int ret, flags = 0;
+	PCRE2_SIZE *ovector;
+	PCRE2_UCHAR errbuf[256];
+
+	if (eflags & REG_NOTBOL)
+		flags |= PCRE2_NOTBOL;
+
+	if (p->pcre2_jit_on)
+		ret = pcre2_jit_match(p->pcre2_pattern, (unsigned char *)line,
+				      eol - line, 0, flags, p->pcre2_match_data,
+				      NULL);
+	else
+		ret = pcre2_match(p->pcre2_pattern, (unsigned char *)line,
+				  eol - line, 0, flags, p->pcre2_match_data,
+				  NULL);
+
+	if (ret < 0 && ret != PCRE2_ERROR_NOMATCH) {
+		pcre2_get_error_message(ret, errbuf, sizeof(errbuf));
+		die("%s failed with error code %d: %s",
+		    (p->pcre2_jit_on ? "pcre2_jit_match" : "pcre2_match"), ret,
+		    errbuf);
+	}
+	if (ret > 0) {
+		ovector = pcre2_get_ovector_pointer(p->pcre2_match_data);
+		ret = 0;
+		match->rm_so = (int)ovector[0];
+		match->rm_eo = (int)ovector[1];
+	}
+
+	return ret;
+}
+
+static void free_pcre2_pattern(struct grep_pat *p)
+{
+	pcre2_compile_context_free(p->pcre2_ccontext);
+	pcre2_code_free(p->pcre2_pattern);
+	pcre2_match_data_free(p->pcre2_match_data);
+	pcre2_jit_stack_free(p->pcre2_jit_stack);
+	pcre2_match_context_free(p->pcre2_match_context);
+}
+#else /* !USE_LIBPCRE2 */
+static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
+{
+	die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE2");
+}
+
+static int pcre2match(struct grep_pat *p, const char *line, const char *eol,
+		regmatch_t *match, int eflags)
+{
+	return 1;
+}
+
+static void free_pcre2_pattern(struct grep_pat *p)
+{
+}
+#endif /* !USE_LIBPCRE2 */
+
 static int is_fixed(const char *s, size_t len)
 {
 	size_t i;
@@ -503,6 +644,11 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
+	if (opt->pcre2) {
+		compile_pcre2_pattern(p, opt);
+		return;
+	}
+
 	err = regcomp(&p->regexp, p->pattern, opt->regflags);
 	if (err) {
 		char errbuf[1024];
@@ -857,6 +1003,8 @@ void free_grep_patterns(struct grep_opt *opt)
 				kwsfree(p->kws);
 			else if (p->pcre1_regexp)
 				free_pcre1_regexp(p);
+			else if (p->pcre2_pattern)
+				free_pcre2_pattern(p);
 			else
 				regfree(&p->regexp);
 			free(p->pattern);
@@ -937,6 +1085,8 @@ static int patmatch(struct grep_pat *p, char *line, char *eol,
 		hit = !fixmatch(p, line, eol, match);
 	else if (p->pcre1_regexp)
 		hit = !pcre1match(p, line, eol, match, eflags);
+	else if (p->pcre2_pattern)
+		hit = !pcre2match(p, line, eol, match, eflags);
 	else
 		hit = !regexec_buf(&p->regexp, line, eol - line, 1, match,
 				   eflags);
diff --git a/grep.h b/grep.h
index 29e20bf837..dd16b5fd53 100644
--- a/grep.h
+++ b/grep.h
@@ -11,6 +11,16 @@ typedef int pcre;
 typedef int pcre_extra;
 typedef int pcre_jit_stack;
 #endif
+#ifdef USE_LIBPCRE2
+#define PCRE2_CODE_UNIT_WIDTH 8
+#include <pcre2.h>
+#else
+typedef int pcre2_code;
+typedef int pcre2_match_data;
+typedef int pcre2_compile_context;
+typedef int pcre2_match_context;
+typedef int pcre2_jit_stack;
+#endif
 #include "kwset.h"
 #include "thread-utils.h"
 #include "userdiff.h"
@@ -54,6 +64,12 @@ struct grep_pat {
 	pcre_extra *pcre1_extra_info;
 	pcre_jit_stack *pcre1_jit_stack;
 	const unsigned char *pcre1_tables;
+	pcre2_code *pcre2_pattern;
+	pcre2_match_data *pcre2_match_data;
+	pcre2_compile_context *pcre2_ccontext;
+	pcre2_match_context *pcre2_match_context;
+	pcre2_jit_stack *pcre2_jit_stack;
+	int pcre2_jit_on;
 	kwset_t kws;
 	unsigned fixed:1;
 	unsigned ignore_case:1;
@@ -73,7 +89,9 @@ enum grep_pattern_type {
 	GREP_PATTERN_TYPE_BRE,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
-	GREP_PATTERN_TYPE_PCRE1
+	GREP_PATTERN_TYPE_PCRE,
+	GREP_PATTERN_TYPE_PCRE1,
+	GREP_PATTERN_TYPE_PCRE2
 };
 
 struct grep_expr {
@@ -117,6 +135,7 @@ struct grep_opt {
 	int extended;
 	int use_reflog_filter;
 	int pcre1;
+	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
diff --git a/revision.c b/revision.c
index 7a10a8570a..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1996,7 +1996,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
 		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
 	} else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
-		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE1;
+		revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
 	} else if (!strcmp(arg, "--all-match")) {
 		revs->grep_filter.all_match = 1;
 	} else if (!strcmp(arg, "--invert-grep")) {
diff --git a/t/README b/t/README
index a90cb62583..547b06e700 100644
--- a/t/README
+++ b/t/README
@@ -808,6 +808,18 @@ use these, and "test_set_prereq" for how to define your own.
    Git was compiled with support for PCRE. Wrap any tests
    that use git-grep --perl-regexp or git-grep -P in these.
 
+ - LIBPCRE1
+
+   Git was compiled with PCRE v1 support via
+   USE_LIBPCRE=YesPlease. Wrap any PCRE using tests that for some
+   reason need v1 of the PCRE library instead of v2 in these.
+
+ - LIBPCRE2
+
+   Git was compiled with PCRE v2 support via
+   USE_LIBPCRE2=YesPlease. Wrap any PCRE using tests that for some
+   reason need v2 of the PCRE library instead of v1 in these.
+
  - CASE_INSENSITIVE_FS
 
    Test is run on a case insensitive file system.
diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
index 5ae42ceccc..70c5a5ca32 100755
--- a/t/perf/p7820-grep-engines.sh
+++ b/t/perf/p7820-grep-engines.sh
@@ -7,7 +7,7 @@ test_description="Comparison of git-grep's regex engines"
 test_perf_large_repo
 test_checkout_worktree
 
-for engine in extended pcre1
+for engine in extended pcre1 pcre2
 do
 	# Patterns stolen from http://sljit.sourceforge.net/pcre.html
 	for pattern in \
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index f929b447e9..bae7e524b9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1063,6 +1063,16 @@ test_expect_success PCRE 'grep -P pattern' '
 	test_cmp expected actual
 '
 
+test_expect_success LIBPCRE1 'grep libpcre1 pattern' '
+	git -c grep.patternType=pcre1 grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success LIBPCRE2 'grep libpcre2 pattern' '
+	git -c grep.patternType=pcre2 grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'grep pattern with grep.extendedRegexp=true' '
 	>empty &&
 	test_must_fail git -c grep.extendedregexp=true \
@@ -1512,14 +1522,28 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	test_cmp expected actual
 '
 
-test_expect_success PCRE "grep with grep.patternType synonyms perl/pcre/pcre1" '
+test_expect_success PCRE "grep with grep.patternType synonyms perl/pcre" '
 	echo "#include <stdio.h>" >expected &&
 	git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" >actual &&
 	test_cmp expected actual &&
 	git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" >actual &&
-	test_cmp expected actual &&
-	git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual &&
 	test_cmp expected actual
 '
 
+test_expect_success LIBPCRE1 "grep with grep.patternType=pcre1" '
+	echo "#include <stdio.h>" >expected &&
+	git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual &&
+	test_cmp expected actual &&
+	test_must_fail git -c grep.patternType=pcre1 grep "foo(?+bar)" 2>error &&
+	test_i18ngrep -q "digit expected after" error
+'
+
+test_expect_success LIBPCRE2 "grep with grep.patternType=pcre2" '
+	echo "#include <stdio.h>" >expected &&
+	git -c grep.patternType=pcre2 grep -h --no-line-number "st(?=dio)" >actual &&
+	test_cmp expected actual &&
+	test_must_fail git -c grep.patternType=pcre2 grep "foo(?+bar)" 2>error &&
+	test_i18ngrep -q "digit expected after" error
+'
+
 test_done
diff --git a/t/t7813-grep-icase-iso.sh b/t/t7813-grep-icase-iso.sh
index 701e08a8e5..e16570690c 100755
--- a/t/t7813-grep-icase-iso.sh
+++ b/t/t7813-grep-icase-iso.sh
@@ -11,9 +11,12 @@ test_expect_success GETTEXT_ISO_LOCALE 'setup' '
 	export LC_ALL
 '
 
-test_expect_success GETTEXT_ISO_LOCALE,PCRE 'grep pcre string' '
-	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
-	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
-'
+for pcrev in 1 2
+do
+	test_expect_success GETTEXT_ISO_LOCALE,LIBPCRE$pcrev "grep -i with i18n string using libpcre$pcrev" "
+		git -c grep.patternType=pcre$pcrev grep -i \"TILRAUN: H.lló Heimur!\" &&
+		git -c grep.patternType=pcre$pcrev grep -i \"TILRAUN: H.LLÓ HEIMUR!\"
+	"
+done
 
 test_done
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index dc6cf771ec..7a8a48493b 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -361,8 +361,16 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 		test_cmp expect actual &&
 		git -c grep.patternType=pcre grep --recurse-submodules -e "(.|.)[\d]" >actual &&
 		test_cmp expect actual &&
-		git -c grep.patternType=pcre1 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
-		test_cmp expect actual
+		if test_have_prereq LIBPCRE1
+		then
+			git -c grep.patternType=pcre1 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+			test_cmp expect actual
+		fi &&
+		if test_have_prereq LIBPCRE2
+		then
+			git -c grep.patternType=pcre2 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+			test_cmp expect actual
+		fi
 	fi
 '
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index e5cfbcc36b..6f873de3e7 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,9 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1$USE_LIBPCRE2" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
+test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 18/19] grep: remove support for concurrent use of both PCRE v1 & v2
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (16 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 17/19] grep: add support for PCRE v2 Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  2017-04-25 21:05 ` [PATCH v4 19/19] Makefile & configure: make PCRE v2 the default PCRE implementation Ævar Arnfjörð Bjarmason
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Remove the support for concurrently using PCRE v1 & v2 by compiling
Git with support for both at the same time.

Having access to both at runtime via grep.patternType=[pcre1|pcre2]
makes it easier for the developer hacking on the PCRE implementations
to test them concurrently, but adds confusion for everyone else,
particularly Git users who have no reason to concurrently use both
libraries.

Now either USE_LIBPCRE1=YesPlease (or its alias USE_LIBPCRE) or
USE_LIBPCRE2=YesPlease can be supplied when building git, but
providing both will yield an error, similarly providing both
--with-libpcre1 & --with-libpcre2 to the configure script will produce
an error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config.txt           |  7 ----
 Makefile                           | 36 +++++++++----------
 builtin/grep.c                     |  3 --
 configure.ac                       | 72 +++++++++++++++++++++-----------------
 grep.c                             | 19 +---------
 grep.h                             |  4 +--
 t/README                           | 12 -------
 t/perf/p7820-grep-engines.sh       |  2 +-
 t/t7810-grep.sh                    | 30 +---------------
 t/t7814-grep-recurse-submodules.sh | 14 +-------
 t/test-lib.sh                      |  5 ++-
 11 files changed, 65 insertions(+), 139 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a5fc482495..475e874d51 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,13 +1624,6 @@ grep.patternType::
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
 	value 'default' will return to the default matching behavior.
-+
-The 'perl' and 'pcre' values are synonyms. Depending on which PCRE
-library Git was compiled with either or both of 'pcre1' and 'pcre2'
-might also be available.
-+
-If both are available Git currently defaults to 'pcre1', but this
-might change in future versions.
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
diff --git a/Makefile b/Makefile
index afdde49cda..a792f206b9 100644
--- a/Makefile
+++ b/Makefile
@@ -29,16 +29,13 @@ all::
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
-# /foo/bar/include and /foo/bar/lib directories.
+# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
+# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
+# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
+# default in future releases.
 #
-# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
-# commands such as log and grep offer runtime options to use
-# Perl-compatible regular expressions instead of standard or extended
-# POSIX regular expressions.
-#
-# Define LIBPCRE2DIR=/foo/bar if your libpcre2 header and library
-# files are in /foo/bar/include and /foo/bar/lib directories.
+# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
+# /foo/bar/include and /foo/bar/lib directories.
 #
 # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
 #
@@ -1093,24 +1090,27 @@ ifdef NO_LIBGEN_H
 	COMPAT_OBJS += compat/basename.o
 endif
 
-ifdef USE_LIBPCRE
-	BASIC_CFLAGS += -DUSE_LIBPCRE1
-	ifdef LIBPCREDIR
-		BASIC_CFLAGS += -I$(LIBPCREDIR)/include
-		EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
+USE_LIBPCRE1 ?= $(USE_LIBPCRE)
+
+ifneq (,$(USE_LIBPCRE1))
+	ifdef USE_LIBPCRE2
+$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not both!)
 	endif
+
+	BASIC_CFLAGS += -DUSE_LIBPCRE1
 	EXTLIBS += -lpcre
 endif
 
 ifdef USE_LIBPCRE2
 	BASIC_CFLAGS += -DUSE_LIBPCRE2
-	ifdef LIBPCRE2DIR
-		BASIC_CFLAGS += -I$(LIBPCRE2DIR)/include
-		EXTLIBS += -L$(LIBPCRE2DIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCR2EDIR)/$(lib)
-	endif
 	EXTLIBS += -lpcre2-8
 endif
 
+ifdef LIBPCREDIR
+	BASIC_CFLAGS += -I$(LIBPCREDIR)/include
+	EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
+endif
+
 ifdef HAVE_ALLOCA_H
 	BASIC_CFLAGS += -DHAVE_ALLOCA_H
 endif
diff --git a/builtin/grep.c b/builtin/grep.c
index 178b10aa6f..be3dbd6957 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,9 +495,6 @@ static void compile_submodule_options(const struct grep_opt *opt,
 		break;
 	case GREP_PATTERN_TYPE_UNSPECIFIED:
 		break;
-	case GREP_PATTERN_TYPE_PCRE1:
-	case GREP_PATTERN_TYPE_PCRE2:
-		die("BUG: Command-line option for pcre1 or pcre2 added without updating switch statement");
 	default:
 		die("BUG: Added a new grep pattern type without updating switch statement");
 	}
diff --git a/configure.ac b/configure.ac
index 7ceb22ed03..11d083fbe0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -255,47 +255,63 @@ GIT_PARSE_WITH([openssl]))
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
+# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
+# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
+# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
+# default in future releases.
+#
+# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
 #
 AC_ARG_WITH(libpcre,
-AS_HELP_STRING([--with-libpcre],[support Perl-compatible regexes via libpcre1 (default is NO)])
-AS_HELP_STRING([],           [ARG can be also prefix for libpcre1 library and headers]),
+AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre1]),
     if test "$withval" = "no"; then
-	USE_LIBPCRE=
+	USE_LIBPCRE1=
     elif test "$withval" = "yes"; then
-	USE_LIBPCRE=YesPlease
+	USE_LIBPCRE1=YesPlease
     else
-	USE_LIBPCRE=YesPlease
+	USE_LIBPCRE1=YesPlease
 	LIBPCREDIR=$withval
 	AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR])
-        dnl USE_LIBPCRE can still be modified below, so don't substitute
+        dnl USE_LIBPCRE1 can still be modified below, so don't substitute
+        dnl it yet.
+	GIT_CONF_SUBST([LIBPCREDIR])
+    fi)
+
+AC_ARG_WITH(libpcre1,
+AS_HELP_STRING([--with-libpcre1],[support Perl-compatible regexes via libpcre1 (default is NO)])
+AS_HELP_STRING([],           [ARG can be also prefix for libpcre library and headers]),
+    if test "$withval" = "no"; then
+	USE_LIBPCRE1=
+    elif test "$withval" = "yes"; then
+	USE_LIBPCRE1=YesPlease
+    else
+	USE_LIBPCRE1=YesPlease
+	LIBPCREDIR=$withval
+	AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR])
+        dnl USE_LIBPCRE1 can still be modified below, so don't substitute
         dnl it yet.
 	GIT_CONF_SUBST([LIBPCREDIR])
     fi)
 
-# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
-# commands such as log and grep offer runtime options to use
-# Perl-compatible regular expressions instead of standard or extended
-# POSIX regular expressions.
-#
-# Define LIBPCR2EDIR=/foo/bar if your libpcre2 header and library
-# files are in /foo/bar/include and /foo/bar/lib directories.
-#
 AC_ARG_WITH(libpcre2,
 AS_HELP_STRING([--with-libpcre2],[support Perl-compatible regexes via libpcre2 (default is NO)])
 AS_HELP_STRING([],           [ARG can be also prefix for libpcre library and headers]),
+    if test -n "$USE_LIBPCRE1"; then
+        AC_MSG_ERROR([Only supply one of --with-libpcre1 or --with-libpcre2!])
+    fi
+
     if test "$withval" = "no"; then
 	USE_LIBPCRE2=
     elif test "$withval" = "yes"; then
 	USE_LIBPCRE2=YesPlease
     else
 	USE_LIBPCRE2=YesPlease
-	LIBPCRE2DIR=$withval
-	AC_MSG_NOTICE([Setting LIBPCRE2DIR to $LIBPCRE2DIR])
+	LIBPCREDIR=$withval
+	AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR])
         dnl USE_LIBPCRE2 can still be modified below, so don't substitute
         dnl it yet.
-	GIT_CONF_SUBST([LIBPCRE2DIR])
+	GIT_CONF_SUBST([LIBPCREDIR])
     fi)
 #
 # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
@@ -525,13 +541,11 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. Various
-# commands such as log and grep offer runtime options to use
-# Perl-compatible regular expressions instead of standard or extended
-# POSIX regular expressions.
+# Handle the USE_LIBPCRE1 and USE_LIBPCRE2 options potentially set
+# above.
 #
 
-if test -n "$USE_LIBPCRE"; then
+if test -n "$USE_LIBPCRE1"; then
 
 GIT_STASH_FLAGS($LIBPCREDIR)
 
@@ -541,26 +555,20 @@ AC_CHECK_LIB([pcre], [pcre_version],
 
 GIT_UNSTASH_FLAGS($LIBPCREDIR)
 
-GIT_CONF_SUBST([USE_LIBPCRE])
+GIT_CONF_SUBST([USE_LIBPCRE1])
 
 fi
 
-#
-# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
-# commands such as log and grep offer runtime options to use
-# Perl-compatible regular expressions instead of standard or extended
-# POSIX regular expressions.
-#
 
 if test -n "$USE_LIBPCRE2"; then
 
-GIT_STASH_FLAGS($LIBPCRE2DIR)
+GIT_STASH_FLAGS($LIBPCREDIR)
 
 AC_CHECK_LIB([pcre2-8], [pcre2_config_8],
 [USE_LIBPCRE2=YesPlease],
 [USE_LIBPCRE2=])
 
-GIT_UNSTASH_FLAGS($LIBPCRE2DIR)
+GIT_UNSTASH_FLAGS($LIBPCREDIR)
 
 GIT_CONF_SUBST([USE_LIBPCRE2])
 
diff --git a/grep.c b/grep.c
index dd5dff08f8..18a7f155a6 100644
--- a/grep.c
+++ b/grep.c
@@ -60,13 +60,8 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 		return GREP_PATTERN_TYPE_ERE;
 	else if (!strcmp(arg, "fixed"))
 		return GREP_PATTERN_TYPE_FIXED;
-	else if (!strcmp(arg, "perl") ||
-		 !strcmp(arg, "pcre"))
+	else if (!strcmp(arg, "perl"))
 		return GREP_PATTERN_TYPE_PCRE;
-	else if (!strcmp(arg, "pcre1"))
-		return GREP_PATTERN_TYPE_PCRE1;
-	else if (!strcmp(arg, "pcre2"))
-		return GREP_PATTERN_TYPE_PCRE2;
 	die("bad %s argument: %s", opt, arg);
 }
 
@@ -213,18 +208,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 		opt->pcre2 = 1;
 #endif
 		break;
-
-	case GREP_PATTERN_TYPE_PCRE1:
-		opt->fixed = 0;
-		opt->pcre1 = 1;
-		opt->pcre2 = 0;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE2:
-		opt->fixed = 0;
-		opt->pcre1 = 0;
-		opt->pcre2 = 1;
-		break;
 	}
 }
 
diff --git a/grep.h b/grep.h
index dd16b5fd53..1b56d327e2 100644
--- a/grep.h
+++ b/grep.h
@@ -89,9 +89,7 @@ enum grep_pattern_type {
 	GREP_PATTERN_TYPE_BRE,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
-	GREP_PATTERN_TYPE_PCRE,
-	GREP_PATTERN_TYPE_PCRE1,
-	GREP_PATTERN_TYPE_PCRE2
+	GREP_PATTERN_TYPE_PCRE
 };
 
 struct grep_expr {
diff --git a/t/README b/t/README
index 547b06e700..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -808,18 +808,6 @@ use these, and "test_set_prereq" for how to define your own.
    Git was compiled with support for PCRE. Wrap any tests
    that use git-grep --perl-regexp or git-grep -P in these.
 
- - LIBPCRE1
-
-   Git was compiled with PCRE v1 support via
-   USE_LIBPCRE=YesPlease. Wrap any PCRE using tests that for some
-   reason need v1 of the PCRE library instead of v2 in these.
-
- - LIBPCRE2
-
-   Git was compiled with PCRE v2 support via
-   USE_LIBPCRE2=YesPlease. Wrap any PCRE using tests that for some
-   reason need v2 of the PCRE library instead of v1 in these.
-
  - CASE_INSENSITIVE_FS
 
    Test is run on a case insensitive file system.
diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
index 70c5a5ca32..96d993ec7d 100755
--- a/t/perf/p7820-grep-engines.sh
+++ b/t/perf/p7820-grep-engines.sh
@@ -7,7 +7,7 @@ test_description="Comparison of git-grep's regex engines"
 test_perf_large_repo
 test_checkout_worktree
 
-for engine in extended pcre1 pcre2
+for engine in extended perl
 do
 	# Patterns stolen from http://sljit.sourceforge.net/pcre.html
 	for pattern in \
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index bae7e524b9..f5f1b61e02 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1063,16 +1063,6 @@ test_expect_success PCRE 'grep -P pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE1 'grep libpcre1 pattern' '
-	git -c grep.patternType=pcre1 grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
-	test_cmp expected actual
-'
-
-test_expect_success LIBPCRE2 'grep libpcre2 pattern' '
-	git -c grep.patternType=pcre2 grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
-	test_cmp expected actual
-'
-
 test_expect_success 'grep pattern with grep.extendedRegexp=true' '
 	>empty &&
 	test_must_fail git -c grep.extendedregexp=true \
@@ -1522,28 +1512,10 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	test_cmp expected actual
 '
 
-test_expect_success PCRE "grep with grep.patternType synonyms perl/pcre" '
+test_expect_success PCRE "grep with grep.patternType synonyms perl" '
 	echo "#include <stdio.h>" >expected &&
 	git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" >actual &&
-	test_cmp expected actual &&
-	git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE1 "grep with grep.patternType=pcre1" '
-	echo "#include <stdio.h>" >expected &&
-	git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual &&
-	test_cmp expected actual &&
-	test_must_fail git -c grep.patternType=pcre1 grep "foo(?+bar)" 2>error &&
-	test_i18ngrep -q "digit expected after" error
-'
-
-test_expect_success LIBPCRE2 "grep with grep.patternType=pcre2" '
-	echo "#include <stdio.h>" >expected &&
-	git -c grep.patternType=pcre2 grep -h --no-line-number "st(?=dio)" >actual &&
-	test_cmp expected actual &&
-	test_must_fail git -c grep.patternType=pcre2 grep "foo(?+bar)" 2>error &&
-	test_i18ngrep -q "digit expected after" error
-'
-
 test_done
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 7a8a48493b..ef658b7899 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -358,19 +358,7 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 		EOF
 		test_cmp expect actual &&
 		git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual &&
-		test_cmp expect actual &&
-		git -c grep.patternType=pcre grep --recurse-submodules -e "(.|.)[\d]" >actual &&
-		test_cmp expect actual &&
-		if test_have_prereq LIBPCRE1
-		then
-			git -c grep.patternType=pcre1 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
-			test_cmp expect actual
-		fi &&
-		if test_have_prereq LIBPCRE2
-		then
-			git -c grep.patternType=pcre2 grep --recurse-submodules -e "(.|.)[\d]" >actual &&
-			test_cmp expect actual
-		fi
+		test_cmp expect actual
 	fi
 '
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6f873de3e7..969f931ebf 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,9 +1010,8 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE1$USE_LIBPCRE2" && test_set_prereq PCRE
-test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
-test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 19/19] Makefile & configure: make PCRE v2 the default PCRE implementation
  2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
                   ` (17 preceding siblings ...)
  2017-04-25 21:05 ` [PATCH v4 18/19] grep: remove support for concurrent use of both PCRE v1 & v2 Ævar Arnfjörð Bjarmason
@ 2017-04-25 21:05 ` Ævar Arnfjörð Bjarmason
  18 siblings, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-25 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the
Makefile & configure script, respectively, to mean use PCRE v2, not
PCRE v1.

The legacy library previously available via those options is still
available on request via USE_LIBPCRE1=YesPlease or
--with-libpcre1. The existing USE_LIBPCRE2=YesPlease & --with-libpcre2
still explicitly ask for v2.

The v2 PCRE is stable & end-user compatible, all this change does is
change the default. Someone building a new git is likely to also have
packaged PCRE v2 sometime in the last 2 years since it was released.
If not they can choose to use the legacy v2 library by making the
trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2.

New releases of PCRE v2 are already faster than PCRE v1, and not only
is all significant development is happening on v2, but bugs in
reported against v1 have started getting WONTFIX'd asking users to
just upgrade to v2.

So it makes sense to give our downstream distributors a nudge to
switch over to it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile     | 21 ++++++++++-----------
 configure.ac | 18 +++++++++---------
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/Makefile b/Makefile
index a792f206b9..42b09f9632 100644
--- a/Makefile
+++ b/Makefile
@@ -29,10 +29,10 @@ all::
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
-# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
-# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
-# default in future releases.
+# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous
+# versions it meant the same thing USE_LIBPCRE1 does now. Define
+# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of
+# the PCRE library.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -1090,18 +1090,17 @@ ifdef NO_LIBGEN_H
 	COMPAT_OBJS += compat/basename.o
 endif
 
-USE_LIBPCRE1 ?= $(USE_LIBPCRE)
-
-ifneq (,$(USE_LIBPCRE1))
-	ifdef USE_LIBPCRE2
-$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not both!)
-	endif
+USE_LIBPCRE2 ?= $(USE_LIBPCRE)
 
+ifdef USE_LIBPCRE1
 	BASIC_CFLAGS += -DUSE_LIBPCRE1
 	EXTLIBS += -lpcre
 endif
 
-ifdef USE_LIBPCRE2
+ifneq (,$(USE_LIBPCRE2))
+	ifdef USE_LIBPCRE1
+$(error Only set USE_LIBPCRE2 (or its alias USE_LIBPCRE) or USE_LIBPCRE1, not both!)
+	endif
 	BASIC_CFLAGS += -DUSE_LIBPCRE2
 	EXTLIBS += -lpcre2-8
 endif
diff --git a/configure.ac b/configure.ac
index 11d083fbe0..f9659daeb7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -255,25 +255,25 @@ GIT_PARSE_WITH([openssl]))
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
-# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
-# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
-# default in future releases.
+# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous
+# versions it meant the same thing USE_LIBPCRE1 does now. Define
+# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of
+# the PCRE library.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
 #
 AC_ARG_WITH(libpcre,
-AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre1]),
+AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre2]),
     if test "$withval" = "no"; then
-	USE_LIBPCRE1=
+	USE_LIBPCRE2=
     elif test "$withval" = "yes"; then
-	USE_LIBPCRE1=YesPlease
+	USE_LIBPCRE2=YesPlease
     else
-	USE_LIBPCRE1=YesPlease
+	USE_LIBPCRE2=YesPlease
 	LIBPCREDIR=$withval
 	AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR])
-        dnl USE_LIBPCRE1 can still be modified below, so don't substitute
+        dnl USE_LIBPCRE2 can still be modified below, so don't substitute
         dnl it yet.
 	GIT_CONF_SUBST([LIBPCREDIR])
     fi)
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-04-25 21:05 ` [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
@ 2017-04-26  4:50   ` Junio C Hamano
  2017-04-26  5:29   ` Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2017-04-26  4:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Remove redundant assignments to the "regflags" variable. There are no
> code paths that have previously set the regflags to anything, and
> certainly not to `|= REG_EXTENDED`.
>
> This code gave the impression that it had to reset its environment,
> but it doesn't. This dates back to the initial introduction of
> git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).

As long as nobody calls grep_commit_pattern_type() more than once
while interpreting something like "grep -E -G -P -F -G" (i.e. "I
cannot quite decide which pattern type I want to give..."), this
change should be safe.

This made me wonder if we should be doing "opt->regflags =
REG_EXTENDED" for ERE from the same "this is only called once"
reasoning, but regflags is a collection of bits, and presumably bits
other than REG_EXTENDED can be set before the control reaches
grep_commit_pattern_type(), so that one must stay as-is, i.e.
"opt->regflags |= REG_EXTENDED".

Thanks.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-04-25 21:05 ` [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
  2017-04-26  4:50   ` Junio C Hamano
@ 2017-04-26  5:29   ` Junio C Hamano
  2017-04-26  7:48     ` [PATCH v5 " Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2017-04-26  5:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> @@ -417,7 +415,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  	int regflags;
>  
>  	basic_regex_quote_buf(&sb, p->pattern);
> -	regflags = opt->regflags & ~REG_EXTENDED;
>  	if (opt->ignore_case)
>  		regflags |= REG_ICASE;
>  	err = regcomp(&p->regexp, sb.buf, regflags);

This hunk is wrong.  Now the use of regflags we see in the post
context is mixing ICASE bit into an uninitialized garbage on the
stack.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v5 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-04-26  5:29   ` Junio C Hamano
@ 2017-04-26  7:48     ` Ævar Arnfjörð Bjarmason
  2017-04-27  0:51       ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-26  7:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Remove redundant assignments to the "regflags" variable. There are no
code paths that have previously set the regflags to anything, and
certainly not to `|= REG_EXTENDED`.

This code gave the impression that it had to reset its environment,
but it doesn't. This dates back to the initial introduction of
git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Wed, Apr 26, 2017 at 7:29 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> @@ -417,7 +415,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>>       int regflags;
>>
>>       basic_regex_quote_buf(&sb, p->pattern);
>> -     regflags = opt->regflags & ~REG_EXTENDED;
>>       if (opt->ignore_case)
>>               regflags |= REG_ICASE;
>>       err = regcomp(&p->regexp, sb.buf, regflags);
>
> This hunk is wrong.  Now the use of regflags we see in the post
> context is mixing ICASE bit into an uninitialized garbage on the
> stack.

Oops, sorry about that. Here's a fixed version. Just sending a v5 for
this, not the entire rest of the series. If you'd like to grab it in
.git form it's github.com/avar/git:avar/pcre2-5, this is the only
change from v4.


 grep.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/grep.c b/grep.c
index 59ae7809f2..bf6c2494fd 100644
--- a/grep.c
+++ b/grep.c
@@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
@@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_PCRE:
@@ -414,10 +412,9 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
 {
 	struct strbuf sb = STRBUF_INIT;
 	int err;
-	int regflags;
+	int regflags = opt->regflags;
 
 	basic_regex_quote_buf(&sb, p->pattern);
-	regflags = opt->regflags & ~REG_EXTENDED;
 	if (opt->ignore_case)
 		regflags |= REG_ICASE;
 	err = regcomp(&p->regexp, sb.buf, regflags);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-04-26  7:48     ` [PATCH v5 " Ævar Arnfjörð Bjarmason
@ 2017-04-27  0:51       ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2017-04-27  0:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Zoltán Herczeg, Brandon Williams

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>>> @@ -417,7 +415,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>>>       int regflags;
>>>
>>>       basic_regex_quote_buf(&sb, p->pattern);
>>> -     regflags = opt->regflags & ~REG_EXTENDED;
>>>       if (opt->ignore_case)
>>>               regflags |= REG_ICASE;
>>>       err = regcomp(&p->regexp, sb.buf, regflags);
>>
>> This hunk is wrong.  Now the use of regflags we see in the post
>> context is mixing ICASE bit into an uninitialized garbage on the
>> stack.
>
> Oops, sorry about that. Here's a fixed version. Just sending a v5 for
> this, not the entire rest of the series. If you'd like to grab it in
> .git form it's github.com/avar/git:avar/pcre2-5, this is the only
> change from v4.

Thanks; I think I already queued an equivalent SQUASH??? patch
queued at the tip before merging it to 'pu', so it hopefully would
be the matter of running "rebase -i" and not making mistake when
instructing the command where to squash it in ;-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-04-27  0:51 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-25 21:05 [PATCH v4 00/19] PCRE v1 improvements & PCRE v2 support Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 01/19] grep: amend submodule recursion test in preparation for rx engine testing Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 02/19] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 03/19] grep: submodule-related case statements should die if new fields are added Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 04/19] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 05/19] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
2017-04-26  4:50   ` Junio C Hamano
2017-04-26  5:29   ` Junio C Hamano
2017-04-26  7:48     ` [PATCH v5 " Ævar Arnfjörð Bjarmason
2017-04-27  0:51       ` Junio C Hamano
2017-04-25 21:05 ` [PATCH v4 06/19] Makefile & configure: reword outdated comment about PCRE Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 07/19] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 08/19] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 09/19] log: add -P as a synonym for --perl-regexp Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 10/19] grep & rev-list doc: stop promising libpcre " Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 11/19] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 12/19] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 13/19] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 14/19] grep: change the internal PCRE code & header " Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 15/19] perf: add a performance comparison test of grep -E and -P Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 16/19] grep: add support for the PCRE v1 JIT API Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 17/19] grep: add support for PCRE v2 Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 18/19] grep: remove support for concurrent use of both PCRE v1 & v2 Ævar Arnfjörð Bjarmason
2017-04-25 21:05 ` [PATCH v4 19/19] Makefile & configure: make PCRE v2 the default PCRE implementation Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).