git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Git BUG 2.37.3 and 2.38.0
@ 2022-10-10 14:33 orygaw
  2022-10-10 15:40 ` rsbecker
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: orygaw @ 2022-10-10 14:33 UTC (permalink / raw)
  To: git@vger.kernel.org

Hello, 


I found a bug with GIT with version 2.37.3 and 2.38.0

My system FreeBSD 13.0-RELEASE-p11:

command:

git log -1 --invert-grep


* thread #1, name = 'git', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00000000004fe580 git`free_pattern_expr(x=0x0000000000000000) at grep.c:755:13
    frame #1: 0x00000000004fe501 git`free_grep_patterns(opt=0x00007fffffffdc20) at grep.c:795:2
    frame #2: 0x00000000005edd16 git`release_revisions(revs=0x00007fffffffda58) at revision.c:3030:2
    frame #3: 0x00000000003826c8 git`cmd_log_deinit(ret=0, rev=0x00007fffffffda58) at log.c:353:2
    frame #4: 0x00000000003845ae git`cmd_log(argc=3, argv=0x00007fffffffe960, prefix=0x0000000000000000) at log.c:883:9
    frame #5: 0x00000000002f8e8c git`run_builtin(p=0x00000000006a4b58, argc=3, argv=0x00007fffffffe960) at git.c:466:11
    frame #6: 0x00000000002f7783 git`handle_builtin(argc=3, argv=0x00007fffffffe960) at git.c:721:3
    frame #7: 0x00000000002f87c6 git`run_argv(argcp=0x00007fffffffe8a4, argv=0x00007fffffffe898) at git.c:788:4
    frame #8: 0x00000000002f7521 git`cmd_main(argc=3, argv=0x00007fffffffe960) at git.c:921:19
    frame #9: 0x000000000042b53a git`main(argc=4, argv=0x00007fffffffe958) at common-main.c:56:11
    frame #10: 0x00000000002f6dd0 git`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1_c.c:75:7


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Git BUG 2.37.3 and 2.38.0
  2022-10-10 14:33 Git BUG 2.37.3 and 2.38.0 orygaw
@ 2022-10-10 15:40 ` rsbecker
  2022-10-10 15:48   ` Taylor Blau
  2022-10-10 16:57 ` [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault Ævar Arnfjörð Bjarmason
  2022-10-10 17:41 ` [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr() Taylor Blau
  2 siblings, 1 reply; 18+ messages in thread
From: rsbecker @ 2022-10-10 15:40 UTC (permalink / raw)
  To: 'orygaw', git

On October 10, 2022 10:34 AM, orygaw
>I found a bug with GIT with version 2.37.3 and 2.38.0
>
>My system FreeBSD 13.0-RELEASE-p11:
>
>command:
>
>git log -1 --invert-grep
>
>
>* thread #1, name = 'git', stop reason = signal SIGSEGV: invalid address (fault
>address: 0x0)
>  * frame #0: 0x00000000004fe580 git`free_pattern_expr(x=0x0000000000000000)
>at grep.c:755:13
>    frame #1: 0x00000000004fe501 git`free_grep_patterns(opt=0x00007fffffffdc20)
>at grep.c:795:2
>    frame #2: 0x00000000005edd16 git`release_revisions(revs=0x00007fffffffda58)
>at revision.c:3030:2
>    frame #3: 0x00000000003826c8 git`cmd_log_deinit(ret=0,
>rev=0x00007fffffffda58) at log.c:353:2
>    frame #4: 0x00000000003845ae git`cmd_log(argc=3, argv=0x00007fffffffe960,
>prefix=0x0000000000000000) at log.c:883:9
>    frame #5: 0x00000000002f8e8c git`run_builtin(p=0x00000000006a4b58, argc=3,
>argv=0x00007fffffffe960) at git.c:466:11
>    frame #6: 0x00000000002f7783 git`handle_builtin(argc=3,
>argv=0x00007fffffffe960) at git.c:721:3
>    frame #7: 0x00000000002f87c6 git`run_argv(argcp=0x00007fffffffe8a4,
>argv=0x00007fffffffe898) at git.c:788:4
>    frame #8: 0x00000000002f7521 git`cmd_main(argc=3, argv=0x00007fffffffe960)
>at git.c:921:19
>    frame #9: 0x000000000042b53a git`main(argc=4, argv=0x00007fffffffe958) at
>common-main.c:56:11
>    frame #10: 0x00000000002f6dd0 git`_start(ap=<unavailable>,
>cleanup=<unavailable>) at crt1_c.c:75:7

I can confirm a similar situation on Cygwin at 2.38.0.

$ git log -1 --invert-grep
commit 5385d4b84047b3c42cde36f1fab83ac57df17ca8 (HEAD -> topic, origin/topic)
Author: Me <yesme@domain.com>
Date:   Sun Oct 9 22:26:47 2022 -0400

    rms.yaml: add path variable definition
Segmentation fault (core dumped)

My history is extensive.
-Randall


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Git BUG 2.37.3 and 2.38.0
  2022-10-10 15:40 ` rsbecker
@ 2022-10-10 15:48   ` Taylor Blau
  2022-10-10 17:02     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 15:48 UTC (permalink / raw)
  To: rsbecker; +Cc: 'orygaw', Ævar Arnfjörð Bjarmason, git

On Mon, Oct 10, 2022 at 11:40:18AM -0400, rsbecker@nexbridge.com wrote:
> I can confirm a similar situation on Cygwin at 2.38.0.
>
> $ git log -1 --invert-grep
> commit 5385d4b84047b3c42cde36f1fab83ac57df17ca8 (HEAD -> topic, origin/topic)
> Author: Me <yesme@domain.com>
> Date:   Sun Oct 9 22:26:47 2022 -0400
>
>     rms.yaml: add path variable definition
> Segmentation fault (core dumped)

Thanks, both. I can reproduce it, too. A quick bisection points to
f41fb662f5 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13), which appears probable.

I'll push up a patch shortly which fixes this issue.


Thanks,
Taylor

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault
  2022-10-10 14:33 Git BUG 2.37.3 and 2.38.0 orygaw
  2022-10-10 15:40 ` rsbecker
@ 2022-10-10 16:57 ` Ævar Arnfjörð Bjarmason
  2022-10-10 17:34   ` Junio C Hamano
  2022-10-11  9:48   ` [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", " Ævar Arnfjörð Bjarmason
  2022-10-10 17:41 ` [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr() Taylor Blau
  2 siblings, 2 replies; 18+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-10 16:57 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, orygaw, rsbecker, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Neither the "--invert-grep" option added in [1] nor the earlier
"--all-match" option added in [2] were intended to be used
stand-alone.

But due to how the built-in and the revision API interacted those
options without the corresponding --grep would be ignored.

Then in f41fb662f57 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13) this turned into a segfault, as we'd
attempt to free() the non-existing --grep patterns.

Arguably it makes more sense to add this check to
compile_grep_patterns(), since it's possible to use the C API in the
same way and trigger this segfault. But in practice the revision.c API
is the only user of "no_body_match", and by placing the check here we
can more sensibly emit a message that assumes that the user used
"--invert-grep" without "--grep".

1. 22dfa8a23de (log: teach --invert-grep option, 2015-01-12)
2. 0ab7befa31d (grep --all-match, 2006-09-27)

Reported-by: orygaw <orygaw@protonmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 revision.c      | 6 ++++++
 t/t4202-log.sh  | 4 ++++
 t/t7810-grep.sh | 4 ++++
 3 files changed, 14 insertions(+)

diff --git a/revision.c b/revision.c
index 36e31942cee..a55ead48448 100644
--- a/revision.c
+++ b/revision.c
@@ -2986,6 +2986,12 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 		die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph");
 	if (!revs->reflog_info && revs->grep_filter.use_reflog_filter)
 		die(_("the option '%s' requires '%s'"), "--grep-reflog", "--walk-reflogs");
+	if (!revs->grep_filter.pattern_expression) {
+		if (revs->grep_filter.no_body_match)
+			die(_("the option '%s' requires '%s'"), "--invert-grep", "--grep");
+		if (revs->grep_filter.all_match)
+			die(_("the option '%s' requires '%s'"), "--all-match", "--grep");
+	}
 
 	if (revs->line_level_traverse &&
 	    (revs->diffopt.output_format & ~(DIFF_FORMAT_PATCH | DIFF_FORMAT_NO_OUTPUT)))
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index cc15cb4ff62..298678fb7c8 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -249,6 +249,10 @@ test_expect_success 'log --grep' '
 	test_cmp expect actual
 '
 
+test_expect_success 'log --invert-grep usage' '
+	test_expect_code 128 git log --invert-grep
+'
+
 cat > expect << EOF
 second
 initial
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 8eded6ab274..6dd750349e1 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -914,6 +914,10 @@ test_expect_success 'log with multiple --grep uses union' '
 	test_cmp expect actual
 '
 
+test_expect_success 'log --all-match usage' '
+	test_expect_code 128 git log --all-match
+'
+
 test_expect_success 'log --all-match with multiple --grep uses intersection' '
 	git log --all-match --grep=i --grep=r --format=%s >actual &&
 	{
-- 
2.38.0.971.ge79ff6d20e7


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Git BUG 2.37.3 and 2.38.0
  2022-10-10 15:48   ` Taylor Blau
@ 2022-10-10 17:02     ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2022-10-10 17:02 UTC (permalink / raw)
  To: Taylor Blau
  Cc: rsbecker, 'orygaw',
	Ævar Arnfjörð Bjarmason, git

Taylor Blau <me@ttaylorr.com> writes:

> On Mon, Oct 10, 2022 at 11:40:18AM -0400, rsbecker@nexbridge.com wrote:
>> I can confirm a similar situation on Cygwin at 2.38.0.
>>
>> $ git log -1 --invert-grep
>> commit 5385d4b84047b3c42cde36f1fab83ac57df17ca8 (HEAD -> topic, origin/topic)
>> Author: Me <yesme@domain.com>
>> Date:   Sun Oct 9 22:26:47 2022 -0400
>>
>>     rms.yaml: add path variable definition
>> Segmentation fault (core dumped)
>
> Thanks, both. I can reproduce it, too. A quick bisection points to
> f41fb662f5 (revisions API: have release_revisions() release
> "grep_filter", 2022-04-13), which appears probable.
>
> I'll push up a patch shortly which fixes this issue.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault
  2022-10-10 16:57 ` [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault Ævar Arnfjörð Bjarmason
@ 2022-10-10 17:34   ` Junio C Hamano
  2022-10-10 17:45     ` Junio C Hamano
  2022-10-11  9:48   ` [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", " Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2022-10-10 17:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, orygaw, rsbecker, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Neither the "--invert-grep" option added in [1] nor the earlier
> "--all-match" option added in [2] were intended to be used
> stand-alone.

Mostly yes, but with "to be used" -> "to take effect".

"[alias] lga = log --all-match" would be equivalent to "log"
unless the command line starts talking about "--grep", which is
quite handy.

So the real fix would be not to forbid the standalone passing of the
option to the command, but to fix the over-eager freeing of an
unallocated resource, introduced more recently, I would suspect.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr()
  2022-10-10 14:33 Git BUG 2.37.3 and 2.38.0 orygaw
  2022-10-10 15:40 ` rsbecker
  2022-10-10 16:57 ` [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault Ævar Arnfjörð Bjarmason
@ 2022-10-10 17:41 ` Taylor Blau
  2022-10-10 17:41   ` [PATCH 1/2] t4202: demonstrate `git log --invert-grep` segfault Taylor Blau
  2022-10-10 17:41   ` [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr() Taylor Blau
  2 siblings, 2 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 17:41 UTC (permalink / raw)
  To: git; +Cc: orygaw, rsbecker, gitster

This short series demonstrates and fixes a bug whereby a NULL
`struct grep_expr*` argument (such as when running `git log
--invert-grep` without a `--grep` argument) could cause us to segfault
when trying to dereference that argument.

An alternative approach is to disallow `--invert-grep` without a
`--grep` argument. But a more user-friendly approach is to permit this
combination by pretending as if neither argument was given. This patch
series takes the latter approach by teaching free_grep_expr() that
passing a NULL argument should result in a noop, not a segfault.

Taylor Blau (2):
  t4202: demonstrate `git log --invert-grep` segfault
  grep.c: tolerate NULL grep_expr in free_pattern_expr()

 grep.c         | 5 +++--
 t/t4202-log.sh | 6 ++++++
 2 files changed, 9 insertions(+), 2 deletions(-)

-- 
2.37.0.1.g1379af2e9d

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] t4202: demonstrate `git log --invert-grep` segfault
  2022-10-10 17:41 ` [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr() Taylor Blau
@ 2022-10-10 17:41   ` Taylor Blau
  2022-10-10 17:41   ` [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr() Taylor Blau
  1 sibling, 0 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 17:41 UTC (permalink / raw)
  To: git; +Cc: orygaw, rsbecker, gitster

When `--invert-grep` is given without a pattern, `git log` behaves as
normal. But as of f41fb662f5 (revisions API: have release_revisions()
release "grep_filter", 2022-04-13), this doesn't quite work because we
try to dereference the NULL `grep_expr` pointer in
`free_pattern_expr()`, leading to a sgefault.

The subsequent patch will explain the bug, provide a fix, and update
this test to expect success.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t4202-log.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index cc15cb4ff6..e3ec5f5661 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -297,6 +297,12 @@ test_expect_success 'log --invert-grep --grep -i' '
 	fi
 '
 
+test_expect_failure 'log --invert-grep (no --grep)' '
+	git log --pretty="tformat:%s" >expect &&
+	git log --invert-grep --pretty="tformat:%s" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'log --grep option parsing' '
 	echo second >expect &&
 	git log -1 --pretty="tformat:%s" --grep sec >actual &&
-- 
2.37.0.1.g1379af2e9d


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()
  2022-10-10 17:41 ` [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr() Taylor Blau
  2022-10-10 17:41   ` [PATCH 1/2] t4202: demonstrate `git log --invert-grep` segfault Taylor Blau
@ 2022-10-10 17:41   ` Taylor Blau
  2022-10-10 17:54     ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 17:41 UTC (permalink / raw)
  To: git; +Cc: orygaw, rsbecker, gitster

As demonstrated in the previous commit, `git log --invert-grep` without
a `--grep` argument causes a segfault after f41fb662f5 (revisions API:
have release_revisions() release "grep_filter", 2022-04-13).

The segfault occurs in `free_pattern_expr()`, which crashes on trying to
switch on `x->node` when given a NULL pointer. Usually we avoid calling
`free_pattern_expr()` without a pattern as indicated by the `extended`
bit being zero.

But it is possible to get into a state where the `extended` bit is
non-zero, but the `pattern_expression` is still NULL. This happens
because the `--invert-grep` option sets the `no_body_match` bit. When we
call `compile_grep_patterns()`, we set `opt->extended = 1`. But the
`pattern_expression` is left as NULL, since we return with a NULL
`header_expr`.

So when we try to call `free_pattern_expr()`, things go awry, since
`free_grep_patterns()` expects a non-NULL argument.

Instead, teach `free_grep_patterns()` to tolerate a NULL argument
(treating it as a noop), and avoid checking whether or not the
`extended` bit is set, since `free_pattern_expr()` will handle its
argument regardless.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c         | 5 +++--
 t/t4202-log.sh | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/grep.c b/grep.c
index 52a894c989..bcc6e63365 100644
--- a/grep.c
+++ b/grep.c
@@ -752,6 +752,9 @@ void compile_grep_patterns(struct grep_opt *opt)
 
 static void free_pattern_expr(struct grep_expr *x)
 {
+	if (!x)
+		return;
+
 	switch (x->node) {
 	case GREP_NODE_TRUE:
 	case GREP_NODE_ATOM:
@@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
 		free(p);
 	}
 
-	if (!opt->extended)
-		return;
 	free_pattern_expr(opt->pattern_expression);
 }
 
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index e3ec5f5661..44f7ef0ea2 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -297,7 +297,7 @@ test_expect_success 'log --invert-grep --grep -i' '
 	fi
 '
 
-test_expect_failure 'log --invert-grep (no --grep)' '
+test_expect_success 'log --invert-grep (no --grep)' '
 	git log --pretty="tformat:%s" >expect &&
 	git log --invert-grep --pretty="tformat:%s" >actual &&
 	test_cmp expect actual
-- 
2.37.0.1.g1379af2e9d

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault
  2022-10-10 17:34   ` Junio C Hamano
@ 2022-10-10 17:45     ` Junio C Hamano
  2022-10-10 18:48       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2022-10-10 17:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, orygaw, rsbecker, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Neither the "--invert-grep" option added in [1] nor the earlier
>> "--all-match" option added in [2] were intended to be used
>> stand-alone.
>
> Mostly yes, but with "to be used" -> "to take effect".
>
> "[alias] lga = log --all-match" would be equivalent to "log"
> unless the command line starts talking about "--grep", which is
> quite handy.
>
> So the real fix would be not to forbid the standalone passing of the
> option to the command, but to fix the over-eager freeing of an
> unallocated resource, introduced more recently, I would suspect.

On the other hand, I do not think "--invert-grep" is useful in the
same way.  The only usage I can think of is to omit merges by
looking for substring "^Merge", but then we already have a more
robust "--no-merges" option for that purpose.

But both uses the same mechanism, I would say treating them the
same, i.e. silently ignoring these when --grep is not given, would
be the most sensible.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()
  2022-10-10 17:41   ` [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr() Taylor Blau
@ 2022-10-10 17:54     ` Junio C Hamano
  2022-10-10 18:10       ` Taylor Blau
  2022-10-10 18:11       ` Junio C Hamano
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2022-10-10 17:54 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, orygaw, rsbecker

Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/grep.c b/grep.c
> index 52a894c989..bcc6e63365 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -752,6 +752,9 @@ void compile_grep_patterns(struct grep_opt *opt)
>  
>  static void free_pattern_expr(struct grep_expr *x)
>  {
> +	if (!x)
> +		return;
> +
>  	switch (x->node) {
>  	case GREP_NODE_TRUE:
>  	case GREP_NODE_ATOM:

This hunk makes sense, but

> @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
>  		free(p);
>  	}
>  
> -	if (!opt->extended)
> -		return;
>  	free_pattern_expr(opt->pattern_expression);
>  }

I do not know about this one.  We used to avoid freeing, even when
the .pattern_expression member is set, as long as the .extended bit
is not set.  Now we unconditionally try to free it even when the bit
says it does not want to.  Why?

> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index e3ec5f5661..44f7ef0ea2 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -297,7 +297,7 @@ test_expect_success 'log --invert-grep --grep -i' '
>  	fi
>  '
>  
> -test_expect_failure 'log --invert-grep (no --grep)' '
> +test_expect_success 'log --invert-grep (no --grep)' '
>  	git log --pretty="tformat:%s" >expect &&
>  	git log --invert-grep --pretty="tformat:%s" >actual &&
>  	test_cmp expect actual

Especially for something this small, doing the "failing test first
and then fix with flipping the test to success" is very much
unwelcome.  For whoever gets curious (me included when accepting
posted patch), it is easy to revert only the part of the commit
outside t/ tentatively to see how the original code breaks.  Keeping
the fix and protection of the fix together will avoid mistakes.  In
this case, the whole test fits inside the post context of the patch,
but in general, this "flip failure to success" will hide the body of
the test that changes behaviour while reviewing the patch text,
which is another downside.

Thanks.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()
  2022-10-10 17:54     ` Junio C Hamano
@ 2022-10-10 18:10       ` Taylor Blau
  2022-10-10 18:11       ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 18:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, orygaw, rsbecker

On Mon, Oct 10, 2022 at 10:54:23AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> > @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
> >  		free(p);
> >  	}
> >
> > -	if (!opt->extended)
> > -		return;
> >  	free_pattern_expr(opt->pattern_expression);
> >  }
>
> I do not know about this one.  We used to avoid freeing, even when
> the .pattern_expression member is set, as long as the .extended bit
> is not set.  Now we unconditionally try to free it even when the bit
> says it does not want to.  Why?

It's not "does not want to" be freed. As best I can tell, we conflate
`opt->extended` with "there is something in `opt->pattern_expression`".
So checking whether or not `opt->extended` is non-zero isn't "keep this
around because I'm going to use it later", but instead "there is
nothing to free, don't bother calling `free_pattern_expr()`".

A more direct way of saying the latter would have been to replace the
if-statement with `if (opt->pattern_expression)`.

I hinted at this in the commit message, but I will make it more direct
to avoid future readers' confusion.

> > diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> > index e3ec5f5661..44f7ef0ea2 100755
> > --- a/t/t4202-log.sh
> > +++ b/t/t4202-log.sh
> > @@ -297,7 +297,7 @@ test_expect_success 'log --invert-grep --grep -i' '
> >  	fi
> >  '
> >
> > -test_expect_failure 'log --invert-grep (no --grep)' '
> > +test_expect_success 'log --invert-grep (no --grep)' '
> >  	git log --pretty="tformat:%s" >expect &&
> >  	git log --invert-grep --pretty="tformat:%s" >actual &&
> >  	test_cmp expect actual
>
> Especially for something this small, doing the "failing test first
> and then fix with flipping the test to success" is very much
> unwelcome.  For whoever gets curious (me included when accepting
> posted patch), it is easy to revert only the part of the commit
> outside t/ tentatively to see how the original code breaks.  Keeping
> the fix and protection of the fix together will avoid mistakes.  In
> this case, the whole test fits inside the post context of the patch,
> but in general, this "flip failure to success" will hide the body of
> the test that changes behaviour while reviewing the patch text,
> which is another downside.

Good to know. I had considered it good practice, even for a small fix,
as a way to show your work and prove that you had a legitimately broken
test case demonstrating the bug. But if it creates an extra hassle, I
don't mind squashing it down.

I can send a squashed version of these two patches, but let's see if
there are any other comments, first.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()
  2022-10-10 17:54     ` Junio C Hamano
  2022-10-10 18:10       ` Taylor Blau
@ 2022-10-10 18:11       ` Junio C Hamano
  2022-10-10 18:14         ` Taylor Blau
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2022-10-10 18:11 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, orygaw, rsbecker

Junio C Hamano <gitster@pobox.com> writes:

>>  static void free_pattern_expr(struct grep_expr *x)
>>  {
>> +	if (!x)
>> +		return;
>> +
>>  	switch (x->node) {
>>  	case GREP_NODE_TRUE:
>>  	case GREP_NODE_ATOM:
>
> This hunk makes sense, but
>
>> @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
>>  		free(p);
>>  	}
>>  
>> -	if (!opt->extended)
>> -		return;
>>  	free_pattern_expr(opt->pattern_expression);
>>  }
>
> I do not know about this one.  We used to avoid freeing, even when
> the .pattern_expression member is set, as long as the .extended bit
> is not set.  Now we unconditionally try to free it even when the bit
> says it does not want to.  Why?

Ah, grep.c::compile_grep_patterns() has the answer.  We only
populate the .pattern_expression member when we are doing a complex
query and leave it empty otherwise.  The .pattern_list member is
used instead as a list of OR'ed patterns in grep.c::match_line()
when .extended is not set.

The !opt->extended guard assumes that opt->pattern_expression exists
only when extended is set, which is correct, but forgets that even
when extended is set, pattern_expression is not necessarily non-NULL.

So I think the right thing to do may be to allow free_pattern_expr()
to take and ignore NULL silently?  Ah, that is already what you are
doing in the first hunk.  Is this second hunk even necessary?

I wonder how calls to grep.c::match_line() with opt->extended true
and opt->pattern_expression NULL, though.  It should die() at the
beginning of match_expr_eval(), which probably is OK, but somehow
feels unsatisfactory.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()
  2022-10-10 18:11       ` Junio C Hamano
@ 2022-10-10 18:14         ` Taylor Blau
  0 siblings, 0 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 18:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, orygaw, rsbecker

On Mon, Oct 10, 2022 at 11:11:05AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> So I think the right thing to do may be to allow free_pattern_expr()
> to take and ignore NULL silently?  Ah, that is already what you are
> doing in the first hunk.  Is this second hunk even necessary?

Right. The fix in my patch is to have `free_pattern_expr()` treat its
arguments like `free()` does, where NULL is a silent noop.

We could do with or without the second hunk. I dropped it to avoid
confusion, but it seems to have had the opposite effect ;-). I'll keep
the if-statement there and instead drop the latter hunk.

> I wonder how calls to grep.c::match_line() with opt->extended true
> and opt->pattern_expression NULL, though.  It should die() at the
> beginning of match_expr_eval(), which probably is OK, but somehow
> feels unsatisfactory.

It does, and we can thank Linus for that: c922b01f54 (grep: fix segfault
when "git grep '('" is given, 2009-04-27).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault
  2022-10-10 17:45     ` Junio C Hamano
@ 2022-10-10 18:48       ` Ævar Arnfjörð Bjarmason
  2022-10-10 19:00         ` Taylor Blau
  0 siblings, 1 reply; 18+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-10 18:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, orygaw, rsbecker, Taylor Blau


On Mon, Oct 10 2022, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>>
>>> Neither the "--invert-grep" option added in [1] nor the earlier
>>> "--all-match" option added in [2] were intended to be used
>>> stand-alone.
>>
>> Mostly yes, but with "to be used" -> "to take effect".
>>
>> "[alias] lga = log --all-match" would be equivalent to "log"
>> unless the command line starts talking about "--grep", which is
>> quite handy.
>>
>> So the real fix would be not to forbid the standalone passing of the
>> option to the command, but to fix the over-eager freeing of an
>> unallocated resource, introduced more recently, I would suspect.
>
> On the other hand, I do not think "--invert-grep" is useful in the
> same way.  The only usage I can think of is to omit merges by
> looking for substring "^Merge", but then we already have a more
> robust "--no-merges" option for that purpose.
>
> But both uses the same mechanism, I would say treating them the
> same, i.e. silently ignoring these when --grep is not given, would
> be the most sensible.

The rationale for changing it this way is that the documentaion says:
	
	--all-match::
	        Limit the commits output to ones that match all given `--grep`,
	        instead of ones that match at least one.
	--invert-grep::
	        Limit the commits output to ones with log message that do not
	        match the pattern specified with `--grep=<pattern>`.

Especially as we also say:

	--grep-reflog=<pattern>::
	        Limit the commits output to ones with reflog entries that
        	match the specified pattern (regular expression) [...]

I.e. pretty much the same wording, but there we'll die if the
corresponding other option is omitted, as the context shows.

Maybe we want to post-hoc say we'd like to support it, in which case
some plasting over the issue (e.g. in the direction of Taylor's fix)
would do it.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault
  2022-10-10 18:48       ` Ævar Arnfjörð Bjarmason
@ 2022-10-10 19:00         ` Taylor Blau
  0 siblings, 0 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-10 19:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, git, orygaw, rsbecker, Taylor Blau

On Mon, Oct 10, 2022 at 08:48:25PM +0200, Ævar Arnfjörð Bjarmason wrote:
> The rationale for changing it this way is that the documentaion says:
>
> 	--all-match::
> 	        Limit the commits output to ones that match all given `--grep`,
> 	        instead of ones that match at least one.
> 	--invert-grep::
> 	        Limit the commits output to ones with log message that do not
> 	        match the pattern specified with `--grep=<pattern>`.

This does feel a little academic, but to me the documentation seems to
suggest that `--all-match` or `--invert-grep` should both support the
absence of a `--grep` argument.

At least in `--invert-grep`, my reading is "the pattern specified with
`--grep=<pattern>` [...if any]".

In any case, the behavior has been as such for long enough that it feels
like our documentation needs to be changed, and not the behavior itself.
On the other hand, I have a hard time imagining that many/any people
care about this particular behavior.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", fix segfault
  2022-10-10 16:57 ` [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault Ævar Arnfjörð Bjarmason
  2022-10-10 17:34   ` Junio C Hamano
@ 2022-10-11  9:48   ` Ævar Arnfjörð Bjarmason
  2022-10-11 15:46     ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-11  9:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, orygaw, rsbecker, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Since 79d3696cfb4 (git-grep: boolean expression on pattern matching.,
2006-06-30) the "pattern_expression" member has been used for complex
queries (AND/OR...), with "pattern_list" being used for the simple OR
queries. Since then we've used both "pattern_expression" and its
associated boolean "extended" member to see if we have a complex
expression.

Since f41fb662f57 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13) we've had a subtle bug relating to that: If
we supplied options that were only used for "complex queries", but
didn't supply the query itself we'd set "opt->extended", but would
have a NULL "pattern_expression". As a result these would segfault as
we tried to call "free_grep_patterns()" from "release_revisions()":

	git -P log -1 --invert-grep
	git -P log -1 --all-match

The root cause of this is that we were conflating the state management
we needed in "compile_grep_patterns()" itself with whether or not we
had an "opt->pattern_expression" later on.

In this cases as we're going through "compile_grep_patterns()" we have
no "opt->pattern_list" but have "opt->no_body_match" or
"opt->all_match". So we'd set "opt->extended = 1", but not "return" on
"opt->extended" as that's an "else if" in the same "if" statement.

That behavior is intentional and required, as the common case is that
we have an "opt->pattern_list" that we're about to parse into the
"opt->pattern_expression".

But we don't need to keep track of this "extended" flag beyond the
state management in compile_grep_patterns() itself. It needs it, but
once we're out of that function we can rely on
"opt->pattern_expression" being non-NULL instead for using these
extended patterns.

As 79d3696cfb4 itself shows we've assumed that there's a one-to-one
mapping between the two since the very beginning. I.e. "match_line()"
would check "opt->extended" to see if it should call "match_expr()",
and the first thing we do in that function is assume that we have a
"opt->pattern_expression". We'd then call "match_expr_eval()", which
would have died if that "opt->pattern_expression" was NULL.

The "die" was added in c922b01f54c (grep: fix segfault when "git grep
'('" is given, 2009-04-27), and can now be removed as it's now clearly
unreachable. We still do the right thing in the case that prompted
that fix:

	git grep '('
	fatal: unmatched parenthesis

Arguably neither the "--invert-grep" option added in [1] nor the
earlier "--all-match" option added in [2] were intended to be used
stand-alone, and another approach[3] would be to error out in those
cases. But since we've been treating them as a NOOP when given without
--grep for a long time let's keep doing that.

We could also return in "free_pattern_expr()" if the argument is
non-NULL, as an alternative fix for this segfault does [4]. That would
be more elegant in making the "free_*()" function behave like
"free()", but it would also remove a sanity check: The
"free_pattern_expr()" function calls itself recursively, and only the
top-level is allowed to be NULL, let's not conflate those two
conditions.

1. 22dfa8a23de (log: teach --invert-grep option, 2015-01-12)
2. 0ab7befa31d (grep --all-match, 2006-09-27)
3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/
4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com

Reported-by: orygaw <orygaw@protonmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

Per the v1 feedback this v2 doesn't change any user-observable
behavior, except to address the segfault(s).

Per https://lore.kernel.org/git/Y0Rg9My2EaWl%2FWCU@nand.local/ it
sounds like Taylor's preparing his own v2 with a more narrow fix
focused on https://lore.kernel.org/git/Y0Rg9My2EaWl%2FWCU@nand.local/;
which we may or may not want to have instead of this as a quicker
band-aid.

But this attempts to address the root cause of the problem, which is
that grep.[ch] is effectively juggling two struct members that mean
the same thing, but whose state drifted apart.

Passing CI at: https://github.com/avar/git/actions/runs/3225227595
(actually https://github.com/avar/git/actions/runs/3225813488, but
when I was about to submit this I made a trivial whitespace fix in the
t/*.sh change).

Range-diff against v1:
1:  f4b90799fce ! 1:  6ad7627706f log: require --grep for --invert-grep and --all-match, fix segfault
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    log: require --grep for --invert-grep and --all-match, fix segfault
    +    grep.c: remove "extended" in favor of "pattern_expression", fix segfault
     
    -    Neither the "--invert-grep" option added in [1] nor the earlier
    -    "--all-match" option added in [2] were intended to be used
    -    stand-alone.
    +    Since 79d3696cfb4 (git-grep: boolean expression on pattern matching.,
    +    2006-06-30) the "pattern_expression" member has been used for complex
    +    queries (AND/OR...), with "pattern_list" being used for the simple OR
    +    queries. Since then we've used both "pattern_expression" and its
    +    associated boolean "extended" member to see if we have a complex
    +    expression.
     
    -    But due to how the built-in and the revision API interacted those
    -    options without the corresponding --grep would be ignored.
    +    Since f41fb662f57 (revisions API: have release_revisions() release
    +    "grep_filter", 2022-04-13) we've had a subtle bug relating to that: If
    +    we supplied options that were only used for "complex queries", but
    +    didn't supply the query itself we'd set "opt->extended", but would
    +    have a NULL "pattern_expression". As a result these would segfault as
    +    we tried to call "free_grep_patterns()" from "release_revisions()":
     
    -    Then in f41fb662f57 (revisions API: have release_revisions() release
    -    "grep_filter", 2022-04-13) this turned into a segfault, as we'd
    -    attempt to free() the non-existing --grep patterns.
    +            git -P log -1 --invert-grep
    +            git -P log -1 --all-match
     
    -    Arguably it makes more sense to add this check to
    -    compile_grep_patterns(), since it's possible to use the C API in the
    -    same way and trigger this segfault. But in practice the revision.c API
    -    is the only user of "no_body_match", and by placing the check here we
    -    can more sensibly emit a message that assumes that the user used
    -    "--invert-grep" without "--grep".
    +    The root cause of this is that we were conflating the state management
    +    we needed in "compile_grep_patterns()" itself with whether or not we
    +    had an "opt->pattern_expression" later on.
    +
    +    In this cases as we're going through "compile_grep_patterns()" we have
    +    no "opt->pattern_list" but have "opt->no_body_match" or
    +    "opt->all_match". So we'd set "opt->extended = 1", but not "return" on
    +    "opt->extended" as that's an "else if" in the same "if" statement.
    +
    +    That behavior is intentional and required, as the common case is that
    +    we have an "opt->pattern_list" that we're about to parse into the
    +    "opt->pattern_expression".
    +
    +    But we don't need to keep track of this "extended" flag beyond the
    +    state management in compile_grep_patterns() itself. It needs it, but
    +    once we're out of that function we can rely on
    +    "opt->pattern_expression" being non-NULL instead for using these
    +    extended patterns.
    +
    +    As 79d3696cfb4 itself shows we've assumed that there's a one-to-one
    +    mapping between the two since the very beginning. I.e. "match_line()"
    +    would check "opt->extended" to see if it should call "match_expr()",
    +    and the first thing we do in that function is assume that we have a
    +    "opt->pattern_expression". We'd then call "match_expr_eval()", which
    +    would have died if that "opt->pattern_expression" was NULL.
    +
    +    The "die" was added in c922b01f54c (grep: fix segfault when "git grep
    +    '('" is given, 2009-04-27), and can now be removed as it's now clearly
    +    unreachable. We still do the right thing in the case that prompted
    +    that fix:
    +
    +            git grep '('
    +            fatal: unmatched parenthesis
    +
    +    Arguably neither the "--invert-grep" option added in [1] nor the
    +    earlier "--all-match" option added in [2] were intended to be used
    +    stand-alone, and another approach[3] would be to error out in those
    +    cases. But since we've been treating them as a NOOP when given without
    +    --grep for a long time let's keep doing that.
    +
    +    We could also return in "free_pattern_expr()" if the argument is
    +    non-NULL, as an alternative fix for this segfault does [4]. That would
    +    be more elegant in making the "free_*()" function behave like
    +    "free()", but it would also remove a sanity check: The
    +    "free_pattern_expr()" function calls itself recursively, and only the
    +    top-level is allowed to be NULL, let's not conflate those two
    +    conditions.
     
         1. 22dfa8a23de (log: teach --invert-grep option, 2015-01-12)
         2. 0ab7befa31d (grep --all-match, 2006-09-27)
    +    3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/
    +    4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com
     
         Reported-by: orygaw <orygaw@protonmail.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## revision.c ##
    -@@ revision.c: int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
    - 		die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph");
    - 	if (!revs->reflog_info && revs->grep_filter.use_reflog_filter)
    - 		die(_("the option '%s' requires '%s'"), "--grep-reflog", "--walk-reflogs");
    -+	if (!revs->grep_filter.pattern_expression) {
    -+		if (revs->grep_filter.no_body_match)
    -+			die(_("the option '%s' requires '%s'"), "--invert-grep", "--grep");
    -+		if (revs->grep_filter.all_match)
    -+			die(_("the option '%s' requires '%s'"), "--all-match", "--grep");
    -+	}
    + ## grep.c ##
    +@@ grep.c: void compile_grep_patterns(struct grep_opt *opt)
    + {
    + 	struct grep_pat *p;
    + 	struct grep_expr *header_expr = prep_header_patterns(opt);
    ++	int extended = 0;
    + 
    + 	for (p = opt->pattern_list; p; p = p->next) {
    + 		switch (p->token) {
    +@@ grep.c: void compile_grep_patterns(struct grep_opt *opt)
    + 			compile_regexp(p, opt);
    + 			break;
    + 		default:
    +-			opt->extended = 1;
    ++			extended = 1;
    + 			break;
    + 		}
    + 	}
    + 
    + 	if (opt->all_match || opt->no_body_match || header_expr)
    +-		opt->extended = 1;
    +-	else if (!opt->extended)
    ++		extended = 1;
    ++	else if (!extended)
    + 		return;
    + 
    + 	p = opt->pattern_list;
    +@@ grep.c: void free_grep_patterns(struct grep_opt *opt)
    + 		free(p);
    + 	}
      
    - 	if (revs->line_level_traverse &&
    - 	    (revs->diffopt.output_format & ~(DIFF_FORMAT_PATCH | DIFF_FORMAT_NO_OUTPUT)))
    +-	if (!opt->extended)
    ++	if (!opt->pattern_expression)
    + 		return;
    + 	free_pattern_expr(opt->pattern_expression);
    + }
    +@@ grep.c: static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x,
    + {
    + 	int h = 0;
    + 
    +-	if (!x)
    +-		die("Not a valid grep expression");
    + 	switch (x->node) {
    + 	case GREP_NODE_TRUE:
    + 		h = 1;
    +@@ grep.c: static int match_line(struct grep_opt *opt,
    + 	struct grep_pat *p;
    + 	int hit = 0;
    + 
    +-	if (opt->extended)
    ++	if (opt->pattern_expression)
    + 		return match_expr(opt, bol, eol, ctx, col, icol,
    + 				  collect_hits);
    + 
    +@@ grep.c: static int should_lookahead(struct grep_opt *opt)
    + {
    + 	struct grep_pat *p;
    + 
    +-	if (opt->extended)
    ++	if (opt->pattern_expression)
    + 		return 0; /* punt for too complex stuff */
    + 	if (opt->invert)
    + 		return 0;
    +
    + ## grep.h ##
    +@@ grep.h: struct grep_opt {
    + #define GREP_BINARY_TEXT	2
    + 	int binary;
    + 	int allow_textconv;
    +-	int extended;
    + 	int use_reflog_filter;
    + 	int relative;
    + 	int pathname;
     
      ## t/t4202-log.sh ##
     @@ t/t4202-log.sh: test_expect_success 'log --grep' '
      	test_cmp expect actual
      '
      
    -+test_expect_success 'log --invert-grep usage' '
    -+	test_expect_code 128 git log --invert-grep
    -+'
    ++for noop_opt in --invert-grep --all-match
    ++do
    ++	test_expect_success "log $noop_opt without --grep is a NOOP" '
    ++		git log >expect &&
    ++		git log $noop_opt >actual &&
    ++		test_cmp expect actual
    ++	'
    ++done
     +
      cat > expect << EOF
      second
      initial
    -
    - ## t/t7810-grep.sh ##
    -@@ t/t7810-grep.sh: test_expect_success 'log with multiple --grep uses union' '
    - 	test_cmp expect actual
    - '
    - 
    -+test_expect_success 'log --all-match usage' '
    -+	test_expect_code 128 git log --all-match
    -+'
    -+
    - test_expect_success 'log --all-match with multiple --grep uses intersection' '
    - 	git log --all-match --grep=i --grep=r --format=%s >actual &&
    - 	{

 grep.c         | 15 +++++++--------
 grep.h         |  1 -
 t/t4202-log.sh |  9 +++++++++
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/grep.c b/grep.c
index 52a894c9890..06eed694936 100644
--- a/grep.c
+++ b/grep.c
@@ -708,6 +708,7 @@ void compile_grep_patterns(struct grep_opt *opt)
 {
 	struct grep_pat *p;
 	struct grep_expr *header_expr = prep_header_patterns(opt);
+	int extended = 0;
 
 	for (p = opt->pattern_list; p; p = p->next) {
 		switch (p->token) {
@@ -717,14 +718,14 @@ void compile_grep_patterns(struct grep_opt *opt)
 			compile_regexp(p, opt);
 			break;
 		default:
-			opt->extended = 1;
+			extended = 1;
 			break;
 		}
 	}
 
 	if (opt->all_match || opt->no_body_match || header_expr)
-		opt->extended = 1;
-	else if (!opt->extended)
+		extended = 1;
+	else if (!extended)
 		return;
 
 	p = opt->pattern_list;
@@ -790,7 +791,7 @@ void free_grep_patterns(struct grep_opt *opt)
 		free(p);
 	}
 
-	if (!opt->extended)
+	if (!opt->pattern_expression)
 		return;
 	free_pattern_expr(opt->pattern_expression);
 }
@@ -971,8 +972,6 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x,
 {
 	int h = 0;
 
-	if (!x)
-		die("Not a valid grep expression");
 	switch (x->node) {
 	case GREP_NODE_TRUE:
 		h = 1;
@@ -1052,7 +1051,7 @@ static int match_line(struct grep_opt *opt,
 	struct grep_pat *p;
 	int hit = 0;
 
-	if (opt->extended)
+	if (opt->pattern_expression)
 		return match_expr(opt, bol, eol, ctx, col, icol,
 				  collect_hits);
 
@@ -1370,7 +1369,7 @@ static int should_lookahead(struct grep_opt *opt)
 {
 	struct grep_pat *p;
 
-	if (opt->extended)
+	if (opt->pattern_expression)
 		return 0; /* punt for too complex stuff */
 	if (opt->invert)
 		return 0;
diff --git a/grep.h b/grep.h
index bdcadce61b8..6075f997e68 100644
--- a/grep.h
+++ b/grep.h
@@ -151,7 +151,6 @@ struct grep_opt {
 #define GREP_BINARY_TEXT	2
 	int binary;
 	int allow_textconv;
-	int extended;
 	int use_reflog_filter;
 	int relative;
 	int pathname;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index cc15cb4ff62..2ce2b41174d 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -249,6 +249,15 @@ test_expect_success 'log --grep' '
 	test_cmp expect actual
 '
 
+for noop_opt in --invert-grep --all-match
+do
+	test_expect_success "log $noop_opt without --grep is a NOOP" '
+		git log >expect &&
+		git log $noop_opt >actual &&
+		test_cmp expect actual
+	'
+done
+
 cat > expect << EOF
 second
 initial
-- 
2.38.0.971.ge79ff6d20e7


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", fix segfault
  2022-10-11  9:48   ` [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", " Ævar Arnfjörð Bjarmason
@ 2022-10-11 15:46     ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2022-10-11 15:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, orygaw, rsbecker, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>  grep.c         | 15 +++++++--------
>  grep.h         |  1 -
>  t/t4202-log.sh |  9 +++++++++
>  3 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 52a894c9890..06eed694936 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -708,6 +708,7 @@ void compile_grep_patterns(struct grep_opt *opt)
>  {
>  	struct grep_pat *p;
>  	struct grep_expr *header_expr = prep_header_patterns(opt);
> +	int extended = 0;
>  
>  	for (p = opt->pattern_list; p; p = p->next) {
>  		switch (p->token) {
> @@ -717,14 +718,14 @@ void compile_grep_patterns(struct grep_opt *opt)
>  			compile_regexp(p, opt);
>  			break;
>  		default:
> -			opt->extended = 1;
> +			extended = 1;
>  			break;
>  		}
>  	}
>  
>  	if (opt->all_match || opt->no_body_match || header_expr)
> -		opt->extended = 1;
> -	else if (!opt->extended)
> +		extended = 1;
> +	else if (!extended)
>  		return;

Nice. I like this change to make "!!opt->pattern_expression" the
authoritative source of truth for opt->extended by getting rid of
the latter.  This function did need to have a handy way to tell "do
we need to populate pattern_expression?" while going over the list
and also use of some features forced us to do so no matter how
simple the patterns on the list are, but after doing so and
populating the pattern_expression, there was no reason to keep it
around by having it as a member in the opt structure.

> @@ -790,7 +791,7 @@ void free_grep_patterns(struct grep_opt *opt)
>  		free(p);
>  	}
>  
> -	if (!opt->extended)
> +	if (!opt->pattern_expression)
>  		return;
>  	free_pattern_expr(opt->pattern_expression);
>  }
> @@ -971,8 +972,6 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x,
>  {
>  	int h = 0;
>  
> -	if (!x)
> -		die("Not a valid grep expression");
>  	switch (x->node) {
>  	case GREP_NODE_TRUE:
>  		h = 1;
> @@ -1052,7 +1051,7 @@ static int match_line(struct grep_opt *opt,
>  	struct grep_pat *p;
>  	int hit = 0;
>  
> -	if (opt->extended)
> +	if (opt->pattern_expression)
>  		return match_expr(opt, bol, eol, ctx, col, icol,
>  				  collect_hits);
>  
> @@ -1370,7 +1369,7 @@ static int should_lookahead(struct grep_opt *opt)
>  {
>  	struct grep_pat *p;
>  
> -	if (opt->extended)
> +	if (opt->pattern_expression)
>  		return 0; /* punt for too complex stuff */
>  	if (opt->invert)
>  		return 0;

And the necessary change for users is surprisingly small.

> diff --git a/grep.h b/grep.h
> index bdcadce61b8..6075f997e68 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -151,7 +151,6 @@ struct grep_opt {
>  #define GREP_BINARY_TEXT	2
>  	int binary;
>  	int allow_textconv;
> -	int extended;
>  	int use_reflog_filter;
>  	int relative;
>  	int pathname;
> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index cc15cb4ff62..2ce2b41174d 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -249,6 +249,15 @@ test_expect_success 'log --grep' '
>  	test_cmp expect actual
>  '
>  
> +for noop_opt in --invert-grep --all-match
> +do
> +	test_expect_success "log $noop_opt without --grep is a NOOP" '
> +		git log >expect &&
> +		git log $noop_opt >actual &&
> +		test_cmp expect actual
> +	'
> +done

OK.

Thanks, will queue.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-10-11 15:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-10 14:33 Git BUG 2.37.3 and 2.38.0 orygaw
2022-10-10 15:40 ` rsbecker
2022-10-10 15:48   ` Taylor Blau
2022-10-10 17:02     ` Junio C Hamano
2022-10-10 16:57 ` [PATCH] log: require --grep for --invert-grep and --all-match, fix segfault Ævar Arnfjörð Bjarmason
2022-10-10 17:34   ` Junio C Hamano
2022-10-10 17:45     ` Junio C Hamano
2022-10-10 18:48       ` Ævar Arnfjörð Bjarmason
2022-10-10 19:00         ` Taylor Blau
2022-10-11  9:48   ` [PATCH v2] grep.c: remove "extended" in favor of "pattern_expression", " Ævar Arnfjörð Bjarmason
2022-10-11 15:46     ` Junio C Hamano
2022-10-10 17:41 ` [PATCH 0/2] grep: tolerate NULL argument to free_grep_expr() Taylor Blau
2022-10-10 17:41   ` [PATCH 1/2] t4202: demonstrate `git log --invert-grep` segfault Taylor Blau
2022-10-10 17:41   ` [PATCH 2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr() Taylor Blau
2022-10-10 17:54     ` Junio C Hamano
2022-10-10 18:10       ` Taylor Blau
2022-10-10 18:11       ` Junio C Hamano
2022-10-10 18:14         ` Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).