git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* segfault on invalid 'git grep' invocation
@ 2021-06-28 12:17 Matthew Hughes
  2021-06-28 18:53 ` Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Matthew Hughes @ 2021-06-28 12:17 UTC (permalink / raw)
  To: git

Hi all,

It's my first time writing to this mailing list, so apologies in advance
for any mistakes/faux pas.

Summary
=======

Segfault seen when running:

    $ ./git grep --and -e 'pattern' -- not_a_path

Expected Behaviour
==================

The above invocation doesn't appear particularly valid to me, I
accidentally ran it while editing a command. Not sure if this needs
addressing, but perhaps the expected behaviour would be an error message
about it not being valid?

Compare with, running without '--':

    $ ./git grep --and -e 'pattern' not_a_path
    fatal: ambiguous argument 'not_a_path': unknown revision or path not in the working tree.
    Use '--' to separate paths from revisions, like this:
    'git <command> [<revision>...] -- [<file>...]'

Running with a valid path:

    $ ./git grep --and -e 'pattern' -- common-main.c
    fatal: Not a valid grep expression

    $ ./git grep --and -e 'pattern' common-main.c
    fatal: Not a valid grep expression


Background
==========

git version:
git version 2.32.0.93.g670b81a890
cpu: x86_64
built from commit: 670b81a890388c60b7032a4f5b879f2ece8c4558
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64
compiler info: gnuc: 8.3
libc info: glibc: 2.28
$SHELL (typically, interactive shell): /bin/bash

Also reproduced on git 2.20.1 (packaged version on Debian 10)

The segfault is raised at free_pattern_expr (x=0x0) at grep.c:825 (on
commit 670b81a890388c60b7032a4f5b879f2ece8c4558)

Regards,
Matt

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: segfault on invalid 'git grep' invocation
  2021-06-28 12:17 segfault on invalid 'git grep' invocation Matthew Hughes
@ 2021-06-28 18:53 ` Ævar Arnfjörð Bjarmason
  2021-06-28 18:58 ` [PATCH] grep: report missing left operand of --and René Scharfe
  2021-06-30 16:12 ` [PATCH v2] " René Scharfe
  2 siblings, 0 replies; 6+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-28 18:53 UTC (permalink / raw)
  To: Matthew Hughes; +Cc: git


On Mon, Jun 28 2021, Matthew Hughes wrote:

> Hi all,
>
> It's my first time writing to this mailing list, so apologies in advance
> for any mistakes/faux pas.

Thanks a lot for the report, and welcome.

> Summary
> =======
>
> Segfault seen when running:
>
>     $ ./git grep --and -e 'pattern' -- not_a_path
>
> Expected Behaviour
> ==================
>
> The above invocation doesn't appear particularly valid to me, I
> accidentally ran it while editing a command. Not sure if this needs
> addressing, but perhaps the expected behaviour would be an error message
> about it not being valid?
>
> Compare with, running without '--':
>
>     $ ./git grep --and -e 'pattern' not_a_path
>     fatal: ambiguous argument 'not_a_path': unknown revision or path not in the working tree.
>     Use '--' to separate paths from revisions, like this:
>     'git <command> [<revision>...] -- [<file>...]'
>
> Running with a valid path:
>
>     $ ./git grep --and -e 'pattern' -- common-main.c
>     fatal: Not a valid grep expression
>
>     $ ./git grep --and -e 'pattern' common-main.c
>     fatal: Not a valid grep expression

The exact segfault you've found is found at least back to Git version
2.0.0, so it's been with us for a while.

The problem is that you're trying to --and two things, but did not
provide two things. We apparently never checked for this, and then
segfault in the cleanup, thinking an --and must imply that we have two
sides.

I haven't looked deeply, but think that the reason it segfaults just
with not_a_path is that we then proceed to die() in something that
should probably be a BUG() instead as we can't grep in the
expected-but-not-there pattern at runtime.

Giving it a pathspec that doesn't exist means we never run the "real"
grep, but we do run the cleanup, hence the segfault.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] grep: report missing left operand of --and
  2021-06-28 12:17 segfault on invalid 'git grep' invocation Matthew Hughes
  2021-06-28 18:53 ` Ævar Arnfjörð Bjarmason
@ 2021-06-28 18:58 ` René Scharfe
  2021-06-29 17:52   ` Ævar Arnfjörð Bjarmason
  2021-06-30 16:12 ` [PATCH v2] " René Scharfe
  2 siblings, 1 reply; 6+ messages in thread
From: René Scharfe @ 2021-06-28 18:58 UTC (permalink / raw)
  To: Matthew Hughes, git; +Cc: Junio C Hamano

Git grep allows combining two patterns with --and.  It checks and
reports if the second pattern is missing when compiling the expression.
A missing first pattern, however, is only reported later at match time.
Thus no error is returned if no matching is done, e.g. because no file
matches the also given pathspec.

When that happens we get an expression tree with an GREP_NODE_AND node
and a NULL pointer to the missing left child.  free_pattern_expr()
tries to dereference it during the cleanup at the end, which result in
a segmentation fault.

Fix this by verifying the presence of the left operand at expression
compilation time.

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Whether the check in match_expr_eval() can now be turned into a BUG is
left as an exercise for the reader. ;-)

 grep.c          |  2 ++
 t/t7810-grep.sh | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/grep.c b/grep.c
index 8f91af1cb0..7d0ea4e956 100644
--- a/grep.c
+++ b/grep.c
@@ -655,6 +655,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
 	struct grep_expr *x, *y, *z;

 	x = compile_pattern_not(list);
+	if (!x)
+		die("Not a valid grep expression");
 	p = *list;
 	if (p && p->token == GREP_AND) {
 		if (!p->next)
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 5830733f3d..c581239674 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

 . ./test-lib.sh

+test_invalid_grep_expression() {
+	params="$@" &&
+	test_expect_success "invalid expression: grep $params" '
+		test_must_fail git grep $params -- nonexisting
+	'
+}
+
 cat >hello.c <<EOF
 #include <assert.h>
 #include <stdio.h>
@@ -89,6 +96,9 @@ test_expect_success 'grep should not segfault with a bad input' '
 	test_must_fail git grep "("
 '

+test_invalid_grep_expression -e A --and
+test_invalid_grep_expression --and -e A
+
 for H in HEAD ''
 do
 	case "$H" in
--
2.32.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] grep: report missing left operand of --and
  2021-06-28 18:58 ` [PATCH] grep: report missing left operand of --and René Scharfe
@ 2021-06-29 17:52   ` Ævar Arnfjörð Bjarmason
  2021-06-29 18:35     ` René Scharfe
  0 siblings, 1 reply; 6+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-29 17:52 UTC (permalink / raw)
  To: René Scharfe; +Cc: Matthew Hughes, git, Junio C Hamano


On Mon, Jun 28 2021, René Scharfe wrote:

> Git grep allows combining two patterns with --and.  It checks and
> reports if the second pattern is missing when compiling the expression.
> A missing first pattern, however, is only reported later at match time.
> Thus no error is returned if no matching is done, e.g. because no file
> matches the also given pathspec.
>
> When that happens we get an expression tree with an GREP_NODE_AND node
> and a NULL pointer to the missing left child.  free_pattern_expr()
> tries to dereference it during the cleanup at the end, which result in
> a segmentation fault.
>
> Fix this by verifying the presence of the left operand at expression
> compilation time.
>
> Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Whether the check in match_expr_eval() can now be turned into a BUG is
> left as an exercise for the reader. ;-)
>
>  grep.c          |  2 ++
>  t/t7810-grep.sh | 10 ++++++++++
>  2 files changed, 12 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index 8f91af1cb0..7d0ea4e956 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -655,6 +655,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
>  	struct grep_expr *x, *y, *z;
>
>  	x = compile_pattern_not(list);
> +	if (!x)
> +		die("Not a valid grep expression");
>  	p = *list;
>  	if (p && p->token == GREP_AND) {
>  		if (!p->next)
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 5830733f3d..c581239674 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
>  . ./test-lib.sh
>
> +test_invalid_grep_expression() {
> +	params="$@" &&
> +	test_expect_success "invalid expression: grep $params" '
> +		test_must_fail git grep $params -- nonexisting
> +	'
> +}
> +
>  cat >hello.c <<EOF
>  #include <assert.h>
>  #include <stdio.h>
> @@ -89,6 +96,9 @@ test_expect_success 'grep should not segfault with a bad input' '
>  	test_must_fail git grep "("
>  '
>
> +test_invalid_grep_expression -e A --and
> +test_invalid_grep_expression --and -e A
> +
>  for H in HEAD ''
>  do
>  	case "$H" in

This seems like an incomplete fix, for the exact same thing with --or we
silently return 1, as we would if we exited early in free_pattern_expr
on !x, which aside from the segfault I think we should probably make a
habit in our own free()-like functions.

Whatever we're doing about the --and segfault it seems like we should do
the same under --or, no?

Your first test also passes before your fix, it's only the latter that
segfaults. The first one emits:

    fatal: --and not followed by pattern expression

So having that in a leading patch to indicate no behavior was changed
would be better.

Instead of the "Not a valid grep expression" error let's instead say
something like:

    fatal: --[and|or] must follow a pattern expression

The error (which I know you just copied from elsewhere) is misleading,
it's not the pattern that's not valid (as to me it implies), but our own
--and/--or option usage.

And the "excercise for the reader" is a bit flippant, do we actually hit
that condition now? If not and we're sure we won't now seems like the
time to add a BUG() there, and to change the "Not a valid grep
expression" to "internal error in --and/--or parsing" or something.

Thanks for the patch!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] grep: report missing left operand of --and
  2021-06-29 17:52   ` Ævar Arnfjörð Bjarmason
@ 2021-06-29 18:35     ` René Scharfe
  0 siblings, 0 replies; 6+ messages in thread
From: René Scharfe @ 2021-06-29 18:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Matthew Hughes, git, Junio C Hamano

Am 29.06.21 um 19:52 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Jun 28 2021, René Scharfe wrote:
>
>> Git grep allows combining two patterns with --and.  It checks and
>> reports if the second pattern is missing when compiling the expression.
>> A missing first pattern, however, is only reported later at match time.
>> Thus no error is returned if no matching is done, e.g. because no file
>> matches the also given pathspec.
>>
>> When that happens we get an expression tree with an GREP_NODE_AND node
>> and a NULL pointer to the missing left child.  free_pattern_expr()
>> tries to dereference it during the cleanup at the end, which result in
>> a segmentation fault.
>>
>> Fix this by verifying the presence of the left operand at expression
>> compilation time.
>>
>> Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> Whether the check in match_expr_eval() can now be turned into a BUG is
>> left as an exercise for the reader. ;-)
>>
>>  grep.c          |  2 ++
>>  t/t7810-grep.sh | 10 ++++++++++
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/grep.c b/grep.c
>> index 8f91af1cb0..7d0ea4e956 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -655,6 +655,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
>>  	struct grep_expr *x, *y, *z;
>>
>>  	x = compile_pattern_not(list);
>> +	if (!x)
>> +		die("Not a valid grep expression");
>>  	p = *list;
>>  	if (p && p->token == GREP_AND) {
>>  		if (!p->next)
>> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
>> index 5830733f3d..c581239674 100755
>> --- a/t/t7810-grep.sh
>> +++ b/t/t7810-grep.sh
>> @@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>>
>>  . ./test-lib.sh
>>
>> +test_invalid_grep_expression() {
>> +	params="$@" &&
>> +	test_expect_success "invalid expression: grep $params" '
>> +		test_must_fail git grep $params -- nonexisting
>> +	'
>> +}
>> +
>>  cat >hello.c <<EOF
>>  #include <assert.h>
>>  #include <stdio.h>
>> @@ -89,6 +96,9 @@ test_expect_success 'grep should not segfault with a bad input' '
>>  	test_must_fail git grep "("
>>  '
>>
>> +test_invalid_grep_expression -e A --and
>> +test_invalid_grep_expression --and -e A
>> +
>>  for H in HEAD ''
>>  do
>>  	case "$H" in
>
> This seems like an incomplete fix, for the exact same thing with --or we
> silently return 1, as we would if we exited early in free_pattern_expr
> on !x, which aside from the segfault I think we should probably make a
> habit in our own free()-like functions.
>
> Whatever we're doing about the --and segfault it seems like we should do
> the same under --or, no?

No, --or is a special case and needs special handling.  Currently it's
ignored.  If we want to berate the user for using it without expressions
left and right then we need to start actively handling it.

> Your first test also passes before your fix, it's only the latter that
> segfaults. The first one emits:
>
>     fatal: --and not followed by pattern expression
>
> So having that in a leading patch to indicate no behavior was changed
> would be better.

True, the first test is just nice to have.  I can remove it to reduce
confusion.

> Instead of the "Not a valid grep expression" error let's instead say
> something like:
>
>     fatal: --[and|or] must follow a pattern expression

Good point.

> The error (which I know you just copied from elsewhere) is misleading,
> it's not the pattern that's not valid (as to me it implies), but our own
> --and/--or option usage.

That's what's meant with extended pattern, I think.

> And the "excercise for the reader" is a bit flippant, do we actually hit
> that condition now? If not and we're sure we won't now seems like the
> time to add a BUG() there, and to change the "Not a valid grep
> expression" to "internal error in --and/--or parsing" or something.

I don't know.

René

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] grep: report missing left operand of --and
  2021-06-28 12:17 segfault on invalid 'git grep' invocation Matthew Hughes
  2021-06-28 18:53 ` Ævar Arnfjörð Bjarmason
  2021-06-28 18:58 ` [PATCH] grep: report missing left operand of --and René Scharfe
@ 2021-06-30 16:12 ` René Scharfe
  2 siblings, 0 replies; 6+ messages in thread
From: René Scharfe @ 2021-06-30 16:12 UTC (permalink / raw)
  To: Matthew Hughes, git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason

Git grep allows combining two patterns with --and.  It checks and
reports if the second pattern is missing when compiling the expression.
A missing first pattern, however, is only reported later at match time.
Thus no error is returned if no matching is done, e.g. because no file
matches the also given pathspec.

When that happens we get an expression tree with an GREP_NODE_AND node
and a NULL pointer to the missing left child.  free_pattern_expr()
tries to dereference it during the cleanup at the end, which results
in a segmentation fault.

Fix this by verifying the presence of the left operand at expression
compilation time.

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes since v2:
- more specific error message,
- grammar error fix in commit message,
- no test of already working behavior.

 grep.c          | 2 ++
 t/t7810-grep.sh | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/grep.c b/grep.c
index 8f91af1cb0..424a39591b 100644
--- a/grep.c
+++ b/grep.c
@@ -657,6 +657,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
 	x = compile_pattern_not(list);
 	p = *list;
 	if (p && p->token == GREP_AND) {
+		if (!x)
+			die("--and not preceded by pattern expression");
 		if (!p->next)
 			die("--and not followed by pattern expression");
 		*list = p->next;
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 5830733f3d..6b6423a07c 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

 . ./test-lib.sh

+test_invalid_grep_expression() {
+	params="$@" &&
+	test_expect_success "invalid expression: grep $params" '
+		test_must_fail git grep $params -- nonexisting
+	'
+}
+
 cat >hello.c <<EOF
 #include <assert.h>
 #include <stdio.h>
@@ -89,6 +96,8 @@ test_expect_success 'grep should not segfault with a bad input' '
 	test_must_fail git grep "("
 '

+test_invalid_grep_expression --and -e A
+
 for H in HEAD ''
 do
 	case "$H" in
--
2.32.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-30 16:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-28 12:17 segfault on invalid 'git grep' invocation Matthew Hughes
2021-06-28 18:53 ` Ævar Arnfjörð Bjarmason
2021-06-28 18:58 ` [PATCH] grep: report missing left operand of --and René Scharfe
2021-06-29 17:52   ` Ævar Arnfjörð Bjarmason
2021-06-29 18:35     ` René Scharfe
2021-06-30 16:12 ` [PATCH v2] " René Scharfe

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).