git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug Report: Duplicate condition in read_author_script (sequencer.c)
@ 2022-10-03  6:39 Michael V. Scovetta
  2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Michael V. Scovetta @ 2022-10-03  6:39 UTC (permalink / raw)
  To: git

Hello!
I noticed a duplicate condition in the read_author_script function
defined in sequencer.c. I reported this initially to the security
mailing list, but folks there thought it was unlikely to be
interesting from a security perspective, and advised me to report it
here.

In commit 2a7d63a2, sequencer.c:912 looks like:
912  if (name_i == -2)
913      error(_("missing 'GIT_AUTHOR_NAME'"));
914  if (email_i == -2)
915      error(_("missing 'GIT_AUTHOR_EMAIL'"));
916  if (date_i == -2)
917      error(_("missing 'GIT_AUTHOR_DATE'"));
918  if (date_i < 0 || email_i < 0 || date_i < 0 || err)    <-- date_i
is referenced here twice
919      goto finish;

I'm fairly sure that line 918 should be:
918  if (name_i < 0 || email_i < 0 || date_i < 0 || err)

I haven't validated this, but I suspect that if
`rebase-merge/author-script` contained two GIT_AUTHOR_NAME fields,
then name_i would be set to -1 (by the error function), but control
wouldn't flow to finish, but instead to line 920 ( *name =
kv.items[name_i].util; ) where it would attempt to access memory
slightly outside of items' memory space.

I haven't been able to actually trigger the bug, but strongly suspect
I'm just not familiar enough with how rebasing works under the covers.

I can dig into this deeper if you'd like, but it looks like an obvious
typo or copy/paste error. Is this something that a maintainer would be
able to take from here?

Thanks so much!!

Mike

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  6:39 Bug Report: Duplicate condition in read_author_script (sequencer.c) Michael V. Scovetta
@ 2022-10-03  8:45 ` Jeff King
  2022-10-03  9:29   ` Phillip Wood
                     ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Jeff King @ 2022-10-03  8:45 UTC (permalink / raw)
  To: Michael V. Scovetta; +Cc: Phillip Wood, git

On Sun, Oct 02, 2022 at 11:39:16PM -0700, Michael V. Scovetta wrote:

> In commit 2a7d63a2, sequencer.c:912 looks like:
> 912  if (name_i == -2)
> 913      error(_("missing 'GIT_AUTHOR_NAME'"));
> 914  if (email_i == -2)
> 915      error(_("missing 'GIT_AUTHOR_EMAIL'"));
> 916  if (date_i == -2)
> 917      error(_("missing 'GIT_AUTHOR_DATE'"));
> 918  if (date_i < 0 || email_i < 0 || date_i < 0 || err)    <-- date_i
> is referenced here twice
> 919      goto finish;
> 
> I'm fairly sure that line 918 should be:
> 918  if (name_i < 0 || email_i < 0 || date_i < 0 || err)

Agreed, but +cc Phillip as the original author.

> I haven't validated this, but I suspect that if
> `rebase-merge/author-script` contained two GIT_AUTHOR_NAME fields,
> then name_i would be set to -1 (by the error function), but control
> wouldn't flow to finish, but instead to line 920 ( *name =
> kv.items[name_i].util; ) where it would attempt to access memory
> slightly outside of items' memory space.

Correct. It also happens if GIT_AUTHOR_NAME is missing.

> I haven't been able to actually trigger the bug, but strongly suspect
> I'm just not familiar enough with how rebasing works under the covers.

It's a little tricky, because we avoid writing and reading the
author-script file unless necessary. An easy way to need it is to break
with a conflict (which writes it), and then resume with "git rebase
--continue" (which reads it back while committing).

Here's a patch to fix it. Thanks for your report!

-- >8 --
Subject: sequencer: detect author name errors in read_author_script()

As we parse the author-script file, we check for missing or duplicate
lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
final error conditional checks "date_i" twice and "name_i" not at all.
This not only leads to us failing to abort, but we may do an
out-of-bounds read on the string_list array.

The bug goes back to 442c36bd08 (am: improve author-script error
reporting, 2018-10-31), though the code was soon after moved to this
spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
It was presmably just a typo in 442c36bd08.

We'll add test coverage for all the error cases here, though only the
GIT_AUTHOR_NAME ones fail (even in a vanilla build they to segfault
consistently, but certainly with SANITIZE=address).

Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
The tests kind of feel like overkill, as this is such a specific
condition and I doubt we'd regress to have the same bug twice. But it
was nice at least to confirm the bug and the fix now.

 sequencer.c                    |  2 +-
 t/t3438-rebase-broken-files.sh | 53 ++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100755 t/t3438-rebase-broken-files.sh

diff --git a/sequencer.c b/sequencer.c
index d26ede83c4..83e0425b04 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -915,7 +915,7 @@ int read_author_script(const char *path, char **name, char **email, char **date,
 		error(_("missing 'GIT_AUTHOR_EMAIL'"));
 	if (date_i == -2)
 		error(_("missing 'GIT_AUTHOR_DATE'"));
-	if (date_i < 0 || email_i < 0 || date_i < 0 || err)
+	if (name_i < 0 || email_i < 0 || date_i < 0 || err)
 		goto finish;
 	*name = kv.items[name_i].util;
 	*email = kv.items[email_i].util;
diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
new file mode 100755
index 0000000000..e68aac4b36
--- /dev/null
+++ b/t/t3438-rebase-broken-files.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+test_description='rebase behavior when on-disk files are broken'
+. ./test-lib.sh
+
+test_expect_success 'set up conflicting branches' '
+	test_commit base file &&
+	git checkout -b branch1 &&
+	test_commit one file &&
+	git checkout -b branch2 HEAD^ &&
+	test_commit two file
+'
+
+check_broken_author () {
+	title=$1; shift
+	script=$1; shift
+
+	test_expect_success "$title" '
+		# create conflicted state
+		test_when_finished "git rebase --abort" &&
+		git checkout -B tmp branch2 &&
+		test_must_fail git rebase branch1 &&
+
+		# break author-script
+		'"$script"' &&
+
+		# resolving notices broken author-script
+		echo resolved >file &&
+		git add file &&
+		test_must_fail git rebase --continue
+	'
+}
+
+for item in NAME EMAIL DATE
+do
+	check_broken_author "detect missing GIT_AUTHOR_$item" '
+		grep -v $item .git/rebase-merge/author-script >tmp &&
+		mv tmp .git/rebase-merge/author-script'
+done
+
+for item in NAME EMAIL DATE
+do
+	check_broken_author "detect duplicate GIT_AUTHOR_$item" '
+		grep -i $item .git/rebase-merge/author-script >tmp &&
+		cat tmp >>.git/rebase-merge/author-script'
+done
+
+check_broken_author 'unknown key in author-script' '
+	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
+		>>.git/rebase-merge/author-script'
+
+
+test_done
-- 
2.38.0.rc2.657.gc449b89570


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
@ 2022-10-03  9:29   ` Phillip Wood
  2022-10-03 17:15     ` Jeff King
  2022-10-03  9:40   ` Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Phillip Wood @ 2022-10-03  9:29 UTC (permalink / raw)
  To: Jeff King, Michael V. Scovetta; +Cc: git

Hi Peff

On 03/10/2022 09:45, Jeff King wrote:
> On Sun, Oct 02, 2022 at 11:39:16PM -0700, Michael V. Scovetta wrote:
> .
>> In commit 2a7d63a2, sequencer.c:912 looks like:
>> 912  if (name_i == -2)
>> 913      error(_("missing 'GIT_AUTHOR_NAME'"));
>> 914  if (email_i == -2)
>> 915      error(_("missing 'GIT_AUTHOR_EMAIL'"));
>> 916  if (date_i == -2)
>> 917      error(_("missing 'GIT_AUTHOR_DATE'"));
>> 918  if (date_i < 0 || email_i < 0 || date_i < 0 || err)    <-- date_i
>> is referenced here twice
>> 919      goto finish;
>>
>> I'm fairly sure that line 918 should be:
>> 918  if (name_i < 0 || email_i < 0 || date_i < 0 || err)
> 
> Agreed, but +cc Phillip as the original author.

Thanks, I must have got confused as I was typing.

>> I haven't validated this, but I suspect that if
>> `rebase-merge/author-script` contained two GIT_AUTHOR_NAME fields,
>> then name_i would be set to -1 (by the error function), but control
>> wouldn't flow to finish, but instead to line 920 ( *name =
>> kv.items[name_i].util; ) where it would attempt to access memory
>> slightly outside of items' memory space.
> 
> Correct. It also happens if GIT_AUTHOR_NAME is missing.
> 
>> I haven't been able to actually trigger the bug, but strongly suspect
>> I'm just not familiar enough with how rebasing works under the covers.
> 
> It's a little tricky, because we avoid writing and reading the
> author-script file unless necessary. An easy way to need it is to break
> with a conflict (which writes it), and then resume with "git rebase
> --continue" (which reads it back while committing).
> 
> Here's a patch to fix it. Thanks for your report!

Yes thank you for reporting this Michael

> -- >8 --
> Subject: sequencer: detect author name errors in read_author_script()
> 
> As we parse the author-script file, we check for missing or duplicate
> lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
> final error conditional checks "date_i" twice and "name_i" not at all.
> This not only leads to us failing to abort, but we may do an
> out-of-bounds read on the string_list array.
> 
> The bug goes back to 442c36bd08 (am: improve author-script error
> reporting, 2018-10-31), though the code was soon after moved to this
> spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
> It was presmably just a typo in 442c36bd08.

s/presmbly/presumably/

> We'll add test coverage for all the error cases here, though only the
> GIT_AUTHOR_NAME ones fail (even in a vanilla build they to segfault

Do you mean "they ought to segfault"?

Thanks for fixing this and going to the trouble of adding some tests.

Phillip

> consistently, but certainly with SANITIZE=address).
> 
> Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> The tests kind of feel like overkill, as this is such a specific
> condition and I doubt we'd regress to have the same bug twice. But it
> was nice at least to confirm the bug and the fix now.
> 
>   sequencer.c                    |  2 +-
>   t/t3438-rebase-broken-files.sh | 53 ++++++++++++++++++++++++++++++++++
>   2 files changed, 54 insertions(+), 1 deletion(-)
>   create mode 100755 t/t3438-rebase-broken-files.sh
> 
> diff --git a/sequencer.c b/sequencer.c
> index d26ede83c4..83e0425b04 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -915,7 +915,7 @@ int read_author_script(const char *path, char **name, char **email, char **date,
>   		error(_("missing 'GIT_AUTHOR_EMAIL'"));
>   	if (date_i == -2)
>   		error(_("missing 'GIT_AUTHOR_DATE'"));
> -	if (date_i < 0 || email_i < 0 || date_i < 0 || err)
> +	if (name_i < 0 || email_i < 0 || date_i < 0 || err)
>   		goto finish;
>   	*name = kv.items[name_i].util;
>   	*email = kv.items[email_i].util;
> diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
> new file mode 100755
> index 0000000000..e68aac4b36
> --- /dev/null
> +++ b/t/t3438-rebase-broken-files.sh
> @@ -0,0 +1,53 @@
> +#!/bin/sh
> +
> +test_description='rebase behavior when on-disk files are broken'
> +. ./test-lib.sh
> +
> +test_expect_success 'set up conflicting branches' '
> +	test_commit base file &&
> +	git checkout -b branch1 &&
> +	test_commit one file &&
> +	git checkout -b branch2 HEAD^ &&
> +	test_commit two file
> +'
> +
> +check_broken_author () {
> +	title=$1; shift
> +	script=$1; shift
> +
> +	test_expect_success "$title" '
> +		# create conflicted state
> +		test_when_finished "git rebase --abort" &&
> +		git checkout -B tmp branch2 &&
> +		test_must_fail git rebase branch1 &&
> +
> +		# break author-script
> +		'"$script"' &&
> +
> +		# resolving notices broken author-script
> +		echo resolved >file &&
> +		git add file &&
> +		test_must_fail git rebase --continue
> +	'
> +}
> +
> +for item in NAME EMAIL DATE
> +do
> +	check_broken_author "detect missing GIT_AUTHOR_$item" '
> +		grep -v $item .git/rebase-merge/author-script >tmp &&
> +		mv tmp .git/rebase-merge/author-script'
> +done
> +
> +for item in NAME EMAIL DATE
> +do
> +	check_broken_author "detect duplicate GIT_AUTHOR_$item" '
> +		grep -i $item .git/rebase-merge/author-script >tmp &&
> +		cat tmp >>.git/rebase-merge/author-script'
> +done
> +
> +check_broken_author 'unknown key in author-script' '
> +	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
> +		>>.git/rebase-merge/author-script'
> +
> +
> +test_done

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
  2022-10-03  9:29   ` Phillip Wood
@ 2022-10-03  9:40   ` Ævar Arnfjörð Bjarmason
  2022-10-03 17:27     ` Jeff King
  2022-10-03 16:34   ` Junio C Hamano
  2022-10-03 17:35   ` Jeff King
  3 siblings, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-03  9:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael V. Scovetta, Phillip Wood, git


On Mon, Oct 03 2022, Jeff King wrote:

> On Sun, Oct 02, 2022 at 11:39:16PM -0700, Michael V. Scovetta wrote:
>
>> In commit 2a7d63a2, sequencer.c:912 looks like:
>> 912  if (name_i == -2)
>> 913      error(_("missing 'GIT_AUTHOR_NAME'"));
>> 914  if (email_i == -2)
>> 915      error(_("missing 'GIT_AUTHOR_EMAIL'"));
>> 916  if (date_i == -2)
>> 917      error(_("missing 'GIT_AUTHOR_DATE'"));
>> 918  if (date_i < 0 || email_i < 0 || date_i < 0 || err)    <-- date_i
>> is referenced here twice
>> 919      goto finish;
>> 
>> I'm fairly sure that line 918 should be:
>> 918  if (name_i < 0 || email_i < 0 || date_i < 0 || err)
>
> Agreed, but +cc Phillip as the original author.
>
>> I haven't validated this, but I suspect that if
>> `rebase-merge/author-script` contained two GIT_AUTHOR_NAME fields,
>> then name_i would be set to -1 (by the error function), but control
>> wouldn't flow to finish, but instead to line 920 ( *name =
>> kv.items[name_i].util; ) where it would attempt to access memory
>> slightly outside of items' memory space.
>
> Correct. It also happens if GIT_AUTHOR_NAME is missing.
>
>> I haven't been able to actually trigger the bug, but strongly suspect
>> I'm just not familiar enough with how rebasing works under the covers.
>
> It's a little tricky, because we avoid writing and reading the
> author-script file unless necessary. An easy way to need it is to break
> with a conflict (which writes it), and then resume with "git rebase
> --continue" (which reads it back while committing).
>
> Here's a patch to fix it. Thanks for your report!
>
> -- >8 --
> Subject: sequencer: detect author name errors in read_author_script()
>
> As we parse the author-script file, we check for missing or duplicate
> lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
> final error conditional checks "date_i" twice and "name_i" not at all.
> This not only leads to us failing to abort, but we may do an
> out-of-bounds read on the string_list array.
>
> The bug goes back to 442c36bd08 (am: improve author-script error
> reporting, 2018-10-31), though the code was soon after moved to this
> spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
> It was presmably just a typo in 442c36bd08.
>
> We'll add test coverage for all the error cases here, though only the
> GIT_AUTHOR_NAME ones fail (even in a vanilla build they to segfault
> consistently, but certainly with SANITIZE=address).
>
> Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> The tests kind of feel like overkill, as this is such a specific
> condition and I doubt we'd regress to have the same bug twice. But it
> was nice at least to confirm the bug and the fix now.

Having a regression test never feels like overkill to me :)

>
>  sequencer.c                    |  2 +-
>  t/t3438-rebase-broken-files.sh | 53 ++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+), 1 deletion(-)
>  create mode 100755 t/t3438-rebase-broken-files.sh
>
> diff --git a/sequencer.c b/sequencer.c
> index d26ede83c4..83e0425b04 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -915,7 +915,7 @@ int read_author_script(const char *path, char **name, char **email, char **date,
>  		error(_("missing 'GIT_AUTHOR_EMAIL'"));
>  	if (date_i == -2)
>  		error(_("missing 'GIT_AUTHOR_DATE'"));
> -	if (date_i < 0 || email_i < 0 || date_i < 0 || err)
> +	if (name_i < 0 || email_i < 0 || date_i < 0 || err)
>  		goto finish;
>  	*name = kv.items[name_i].util;
>  	*email = kv.items[email_i].util;
> diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
> new file mode 100755
> index 0000000000..e68aac4b36
> --- /dev/null
> +++ b/t/t3438-rebase-broken-files.sh
> @@ -0,0 +1,53 @@
> +#!/bin/sh
> +
> +test_description='rebase behavior when on-disk files are broken'
> +. ./test-lib.sh
> +
> +test_expect_success 'set up conflicting branches' '
> +	test_commit base file &&
> +	git checkout -b branch1 &&
> +	test_commit one file &&
> +	git checkout -b branch2 HEAD^ &&
> +	test_commit two file
> +'
> +
> +check_broken_author () {
> +	title=$1; shift
> +	script=$1; shift
> +
> +	test_expect_success "$title" '
> +		# create conflicted state
> +		test_when_finished "git rebase --abort" &&
> +		git checkout -B tmp branch2 &&
> +		test_must_fail git rebase branch1 &&
> +
> +		# break author-script
> +		'"$script"' &&
> +
> +		# resolving notices broken author-script
> +		echo resolved >file &&
> +		git add file &&
> +		test_must_fail git rebase --continue
> +	'
> +}
> +
> +for item in NAME EMAIL DATE
> +do
> +	check_broken_author "detect missing GIT_AUTHOR_$item" '
> +		grep -v $item .git/rebase-merge/author-script >tmp &&
> +		mv tmp .git/rebase-merge/author-script'
> +done
> +
> +for item in NAME EMAIL DATE
> +do
> +	check_broken_author "detect duplicate GIT_AUTHOR_$item" '
> +		grep -i $item .git/rebase-merge/author-script >tmp &&
> +		cat tmp >>.git/rebase-merge/author-script'
> +done
> +
> +check_broken_author 'unknown key in author-script' '
> +	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
> +		>>.git/rebase-merge/author-script'
> +
> +
> +test_done

Maybe I just have too much PTSD from dealing with shell quoting issues
with this '"$script"" pattern when you need to pass arguments with
spaces in it, or even quotes. Although it probably won't ever be an
issue here.

But in this case just passing the "script" on stdin nicely avoids any
future quoting issues, maybe this would be good to squash in?:

diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
index e68aac4b36d..6911e145f08 100755
--- a/t/t3438-rebase-broken-files.sh
+++ b/t/t3438-rebase-broken-files.sh
@@ -12,17 +12,19 @@ test_expect_success 'set up conflicting branches' '
 '
 
 check_broken_author () {
-	title=$1; shift
-	script=$1; shift
+	title=$1 && shift &&
 
-	test_expect_success "$title" '
+	# Avoid quoting issues
+	write_script script.sh &&
+
+	test_expect_success "$title of '$script'" '
 		# create conflicted state
 		test_when_finished "git rebase --abort" &&
 		git checkout -B tmp branch2 &&
 		test_must_fail git rebase branch1 &&
 
-		# break author-script
-		'"$script"' &&
+		./script.sh >tmp &&
+		mv tmp .git/rebase-merge/author-script &&
 
 		# resolving notices broken author-script
 		echo resolved >file &&
@@ -33,21 +35,22 @@ check_broken_author () {
 
 for item in NAME EMAIL DATE
 do
-	check_broken_author "detect missing GIT_AUTHOR_$item" '
-		grep -v $item .git/rebase-merge/author-script >tmp &&
-		mv tmp .git/rebase-merge/author-script'
+	check_broken_author "detect missing GIT_AUTHOR_$item" <<-EOF
+	grep -v $item .git/rebase-merge/author-script
+	EOF
 done
 
 for item in NAME EMAIL DATE
 do
-	check_broken_author "detect duplicate GIT_AUTHOR_$item" '
-		grep -i $item .git/rebase-merge/author-script >tmp &&
-		cat tmp >>.git/rebase-merge/author-script'
+	check_broken_author "detect duplicate GIT_AUTHOR_$item" <<-EOF
+	cat .git/rebase-merge/author-script &&
+	grep -i $item .git/rebase-merge/author-script
+	EOF
 done
 
-check_broken_author 'unknown key in author-script' '
-	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
-		>>.git/rebase-merge/author-script'
-
+check_broken_author 'unknown key in author-script' <<-EOF
+cat .git/rebase-merge/author-script &&
+echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}"
+EOF
 
 test_done


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
  2022-10-03  9:29   ` Phillip Wood
  2022-10-03  9:40   ` Ævar Arnfjörð Bjarmason
@ 2022-10-03 16:34   ` Junio C Hamano
  2022-10-03 17:35   ` Jeff King
  3 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2022-10-03 16:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael V. Scovetta, Phillip Wood, git

Jeff King <peff@peff.net> writes:

> On Sun, Oct 02, 2022 at 11:39:16PM -0700, Michael V. Scovetta wrote:
>
>> In commit 2a7d63a2, sequencer.c:912 looks like:
>> 912  if (name_i == -2)
>> 913      error(_("missing 'GIT_AUTHOR_NAME'"));
>> 914  if (email_i == -2)
>> 915      error(_("missing 'GIT_AUTHOR_EMAIL'"));
>> 916  if (date_i == -2)
>> 917      error(_("missing 'GIT_AUTHOR_DATE'"));
>> 918  if (date_i < 0 || email_i < 0 || date_i < 0 || err)    <-- date_i
>> is referenced here twice
>> 919      goto finish;
>> 
>> I'm fairly sure that line 918 should be:
>> 918  if (name_i < 0 || email_i < 0 || date_i < 0 || err)
>
> Agreed, but +cc Phillip as the original author.
>
>> I haven't validated this, but I suspect that if
>> `rebase-merge/author-script` contained two GIT_AUTHOR_NAME fields,
>> then name_i would be set to -1 (by the error function), but control
>> wouldn't flow to finish, but instead to line 920 ( *name =
>> kv.items[name_i].util; ) where it would attempt to access memory
>> slightly outside of items' memory space.
>
> Correct. It also happens if GIT_AUTHOR_NAME is missing.
>
>> I haven't been able to actually trigger the bug, but strongly suspect
>> I'm just not familiar enough with how rebasing works under the covers.
>
> It's a little tricky, because we avoid writing and reading the
> author-script file unless necessary. An easy way to need it is to break
> with a conflict (which writes it), and then resume with "git rebase
> --continue" (which reads it back while committing).
>
> Here's a patch to fix it. Thanks for your report!

And thanks for your fix ;-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  9:29   ` Phillip Wood
@ 2022-10-03 17:15     ` Jeff King
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff King @ 2022-10-03 17:15 UTC (permalink / raw)
  To: phillip.wood; +Cc: Michael V. Scovetta, git

On Mon, Oct 03, 2022 at 10:29:24AM +0100, Phillip Wood wrote:

> > The bug goes back to 442c36bd08 (am: improve author-script error
> > reporting, 2018-10-31), though the code was soon after moved to this
> > spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
> > It was presmably just a typo in 442c36bd08.
> 
> s/presmbly/presumably/

Yep.

> > We'll add test coverage for all the error cases here, though only the
> > GIT_AUTHOR_NAME ones fail (even in a vanilla build they to segfault
> 
> Do you mean "they ought to segfault"?

I think it's just s/to//. I.e., "even in a vanilla build they segfault
consistently".

I rewrote that sentence a few times trying to make it as not-confusing
as possible, but managed to leave in an editing error anyway. ;)

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  9:40   ` Ævar Arnfjörð Bjarmason
@ 2022-10-03 17:27     ` Jeff King
  2022-10-03 17:39       ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2022-10-03 17:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Michael V. Scovetta, Phillip Wood, git

On Mon, Oct 03, 2022 at 11:40:52AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > +check_broken_author 'unknown key in author-script' '
> > +	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
> > +		>>.git/rebase-merge/author-script'
> > +
> > +
> > +test_done
> 
> Maybe I just have too much PTSD from dealing with shell quoting issues
> with this '"$script"" pattern when you need to pass arguments with
> spaces in it, or even quotes. Although it probably won't ever be an
> issue here.

Quoting-wise, it's the exact same as passing the snippet to
test_expect_success that we already do. Which, granted, is full of
horrors, but is well understood within our code base. The "$script" is
evaluated when we pass the snippet to test_expect_success, which sees
the expansion along with the rest of the script. The double-quotes
around it ensure that whitespace is retained.

There is one extra quirk here, which is that it must not end with a
newline, since we stick "&&" on it.

> +	# Avoid quoting issues
> +	write_script script.sh &&
> +
> +	test_expect_success "$title of '$script'" '
>  		# create conflicted state
>  		test_when_finished "git rebase --abort" &&
>  		git checkout -B tmp branch2 &&
>  		test_must_fail git rebase branch1 &&
>  
> -		# break author-script
> -		'"$script"' &&
> +		./script.sh >tmp &&
> +		mv tmp .git/rebase-merge/author-script &&

One of my goals was that the script be expanded inline so that
verbose-mode showed what the test was actually doing. But this hides it
behind a vague "script.sh". Maybe that's not important enough to merit
the admitted ugliness of the loop+function we have.

If we give that up, then a better option is probably to put the
before/after parts in functions, like:

diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
index e68aac4b36..b92a3ce46b 100755
--- a/t/t3438-rebase-broken-files.sh
+++ b/t/t3438-rebase-broken-files.sh
@@ -11,43 +11,49 @@ test_expect_success 'set up conflicting branches' '
 	test_commit two file
 '
 
-check_broken_author () {
-	title=$1; shift
-	script=$1; shift
-
-	test_expect_success "$title" '
-		# create conflicted state
-		test_when_finished "git rebase --abort" &&
-		git checkout -B tmp branch2 &&
-		test_must_fail git rebase branch1 &&
-
-		# break author-script
-		'"$script"' &&
-
-		# resolving notices broken author-script
-		echo resolved >file &&
-		git add file &&
-		test_must_fail git rebase --continue
-	'
+create_conflict () {
+	test_when_finished "git rebase --abort" &&
+	git checkout -B tmp branch2 &&
+	test_must_fail git rebase branch1
+}
+
+check_resolve_fails () {
+	echo resolved >file &&
+	git add file &&
+	test_must_fail git rebase --continue
 }
 
 for item in NAME EMAIL DATE
 do
-	check_broken_author "detect missing GIT_AUTHOR_$item" '
+	test_expect_success "detect missing GIT_AUTHOR_$item" '
+		create_conflict &&
+
 		grep -v $item .git/rebase-merge/author-script >tmp &&
-		mv tmp .git/rebase-merge/author-script'
+		mv tmp .git/rebase-merge/author-script &&
+
+		check_resolve_fails
+	'
 done
 
 for item in NAME EMAIL DATE
 do
-	check_broken_author "detect duplicate GIT_AUTHOR_$item" '
+	test_expect_success "detect duplicate GIT_AUTHOR_$item" '
+		create_conflict &&
+
 		grep -i $item .git/rebase-merge/author-script >tmp &&
-		cat tmp >>.git/rebase-merge/author-script'
+		cat tmp >>.git/rebase-merge/author-script &&
+
+		check_resolve_fails
+	'
 done
 
-check_broken_author 'unknown key in author-script' '
+test_expect_success 'unknown key in author-script' '
+	create_conflict &&
+
 	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
-		>>.git/rebase-merge/author-script'
+		>>.git/rebase-merge/author-script &&
 
+	check_resolve_fails
+'
 
 test_done

That makes the boilerplate shorter in the "-v" output but focuses on the
actual modification that breaks the author-script.

-Peff

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
                     ` (2 preceding siblings ...)
  2022-10-03 16:34   ` Junio C Hamano
@ 2022-10-03 17:35   ` Jeff King
  2022-10-03 18:07     ` Junio C Hamano
  3 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2022-10-03 17:35 UTC (permalink / raw)
  To: Michael V. Scovetta
  Cc: Phillip Wood, git, Ævar Arnfjörð Bjarmason,
	Junio C Hamano

On Mon, Oct 03, 2022 at 04:45:06AM -0400, Jeff King wrote:

> > I haven't been able to actually trigger the bug, but strongly suspect
> > I'm just not familiar enough with how rebasing works under the covers.
> 
> It's a little tricky, because we avoid writing and reading the
> author-script file unless necessary. An easy way to need it is to break
> with a conflict (which writes it), and then resume with "git rebase
> --continue" (which reads it back while committing).
> 
> Here's a patch to fix it. Thanks for your report!

And here's a v2 based on the feedback received. The one-liner fix is the
same, but it tries to be a little less clever when constructing the
tests and has 200% more typo fixes in the commit message. Thanks Phillip
and Ævar for review.

-- >8 --
Subject: [PATCH] sequencer: detect author name errors in read_author_script()

As we parse the author-script file, we check for missing or duplicate
lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our
final error conditional checks "date_i" twice and "name_i" not at all.
This not only leads to us failing to abort, but we may do an
out-of-bounds read on the string_list array.

The bug goes back to 442c36bd08 (am: improve author-script error
reporting, 2018-10-31), though the code was soon after moved to this
spot by bcd33ec25f (add read_author_script() to libgit, 2018-10-31).
It was presumably just a typo in 442c36bd08.

We'll add test coverage for all the error cases here, though only the
GIT_AUTHOR_NAME ones fail (even in a vanilla build they segfault
consistently, but certainly with SANITIZE=address).

Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 sequencer.c                    |  2 +-
 t/t3438-rebase-broken-files.sh | 59 ++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)
 create mode 100755 t/t3438-rebase-broken-files.sh

diff --git a/sequencer.c b/sequencer.c
index d26ede83c4..83e0425b04 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -915,7 +915,7 @@ int read_author_script(const char *path, char **name, char **email, char **date,
 		error(_("missing 'GIT_AUTHOR_EMAIL'"));
 	if (date_i == -2)
 		error(_("missing 'GIT_AUTHOR_DATE'"));
-	if (date_i < 0 || email_i < 0 || date_i < 0 || err)
+	if (name_i < 0 || email_i < 0 || date_i < 0 || err)
 		goto finish;
 	*name = kv.items[name_i].util;
 	*email = kv.items[email_i].util;
diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
new file mode 100755
index 0000000000..b92a3ce46b
--- /dev/null
+++ b/t/t3438-rebase-broken-files.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description='rebase behavior when on-disk files are broken'
+. ./test-lib.sh
+
+test_expect_success 'set up conflicting branches' '
+	test_commit base file &&
+	git checkout -b branch1 &&
+	test_commit one file &&
+	git checkout -b branch2 HEAD^ &&
+	test_commit two file
+'
+
+create_conflict () {
+	test_when_finished "git rebase --abort" &&
+	git checkout -B tmp branch2 &&
+	test_must_fail git rebase branch1
+}
+
+check_resolve_fails () {
+	echo resolved >file &&
+	git add file &&
+	test_must_fail git rebase --continue
+}
+
+for item in NAME EMAIL DATE
+do
+	test_expect_success "detect missing GIT_AUTHOR_$item" '
+		create_conflict &&
+
+		grep -v $item .git/rebase-merge/author-script >tmp &&
+		mv tmp .git/rebase-merge/author-script &&
+
+		check_resolve_fails
+	'
+done
+
+for item in NAME EMAIL DATE
+do
+	test_expect_success "detect duplicate GIT_AUTHOR_$item" '
+		create_conflict &&
+
+		grep -i $item .git/rebase-merge/author-script >tmp &&
+		cat tmp >>.git/rebase-merge/author-script &&
+
+		check_resolve_fails
+	'
+done
+
+test_expect_success 'unknown key in author-script' '
+	create_conflict &&
+
+	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
+		>>.git/rebase-merge/author-script &&
+
+	check_resolve_fails
+'
+
+test_done
-- 
2.38.0.rc2.659.g50b8df2659


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03 17:27     ` Jeff King
@ 2022-10-03 17:39       ` Jeff King
  2022-10-03 18:05         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2022-10-03 17:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Michael V. Scovetta, Phillip Wood, git

On Mon, Oct 03, 2022 at 01:27:37PM -0400, Jeff King wrote:

> -check_broken_author 'unknown key in author-script' '
> +test_expect_success 'unknown key in author-script' '
> +	create_conflict &&
> +
>  	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
> -		>>.git/rebase-merge/author-script'
> +		>>.git/rebase-merge/author-script &&
>  
> +	check_resolve_fails
> +'
>  
>  test_done
> 
> That makes the boilerplate shorter in the "-v" output but focuses on the
> actual modification that breaks the author-script.

Note that we do still keep the ${SQ} bits here. They're necessary for
the same reason: before and after a snippet is being passed through a
variable. Whereas in yours we'd use stdin. I _do_ like that approach in
general, but it is unlike the rest of the test suite. Maybe it's worth
resurrecting:

  https://lore.kernel.org/git/YHDUg6ZR5vu93kGm@coredump.intra.peff.net/

?

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03 17:39       ` Jeff King
@ 2022-10-03 18:05         ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2022-10-03 18:05 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Michael V. Scovetta,
	Phillip Wood, git

Jeff King <peff@peff.net> writes:

> On Mon, Oct 03, 2022 at 01:27:37PM -0400, Jeff King wrote:
>
>> -check_broken_author 'unknown key in author-script' '
>> +test_expect_success 'unknown key in author-script' '
>> +	create_conflict &&
>> +
>>  	echo "GIT_AUTHOR_BOGUS=${SQ}whatever${SQ}" \
>> -		>>.git/rebase-merge/author-script'
>> +		>>.git/rebase-merge/author-script &&
>>  
>> +	check_resolve_fails
>> +'
>>  
>>  test_done
>> 
>> That makes the boilerplate shorter in the "-v" output but focuses on the
>> actual modification that breaks the author-script.
>
> Note that we do still keep the ${SQ} bits here. They're necessary for
> the same reason: before and after a snippet is being passed through a
> variable. Whereas in yours we'd use stdin. I _do_ like that approach in
> general, but it is unlike the rest of the test suite. Maybe it's worth
> resurrecting:
>
>   https://lore.kernel.org/git/YHDUg6ZR5vu93kGm@coredump.intra.peff.net/
>
> ?

Yeah, it is indeed quite tempting.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sequencer: detect author name errors in read_author_script()
  2022-10-03 17:35   ` Jeff King
@ 2022-10-03 18:07     ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2022-10-03 18:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Michael V. Scovetta, Phillip Wood, git,
	Ævar Arnfjörð Bjarmason

Jeff King <peff@peff.net> writes:

Jeff King <peff@peff.net> writes:

> On Mon, Oct 03, 2022 at 04:45:06AM -0400, Jeff King wrote:
>
>> > I haven't been able to actually trigger the bug, but strongly suspect
>> > I'm just not familiar enough with how rebasing works under the covers.
>> 
>> It's a little tricky, because we avoid writing and reading the
>> author-script file unless necessary. An easy way to need it is to break
>> with a conflict (which writes it), and then resume with "git rebase
>> --continue" (which reads it back while committing).
>> 
>> Here's a patch to fix it. Thanks for your report!
>
> And here's a v2 based on the feedback received. The one-liner fix is the
> same, but it tries to be a little less clever when constructing the
> tests and has 200% more typo fixes in the commit message. Thanks Phillip
> and Ævar for review.

The create-conflict & try-resolve pair does make it easier to see
what goes on in these tests.

Will queue.

Thanks.  

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-10-03 18:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03  6:39 Bug Report: Duplicate condition in read_author_script (sequencer.c) Michael V. Scovetta
2022-10-03  8:45 ` [PATCH] sequencer: detect author name errors in read_author_script() Jeff King
2022-10-03  9:29   ` Phillip Wood
2022-10-03 17:15     ` Jeff King
2022-10-03  9:40   ` Ævar Arnfjörð Bjarmason
2022-10-03 17:27     ` Jeff King
2022-10-03 17:39       ` Jeff King
2022-10-03 18:05         ` Junio C Hamano
2022-10-03 16:34   ` Junio C Hamano
2022-10-03 17:35   ` Jeff King
2022-10-03 18:07     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).