git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/4] Fix issues and a regression noted by valgrind
@ 2022-04-21 20:14 Ævar Arnfjörð Bjarmason
  2022-04-21 20:14 ` [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind Ævar Arnfjörð Bjarmason
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-21 20:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Johannes Schindelin,
	Hans Jerry Illikainen, Ævar Arnfjörð Bjarmason

I spotted these when testing v2.36.0, but waited until after the
release since there's no v2.36.0 regressions here.

The scariest issue here by far is one I caused and am now fixing in
4/4. As noted I don't think it had any real practical fallout, but we
should obviously fix it sooner than later.

I think the 3/4 issue is likewise just a bit scary, but I haven't
analyzed it as much. In that case we'd pass uninitialized memory to a
syscall.

It's clear that the --valgrind mode for the test suite is useful, but
it's sooo sloooow (it takes around a full day to complete, and for me
the full tests otherwise run in ~2 minutes). I'll try to run it in the
background with some regularity going forward anyway.

Aside: I do see that there's occasionaly FreeBSD runs in the "main"
git/git CI. How are those hooked up? Running it in GitHub CI would
take approximately forever (I think it would run into a hard timeout),
but if we can hook up a CI runner on some dedicated hardware.

Ævar Arnfjörð Bjarmason (4):
  tests: make RUNTIME_PREFIX compatible with --valgrind
  log test: skip a failing mkstemp() test under valgrind
  commit-graph.c: don't assume that stat() succeeds
  object-file: fix a unpack_loose_header() regression in 3b6a8db3b03

 commit-graph.c        |  6 ++++--
 object-file.c         |  8 ++++++--
 t/t0060-path-utils.sh |  4 ++--
 t/t1006-cat-file.sh   | 10 ++++++++--
 t/t1450-fsck.sh       | 13 +++++++++++--
 t/t4202-log.sh        | 11 +++++++----
 t/test-lib.sh         |  1 +
 7 files changed, 39 insertions(+), 14 deletions(-)

-- 
2.36.0.879.gd068ac2c328


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind
  2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
@ 2022-04-21 20:14 ` Ævar Arnfjörð Bjarmason
  2022-04-21 22:22   ` Junio C Hamano
  2022-04-21 20:14 ` [PATCH 2/4] log test: skip a failing mkstemp() test under valgrind Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-21 20:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Johannes Schindelin,
	Hans Jerry Illikainen, Ævar Arnfjörð Bjarmason

Fix a regression in b7d11a0f5d2 (tests: exercise the RUNTIME_PREFIX
feature, 2021-07-24) where tests that want to set up and test a "git"
wrapper in $PATH conflicted with the t/bin/valgrind wrapper(s) doing
the same.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0060-path-utils.sh | 4 ++--
 t/test-lib.sh         | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 2fe6ae6a4e5..aa35350b6f3 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -542,7 +542,7 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
-test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
 	mkdir -p pretend/bin pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
 	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
@@ -550,7 +550,7 @@ test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
 	echo HERE >expect &&
 	test_cmp expect actual'
 
-test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
 	mkdir -p pretend/bin &&
 	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 531cef097db..7f3d323e937 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1666,6 +1666,7 @@ test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
+test -n "$GIT_VALGRIND_ENABLED" && test_set_prereq VALGRIND
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.36.0.879.gd068ac2c328


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/4] log test: skip a failing mkstemp() test under valgrind
  2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
  2022-04-21 20:14 ` [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind Ævar Arnfjörð Bjarmason
@ 2022-04-21 20:14 ` Ævar Arnfjörð Bjarmason
  2022-04-21 20:14 ` [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-21 20:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Johannes Schindelin,
	Hans Jerry Illikainen, Ævar Arnfjörð Bjarmason

Skip a test added in f1e3df31699 (t: increase test coverage of
signature verification output, 2020-03-04) when running under
valgrind. Due to valgrind's interception of mkstemp() this test will
fail with:

	+ pwd
	+ TMPDIR=[...]/t/trash directory.t4202-log/bogus git log --show-signature -n1 plain-fail
	==7696== VG_(mkstemp): failed to create temp file: [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_d545ddcf
	[... 10 more similar lines omitted ..]
	valgrind: Startup or configuration error:
	valgrind:    Can't create client cmdline file in [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_6e542d1d
	valgrind: Unable to start up properly.  Giving up.
	error: last command exited with $?=1

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index be07407f855..6e663525582 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1992,10 +1992,13 @@ test_expect_success GPG 'log --show-signature for merged tag with GPG failure' '
 	git tag -s -m signed_tag_msg signed_tag_fail &&
 	git checkout plain-fail &&
 	git merge --no-ff -m msg signed_tag_fail &&
-	TMPDIR="$(pwd)/bogus" git log --show-signature -n1 plain-fail >actual &&
-	grep "^merged tag" actual &&
-	grep "^No signature" actual &&
-	! grep "^gpg: Signature made" actual
+	if ! test_have_prereq VALGRIND
+	then
+		TMPDIR="$(pwd)/bogus" git log --show-signature -n1 plain-fail >actual &&
+		grep "^merged tag" actual &&
+		grep "^No signature" actual &&
+		! grep "^gpg: Signature made" actual
+	fi
 '
 
 test_expect_success GPGSM 'log --graph --show-signature for merged tag x509' '
-- 
2.36.0.879.gd068ac2c328


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds
  2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
  2022-04-21 20:14 ` [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind Ævar Arnfjörð Bjarmason
  2022-04-21 20:14 ` [PATCH 2/4] log test: skip a failing mkstemp() test under valgrind Ævar Arnfjörð Bjarmason
@ 2022-04-21 20:14 ` Ævar Arnfjörð Bjarmason
  2022-04-21 22:29   ` Junio C Hamano
  2022-04-21 20:14 ` [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Ævar Arnfjörð Bjarmason
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
  4 siblings, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-21 20:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Johannes Schindelin,
	Hans Jerry Illikainen, Ævar Arnfjörð Bjarmason

Fix code added in 8d84097f965 (commit-graph: expire commit-graph
files, 2019-06-18) to check the return value of the stat() system
call. Not doing so caused us to use uninitialized memory in the "Bloom
generation is limited by --max-new-filters" test in
t4216-log-bloom.sh:

	+ rm -f trace.event
	+ pwd
	+ GIT_TRACE2_EVENT=[...]/t/trash directory.t4216-log-bloom/limits/trace.event git commit-graph write --reachable --split=replace --changed-paths --max-new-filters=2
	==24835== Syscall param utimensat(times[0].tv_sec) points to uninitialised byte(s)
	==24835==    at 0x499E65A: __utimensat64_helper (utimensat.c:34)
	==24835==    by 0x4999142: utime (utime.c:36)
	==24835==    by 0x552BE0: mark_commit_graphs (commit-graph.c:2213)
	==24835==    by 0x550822: write_commit_graph (commit-graph.c:2424)
	==24835==    by 0x54E3A0: write_commit_graph_reachable (commit-graph.c:1681)
	==24835==    by 0x4374BB: graph_write (commit-graph.c:269)
	==24835==    by 0x436F7D: cmd_commit_graph (commit-graph.c:326)
	==24835==    by 0x407B9A: run_builtin (git.c:465)
	==24835==    by 0x406651: handle_builtin (git.c:719)
	==24835==    by 0x407575: run_argv (git.c:786)
	==24835==    by 0x406410: cmd_main (git.c:917)
	==24835==    by 0x511F09: main (common-main.c:56)
	==24835==  Address 0x1ffeffde70 is on thread 1's stack
	==24835==  in frame #1, created by utime (utime.c:25)
	==24835==  Uninitialised value was created by a stack allocation
	==24835==    at 0x552B50: mark_commit_graphs (commit-graph.c:2201)
	==24835==
	[...]
	error: last command exited with $?=126
	not ok 137 - Bloom generation is limited by --max-new-filters

This would happen as we stat'd the non-existing
".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs()
to check the statu() return value, and while we're at it fix another
case added in the same commit to do the same.

The caller in expire_commit_graphs() would have been less likely to
run into this, as it's operating on files it just got from readdir(),
but it could still happen due to a race with e.g. a concurrent "rm
-rf" of the commit-graph files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 441b36016ba..2b528187316 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2206,7 +2206,8 @@ static void mark_commit_graphs(struct write_commit_graph_context *ctx)
 		struct stat st;
 		struct utimbuf updated_time;
 
-		stat(ctx->commit_graph_filenames_before[i], &st);
+		if (stat(ctx->commit_graph_filenames_before[i], &st) < 0)
+			continue;
 
 		updated_time.actime = st.st_atime;
 		updated_time.modtime = now;
@@ -2247,7 +2248,8 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
 		strbuf_setlen(&path, dirnamelen);
 		strbuf_addstr(&path, de->d_name);
 
-		stat(path.buf, &st);
+		if (stat(path.buf, &st) < 0)
+			continue;
 
 		if (st.st_mtime > expire_time)
 			continue;
-- 
2.36.0.879.gd068ac2c328


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2022-04-21 20:14 ` [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds Ævar Arnfjörð Bjarmason
@ 2022-04-21 20:14 ` Ævar Arnfjörð Bjarmason
  2022-04-21 22:39   ` Junio C Hamano
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
  4 siblings, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-21 20:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Johannes Schindelin,
	Hans Jerry Illikainen, Ævar Arnfjörð Bjarmason

Fix a regression in my 3b6a8db3b03 (object-file.c: use "enum" return
type for unpack_loose_header(), 2021-10-01) revealed both by running
the test suite with --valgrind, and with the amended "git fsck" test.

In practice this regression in v2.34.0 caused us to claim that we
couldn't parse the header, as opposed to not being able to unpack
it. Before the change in the C code the test_cmp added here would emit:

	-error: unable to unpack header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
	+error: unable to parse header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391

I.e. we'd proceed to call parse_loose_header() on the uninitialized
"hdr" value, and it would have been very unlikely for that
uninitialized memory to be a valid git object.

The other callers of unpack_loose_header() were already checking the
enum values exhaustively. See 3b6a8db3b03 and
5848fb11acd (object-file.c: return ULHR_TOO_LONG on "header too long",
2021-10-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-file.c       |  8 ++++++--
 t/t1006-cat-file.sh | 10 ++++++++--
 t/t1450-fsck.sh     | 13 +++++++++++--
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 5ffbf3d4fd4..b5d1d12b68a 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2623,8 +2623,12 @@ int read_loose_object(const char *path,
 		goto out;
 	}
 
-	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
-				NULL) < 0) {
+	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
+				    NULL)) {
+	case ULHR_OK:
+		break;
+	case ULHR_BAD:
+	case ULHR_TOO_LONG:
 		error(_("unable to unpack header of %s"), path);
 		goto out;
 	}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 1b852076944..dadf3b14583 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -681,7 +681,7 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
 
 		# Setup and create the empty blob and its path
 		empty_path=$(git rev-parse --git-path objects/$(test_oid_to_path "$EMPTY_BLOB")) &&
-		git hash-object -w --stdin </dev/null &&
+		empty_blob=$(git hash-object -w --stdin </dev/null) &&
 
 		# Create another blob and its path
 		echo other >other.blob &&
@@ -722,7 +722,13 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
 		# content out as-is. Try to make it zlib-invalid.
 		mv -f other.blob "$empty_path" &&
 		test_must_fail git fsck 2>err.fsck &&
-		grep "^error: inflate: data stream error (" err.fsck
+		cat >expect <<-EOF &&
+		error: inflate: data stream error (incorrect header check)
+		error: unable to unpack header of ./$empty_path
+		error: $empty_blob: object corrupt or missing: ./$empty_path
+		EOF
+		grep "^error: " err.fsck >actual &&
+		test_cmp expect actual
 	)
 '
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index de50c0ea018..ab7f31f1dcd 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -774,10 +774,19 @@ test_expect_success 'fsck finds problems in duplicate loose objects' '
 		# no "-d" here, so we end up with duplicates
 		git repack &&
 		# now corrupt the loose copy
-		file=$(sha1_file "$(git rev-parse HEAD)") &&
+		oid="$(git rev-parse HEAD)" &&
+		file=$(sha1_file "$oid") &&
 		rm "$file" &&
 		echo broken >"$file" &&
-		test_must_fail git fsck
+		test_must_fail git fsck 2>err &&
+
+		cat >expect <<-EOF &&
+		error: inflate: data stream error (incorrect header check)
+		error: unable to unpack header of $file
+		error: $oid: object corrupt or missing: $file
+		EOF
+		grep "^error: " err >actual &&
+		test_cmp expect actual
 	)
 '
 
-- 
2.36.0.879.gd068ac2c328


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind
  2022-04-21 20:14 ` [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind Ævar Arnfjörð Bjarmason
@ 2022-04-21 22:22   ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-04-21 22:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Derrick Stolee, Johannes Schindelin, Hans Jerry Illikainen

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Subject: Re: [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind

The patch text looks more like 

	tests: --valgrind does not work with RUNTIME_PREFIX

to me.  Is it true that a test that only works with RUNTIME_PREFIX
does not work under --valgrind at all?  I am wondering if we can do
this without introducing a new prerequisite, and if it makes sense
to do so if it can be done.  The output from

    $ git grep 'test_expect_success .*RUNTIME_PREFIX' t/

tells me that there are two test pieces that must be run under
RUNTIME_PREFIX, and this patch touches both of them.

I guess it is not RUNTIME_PREFIX build itself per-se, but it is
the way we test RUNTIME_PREFIX is incompatible with how we run our
tests under --valgrind, so neither the title on the Subject: header
of this message or "looks more like" above is a good one.  It is
more like

	tests: using custom GIT_EXEC_PATH breaks --valgrind tests

I think.

And after having looked at the patch text and thought about the
issues enough to come up with the above updated title, I can say
that the change looks quite reasonable, both the body of the
proposed log message and the solution.  Nicely done.

It would have been even nicer if I didn't have to think about the
issues myself, though ;-)

Thanks.

> Fix a regression in b7d11a0f5d2 (tests: exercise the RUNTIME_PREFIX
> feature, 2021-07-24) where tests that want to set up and test a "git"
> wrapper in $PATH conflicted with the t/bin/valgrind wrapper(s) doing
> the same.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  t/t0060-path-utils.sh | 4 ++--
>  t/test-lib.sh         | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> index 2fe6ae6a4e5..aa35350b6f3 100755
> --- a/t/t0060-path-utils.sh
> +++ b/t/t0060-path-utils.sh
> @@ -542,7 +542,7 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
>  	./git rev-parse
>  '
>  
> -test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
> +test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
>  	mkdir -p pretend/bin pretend/libexec/git-core &&
>  	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
>  	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
> @@ -550,7 +550,7 @@ test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
>  	echo HERE >expect &&
>  	test_cmp expect actual'
>  
> -test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
> +test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
>  	mkdir -p pretend/bin &&
>  	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
>  	git config yes.path "%(prefix)/yes" &&
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 531cef097db..7f3d323e937 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1666,6 +1666,7 @@ test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
>  test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
>  test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
>  test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
> +test -n "$GIT_VALGRIND_ENABLED" && test_set_prereq VALGRIND
>  
>  if test -z "$GIT_TEST_CHECK_CACHE_TREE"
>  then

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds
  2022-04-21 20:14 ` [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds Ævar Arnfjörð Bjarmason
@ 2022-04-21 22:29   ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-04-21 22:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Derrick Stolee, Johannes Schindelin, Hans Jerry Illikainen

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix code added in 8d84097f965 (commit-graph: expire commit-graph
> files, 2019-06-18) to check the return value of the stat() system
> call. Not doing so caused us to use uninitialized memory in the "Bloom
> generation is limited by --max-new-filters" test in
> t4216-log-bloom.sh:
>
> 	+ rm -f trace.event
> 	+ pwd
> 	+ GIT_TRACE2_EVENT=[...]/t/trash directory.t4216-log-bloom/limits/trace.event git commit-graph write --reachable --split=replace --changed-paths --max-new-filters=2
> 	==24835== Syscall param utimensat(times[0].tv_sec) points to uninitialised byte(s)
> 	==24835==    at 0x499E65A: __utimensat64_helper (utimensat.c:34)
> 	==24835==    by 0x4999142: utime (utime.c:36)
> 	==24835==    by 0x552BE0: mark_commit_graphs (commit-graph.c:2213)
> 	==24835==    by 0x550822: write_commit_graph (commit-graph.c:2424)
> 	==24835==    by 0x54E3A0: write_commit_graph_reachable (commit-graph.c:1681)
> 	==24835==    by 0x4374BB: graph_write (commit-graph.c:269)
> 	==24835==    by 0x436F7D: cmd_commit_graph (commit-graph.c:326)
> 	==24835==    by 0x407B9A: run_builtin (git.c:465)
> 	==24835==    by 0x406651: handle_builtin (git.c:719)
> 	==24835==    by 0x407575: run_argv (git.c:786)
> 	==24835==    by 0x406410: cmd_main (git.c:917)
> 	==24835==    by 0x511F09: main (common-main.c:56)
> 	==24835==  Address 0x1ffeffde70 is on thread 1's stack
> 	==24835==  in frame #1, created by utime (utime.c:25)
> 	==24835==  Uninitialised value was created by a stack allocation
> 	==24835==    at 0x552B50: mark_commit_graphs (commit-graph.c:2201)
> 	==24835==
> 	[...]
> 	error: last command exited with $?=126
> 	not ok 137 - Bloom generation is limited by --max-new-filters
>
> This would happen as we stat'd the non-existing
> ".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs()
> to check the statu() return value, and while we're at it fix another

"statu" -> "stat".

> case added in the same commit to do the same.
>
> The caller in expire_commit_graphs() would have been less likely to
> run into this, as it's operating on files it just got from readdir(),
> but it could still happen due to a race with e.g. a concurrent "rm
> -rf" of the commit-graph files.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  commit-graph.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 441b36016ba..2b528187316 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2206,7 +2206,8 @@ static void mark_commit_graphs(struct write_commit_graph_context *ctx)
>  		struct stat st;
>  		struct utimbuf updated_time;
>  
> -		stat(ctx->commit_graph_filenames_before[i], &st);
> +		if (stat(ctx->commit_graph_filenames_before[i], &st) < 0)
> +			continue;
>  
>  		updated_time.actime = st.st_atime;
>  		updated_time.modtime = now;

If we think a commit-graph file should exist and it turns out that
the file is gone, we cannot update the file's timestamps, so we need
to do something about it, but is "continue" the most sensible thing,
I have to wonder?

Of course, missing file is not the only reason not being able to
stat(); what I am getting at is if this should be warned to the
user, or perhaps something like "warn unless ENOENT".

> @@ -2247,7 +2248,8 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
>  		strbuf_setlen(&path, dirnamelen);
>  		strbuf_addstr(&path, de->d_name);
>  
> -		stat(path.buf, &st);
> +		if (stat(path.buf, &st) < 0)
> +			continue;

Ditto about "should we warn".

>  		if (st.st_mtime > expire_time)
>  			continue;

But just "continue" is the change with least impact.  In either of
these codepaths, we blindly continued and ran utime() or unlink()
without checking their return status, but we now refrain from
touching them when we cannot stat().

So, I guess the only actionable thing that needs fixing in this
patch is the "statu" typo.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-04-21 20:14 ` [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Ævar Arnfjörð Bjarmason
@ 2022-04-21 22:39   ` Junio C Hamano
  2022-04-22  8:21     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2022-04-21 22:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Derrick Stolee, Johannes Schindelin, Hans Jerry Illikainen

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> -	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> -				NULL) < 0) {
> +	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> +				    NULL)) {
> +	case ULHR_OK:
> +		break;
> +	case ULHR_BAD:
> +	case ULHR_TOO_LONG:
>  		error(_("unable to unpack header of %s"), path);
>  		goto out;
>  	}

Sigh, well spotted.  This is why I hate the application of "enum is
better, let's rewrite the 'negative is error, 0 is good' with it"
and other dogmatic "clean-up" that touch everywhere in the codebase.

Now because it is ULHR_OK or everything else that is an error, I think
the fix should be

	if (unpack_loose_header(...) != ULHR_OK) {
		error(...);
		goto out;
	}

It would also be much closer in spirit to the original code before
the "enum" change broke it.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-04-21 22:39   ` Junio C Hamano
@ 2022-04-22  8:21     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-22  8:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Derrick Stolee, Johannes Schindelin, Hans Jerry Illikainen


On Thu, Apr 21 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> -	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
>> -				NULL) < 0) {
>> +	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
>> +				    NULL)) {
>> +	case ULHR_OK:
>> +		break;
>> +	case ULHR_BAD:
>> +	case ULHR_TOO_LONG:
>>  		error(_("unable to unpack header of %s"), path);
>>  		goto out;
>>  	}
>
> Sigh, well spotted.  This is why I hate the application of "enum is
> better, let's rewrite the 'negative is error, 0 is good' with it"
> and other dogmatic "clean-up" that touch everywhere in the codebase.

While this is squarely my fault, I'm FWIW not as dogmatic on that point
as you think. I initially made a new error state a -2, and got feedback
on the series that that was too magical, then ended up turning it into
an enum and missed this callsite.

I think it's less that enums are bad in this case, as it's probably
sensible to consistently use negative values for error states.

> Now because it is ULHR_OK or everything else that is an error, I think
> the fix should be
>
> 	if (unpack_loose_header(...) != ULHR_OK) {
> 		error(...);
> 		goto out;
> 	}
>
> It would also be much closer in spirit to the original code before
> the "enum" change broke it.

We have two other callers of the API using the exhaustive enumeration
pattern, so by doing this we'd have the compiler miss this callsite if
another label is added.

It could be refactored etc., but I think this change as-is is the most
minimal & least invasive. I.e. it just adjusts this one caller to match
the other ones, we could also refactor the interface & pattern we use to
call it.

But if we're doing that I don't see the benefit of doing it for just one
caller, and if we're doing it for all of them surely that's better as
some follow-up cleanup...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 0/4] test fixes around valgrind
  2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2022-04-21 20:14 ` [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Ævar Arnfjörð Bjarmason
@ 2022-05-12 22:32 ` Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 1/4] tests: using custom GIT_EXEC_PATH breaks --valgrind tests Junio C Hamano
                     ` (3 more replies)
  4 siblings, 4 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:32 UTC (permalink / raw)
  To: git

Updates the valgrind test framework a bit, and then fixes a few
issues valgrind has found.

Essentailly the same as the previously posted version, but those
suggested fixes squashed in.

I am sending this out as a final "complain now, or this will go to
'next' soonish" warning.  The "What's cooking" report is getting
crowded with too many topics marked as "Expecting a reroll", and I'm
trying to do easier ones myself to see how much reduction we can
make.

Ævar Arnfjörð Bjarmason (4):
  tests: using custom GIT_EXEC_PATH breaks --valgrind tests
  log test: skip a failing mkstemp() test under valgrind
  commit-graph.c: don't assume that stat() succeeds
  object-file: fix a unpack_loose_header() regression in 3b6a8db3b03

 commit-graph.c        |  6 ++++--
 object-file.c         |  8 ++++++--
 t/t0060-path-utils.sh |  4 ++--
 t/t1006-cat-file.sh   | 10 ++++++++--
 t/t1450-fsck.sh       | 13 +++++++++++--
 t/t4202-log.sh        | 11 +++++++----
 t/test-lib.sh         |  1 +
 7 files changed, 39 insertions(+), 14 deletions(-)


1:  4ebaddb01d ! 1:  3b79af1dde tests: make RUNTIME_PREFIX compatible with --valgrind
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    tests: make RUNTIME_PREFIX compatible with --valgrind
    +    tests: using custom GIT_EXEC_PATH breaks --valgrind tests
     
         Fix a regression in b7d11a0f5d2 (tests: exercise the RUNTIME_PREFIX
         feature, 2021-07-24) where tests that want to set up and test a "git"
2:  b25609d6fc = 2:  35a7226706 log test: skip a failing mkstemp() test under valgrind
3:  d56edb6654 ! 3:  84d1793178 commit-graph.c: don't assume that stat() succeeds
    @@ Commit message
     
         This would happen as we stat'd the non-existing
         ".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs()
    -    to check the statu() return value, and while we're at it fix another
    +    to check the stat()'s return value, and while we're at it fix another
         case added in the same commit to do the same.
     
         The caller in expire_commit_graphs() would have been less likely to
4:  995cfb0439 = 4:  391ae0a276 object-file: fix a unpack_loose_header() regression in 3b6a8db3b03



-- 
2.36.1-338-g1c7f76a54c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/4] tests: using custom GIT_EXEC_PATH breaks --valgrind tests
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
@ 2022-05-12 22:32   ` Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 2/4] log test: skip a failing mkstemp() test under valgrind Junio C Hamano
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:32 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Fix a regression in b7d11a0f5d2 (tests: exercise the RUNTIME_PREFIX
feature, 2021-07-24) where tests that want to set up and test a "git"
wrapper in $PATH conflicted with the t/bin/valgrind wrapper(s) doing
the same.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0060-path-utils.sh | 4 ++--
 t/test-lib.sh         | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 2fe6ae6a4e..aa35350b6f 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -542,7 +542,7 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
-test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
 	mkdir -p pretend/bin pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
 	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
@@ -550,7 +550,7 @@ test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
 	echo HERE >expect &&
 	test_cmp expect actual'
 
-test_expect_success RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
 	mkdir -p pretend/bin &&
 	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 531cef097d..7f3d323e93 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1666,6 +1666,7 @@ test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
+test -n "$GIT_VALGRIND_ENABLED" && test_set_prereq VALGRIND
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.36.1-338-g1c7f76a54c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 2/4] log test: skip a failing mkstemp() test under valgrind
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 1/4] tests: using custom GIT_EXEC_PATH breaks --valgrind tests Junio C Hamano
@ 2022-05-12 22:32   ` Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 3/4] commit-graph.c: don't assume that stat() succeeds Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Junio C Hamano
  3 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:32 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Skip a test added in f1e3df31699 (t: increase test coverage of
signature verification output, 2020-03-04) when running under
valgrind. Due to valgrind's interception of mkstemp() this test will
fail with:

	+ pwd
	+ TMPDIR=[...]/t/trash directory.t4202-log/bogus git log --show-signature -n1 plain-fail
	==7696== VG_(mkstemp): failed to create temp file: [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_d545ddcf
	[... 10 more similar lines omitted ..]
	valgrind: Startup or configuration error:
	valgrind:    Can't create client cmdline file in [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_6e542d1d
	valgrind: Unable to start up properly.  Giving up.
	error: last command exited with $?=1

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t4202-log.sh | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index be07407f85..6e66352558 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1992,10 +1992,13 @@ test_expect_success GPG 'log --show-signature for merged tag with GPG failure' '
 	git tag -s -m signed_tag_msg signed_tag_fail &&
 	git checkout plain-fail &&
 	git merge --no-ff -m msg signed_tag_fail &&
-	TMPDIR="$(pwd)/bogus" git log --show-signature -n1 plain-fail >actual &&
-	grep "^merged tag" actual &&
-	grep "^No signature" actual &&
-	! grep "^gpg: Signature made" actual
+	if ! test_have_prereq VALGRIND
+	then
+		TMPDIR="$(pwd)/bogus" git log --show-signature -n1 plain-fail >actual &&
+		grep "^merged tag" actual &&
+		grep "^No signature" actual &&
+		! grep "^gpg: Signature made" actual
+	fi
 '
 
 test_expect_success GPGSM 'log --graph --show-signature for merged tag x509' '
-- 
2.36.1-338-g1c7f76a54c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 3/4] commit-graph.c: don't assume that stat() succeeds
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 1/4] tests: using custom GIT_EXEC_PATH breaks --valgrind tests Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 2/4] log test: skip a failing mkstemp() test under valgrind Junio C Hamano
@ 2022-05-12 22:32   ` Junio C Hamano
  2022-05-12 22:32   ` [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Junio C Hamano
  3 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:32 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Fix code added in 8d84097f965 (commit-graph: expire commit-graph
files, 2019-06-18) to check the return value of the stat() system
call. Not doing so caused us to use uninitialized memory in the "Bloom
generation is limited by --max-new-filters" test in
t4216-log-bloom.sh:

	+ rm -f trace.event
	+ pwd
	+ GIT_TRACE2_EVENT=[...]/t/trash directory.t4216-log-bloom/limits/trace.event git commit-graph write --reachable --split=replace --changed-paths --max-new-filters=2
	==24835== Syscall param utimensat(times[0].tv_sec) points to uninitialised byte(s)
	==24835==    at 0x499E65A: __utimensat64_helper (utimensat.c:34)
	==24835==    by 0x4999142: utime (utime.c:36)
	==24835==    by 0x552BE0: mark_commit_graphs (commit-graph.c:2213)
	==24835==    by 0x550822: write_commit_graph (commit-graph.c:2424)
	==24835==    by 0x54E3A0: write_commit_graph_reachable (commit-graph.c:1681)
	==24835==    by 0x4374BB: graph_write (commit-graph.c:269)
	==24835==    by 0x436F7D: cmd_commit_graph (commit-graph.c:326)
	==24835==    by 0x407B9A: run_builtin (git.c:465)
	==24835==    by 0x406651: handle_builtin (git.c:719)
	==24835==    by 0x407575: run_argv (git.c:786)
	==24835==    by 0x406410: cmd_main (git.c:917)
	==24835==    by 0x511F09: main (common-main.c:56)
	==24835==  Address 0x1ffeffde70 is on thread 1's stack
	==24835==  in frame #1, created by utime (utime.c:25)
	==24835==  Uninitialised value was created by a stack allocation
	==24835==    at 0x552B50: mark_commit_graphs (commit-graph.c:2201)
	==24835==
	[...]
	error: last command exited with $?=126
	not ok 137 - Bloom generation is limited by --max-new-filters

This would happen as we stat'd the non-existing
".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs()
to check the stat()'s return value, and while we're at it fix another
case added in the same commit to do the same.

The caller in expire_commit_graphs() would have been less likely to
run into this, as it's operating on files it just got from readdir(),
but it could still happen due to a race with e.g. a concurrent "rm
-rf" of the commit-graph files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit-graph.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 441b36016b..2b52818731 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2206,7 +2206,8 @@ static void mark_commit_graphs(struct write_commit_graph_context *ctx)
 		struct stat st;
 		struct utimbuf updated_time;
 
-		stat(ctx->commit_graph_filenames_before[i], &st);
+		if (stat(ctx->commit_graph_filenames_before[i], &st) < 0)
+			continue;
 
 		updated_time.actime = st.st_atime;
 		updated_time.modtime = now;
@@ -2247,7 +2248,8 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
 		strbuf_setlen(&path, dirnamelen);
 		strbuf_addstr(&path, de->d_name);
 
-		stat(path.buf, &st);
+		if (stat(path.buf, &st) < 0)
+			continue;
 
 		if (st.st_mtime > expire_time)
 			continue;
-- 
2.36.1-338-g1c7f76a54c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
                     ` (2 preceding siblings ...)
  2022-05-12 22:32   ` [PATCH v2 3/4] commit-graph.c: don't assume that stat() succeeds Junio C Hamano
@ 2022-05-12 22:32   ` Junio C Hamano
  2022-05-12 23:39     ` Junio C Hamano
  3 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:32 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Fix a regression in my 3b6a8db3b03 (object-file.c: use "enum" return
type for unpack_loose_header(), 2021-10-01) revealed both by running
the test suite with --valgrind, and with the amended "git fsck" test.

In practice this regression in v2.34.0 caused us to claim that we
couldn't parse the header, as opposed to not being able to unpack
it. Before the change in the C code the test_cmp added here would emit:

	-error: unable to unpack header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
	+error: unable to parse header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391

I.e. we'd proceed to call parse_loose_header() on the uninitialized
"hdr" value, and it would have been very unlikely for that
uninitialized memory to be a valid git object.

The other callers of unpack_loose_header() were already checking the
enum values exhaustively. See 3b6a8db3b03 and
5848fb11acd (object-file.c: return ULHR_TOO_LONG on "header too long",
2021-10-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 object-file.c       |  8 ++++++--
 t/t1006-cat-file.sh | 10 ++++++++--
 t/t1450-fsck.sh     | 13 +++++++++++--
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 5ffbf3d4fd..b5d1d12b68 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2623,8 +2623,12 @@ int read_loose_object(const char *path,
 		goto out;
 	}
 
-	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
-				NULL) < 0) {
+	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
+				    NULL)) {
+	case ULHR_OK:
+		break;
+	case ULHR_BAD:
+	case ULHR_TOO_LONG:
 		error(_("unable to unpack header of %s"), path);
 		goto out;
 	}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 1b85207694..dadf3b1458 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -681,7 +681,7 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
 
 		# Setup and create the empty blob and its path
 		empty_path=$(git rev-parse --git-path objects/$(test_oid_to_path "$EMPTY_BLOB")) &&
-		git hash-object -w --stdin </dev/null &&
+		empty_blob=$(git hash-object -w --stdin </dev/null) &&
 
 		# Create another blob and its path
 		echo other >other.blob &&
@@ -722,7 +722,13 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
 		# content out as-is. Try to make it zlib-invalid.
 		mv -f other.blob "$empty_path" &&
 		test_must_fail git fsck 2>err.fsck &&
-		grep "^error: inflate: data stream error (" err.fsck
+		cat >expect <<-EOF &&
+		error: inflate: data stream error (incorrect header check)
+		error: unable to unpack header of ./$empty_path
+		error: $empty_blob: object corrupt or missing: ./$empty_path
+		EOF
+		grep "^error: " err.fsck >actual &&
+		test_cmp expect actual
 	)
 '
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index de50c0ea01..ab7f31f1dc 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -774,10 +774,19 @@ test_expect_success 'fsck finds problems in duplicate loose objects' '
 		# no "-d" here, so we end up with duplicates
 		git repack &&
 		# now corrupt the loose copy
-		file=$(sha1_file "$(git rev-parse HEAD)") &&
+		oid="$(git rev-parse HEAD)" &&
+		file=$(sha1_file "$oid") &&
 		rm "$file" &&
 		echo broken >"$file" &&
-		test_must_fail git fsck
+		test_must_fail git fsck 2>err &&
+
+		cat >expect <<-EOF &&
+		error: inflate: data stream error (incorrect header check)
+		error: unable to unpack header of $file
+		error: $oid: object corrupt or missing: $file
+		EOF
+		grep "^error: " err >actual &&
+		test_cmp expect actual
 	)
 '
 
-- 
2.36.1-338-g1c7f76a54c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-05-12 22:32   ` [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Junio C Hamano
@ 2022-05-12 23:39     ` Junio C Hamano
  2022-05-16 14:59       ` Derrick Stolee
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2022-05-12 23:39 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

Junio C Hamano <gitster@pobox.com> writes:

> diff --git a/object-file.c b/object-file.c
> index 5ffbf3d4fd..b5d1d12b68 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2623,8 +2623,12 @@ int read_loose_object(const char *path,
>  		goto out;
>  	}
>  
> -	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> -				NULL) < 0) {
> +	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> +				    NULL)) {
> +	case ULHR_OK:
> +		break;
> +	case ULHR_BAD:
> +	case ULHR_TOO_LONG:
>  		error(_("unable to unpack header of %s"), path);
>  		goto out;
>  	}

Regarding this hunk, since we only care about a single "did we get
any error, or did we unpack OK" bit, I think this should be more
like

	if (unpack_loose_header(...) != ULHR_OK) {
		error(_("unable to..."), path);
		goto out;
	}

It is true, as Ævar mentioned, that there is another place in the
same file that uses switch() in loose_object_info(), and it should
remain to be switch() on the returned enum because it wants to
behave differnetly depending on the kind of error it gets.  But that
is not a reason to make this part that only cares about a single
"did it fail?" into a switch and force future developers to add a
useless case arm.

I left it there as posted in the previous round because I was too
lazy ;-) and also it is something we can clean up with a follow up
patch outside the series.  As my today's focus has been to reduce
the number of topics waiting for a reroll, I'd rather leave things
that are not outright broken but needs clean up as they are for the
sake of expediency.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03
  2022-05-12 23:39     ` Junio C Hamano
@ 2022-05-16 14:59       ` Derrick Stolee
  2022-05-19 20:09         ` [RFC PATCH 0/2] Alternate ab/valgrind-fixes fix-up Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 20+ messages in thread
From: Derrick Stolee @ 2022-05-16 14:59 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Ævar Arnfjörð Bjarmason

On 5/12/2022 7:39 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> diff --git a/object-file.c b/object-file.c
>> index 5ffbf3d4fd..b5d1d12b68 100644
>> --- a/object-file.c
>> +++ b/object-file.c
>> @@ -2623,8 +2623,12 @@ int read_loose_object(const char *path,
>>  		goto out;
>>  	}
>>  
>> -	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
>> -				NULL) < 0) {
>> +	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
>> +				    NULL)) {
>> +	case ULHR_OK:
>> +		break;
>> +	case ULHR_BAD:
>> +	case ULHR_TOO_LONG:
>>  		error(_("unable to unpack header of %s"), path);
>>  		goto out;
>>  	}
> 
> Regarding this hunk, since we only care about a single "did we get
> any error, or did we unpack OK" bit, I think this should be more
> like
> 
> 	if (unpack_loose_header(...) != ULHR_OK) {
> 		error(_("unable to..."), path);
> 		goto out;
> 	}
> 
> It is true, as Ævar mentioned, that there is another place in the
> same file that uses switch() in loose_object_info(), and it should
> remain to be switch() on the returned enum because it wants to
> behave differnetly depending on the kind of error it gets.  But that
> is not a reason to make this part that only cares about a single
> "did it fail?" into a switch and force future developers to add a
> useless case arm.
> 
> I left it there as posted in the previous round because I was too
> lazy ;-) and also it is something we can clean up with a follow up
> patch outside the series.  As my today's focus has been to reduce
> the number of topics waiting for a reroll, I'd rather leave things
> that are not outright broken but needs clean up as they are for the
> sake of expediency.

Taking a look at your new version, I agree that this use of 'switch'
is out of place and can make things more confusing in the future.

Here is a patch doing exactly what you recommended, which you can
choose to add or squash. I made you co-author, but I expect you to
add your sign-off after mine.

-- >8 --

From 85cd37b4f23e06980ea95311067d735144fe932f Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Mon, 16 May 2022 10:53:27 -0400
Subject: [PATCH] object-file: convert 'switch' back to 'if'

This switch statement was recently added to make it clear that
unpack_loose_header() returns an enum value, not an int. This adds
complications for future developers if that enum gains new values, since
that developer would need to add a case statement to this switch for
little real value.

Instead, we can revert back to an 'if' statement, but make the enum
explicit by using "!= ULHR_OK" instead of assuming it has the numerical
value zero.

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---

 object-file.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index b5d1d12b68a..52e4ae1b5f0 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2623,12 +2623,8 @@ int read_loose_object(const char *path,
 		goto out;
 	}
 
-	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
-				    NULL)) {
-	case ULHR_OK:
-		break;
-	case ULHR_BAD:
-	case ULHR_TOO_LONG:
+	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
+				NULL) != ULHR_OK) {
 		error(_("unable to unpack header of %s"), path);
 		goto out;
 	}
-- 
2.35.3.vfs.0.0



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC PATCH 0/2] Alternate ab/valgrind-fixes fix-up
  2022-05-16 14:59       ` Derrick Stolee
@ 2022-05-19 20:09         ` Ævar Arnfjörð Bjarmason
  2022-05-19 20:09           ` [RFC PATCH 1/2] object-file API: fix obscure unpack_loose_header() return Ævar Arnfjörð Bjarmason
  2022-05-19 20:09           ` [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-19 20:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Ævar Arnfjörð Bjarmason

On Mon, May 16 2022, Derrick Stolee wrote:

> On 5/12/2022 7:39 PM, Junio C Hamano wrote:
> [...]
> This switch statement was recently added to make it clear that
> unpack_loose_header() returns an enum value, not an int. This adds
> complications for future developers if that enum gains new values, since
> that developer would need to add a case statement to this switch for
> little real value.
>
> Instead, we can revert back to an 'if' statement, but make the enum
> explicit by using "!= ULHR_OK" instead of assuming it has the numerical
> value zero.
>
> Co-authored-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>
>  object-file.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index b5d1d12b68a..52e4ae1b5f0 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -2623,12 +2623,8 @@ int read_loose_object(const char *path,
>  		goto out;
>  	}
>  
> -	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> -				    NULL)) {
> -	case ULHR_OK:
> -		break;
> -	case ULHR_BAD:
> -	case ULHR_TOO_LONG:
> +	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
> +				NULL) != ULHR_OK) {
>  		error(_("unable to unpack header of %s"), path);
>  		goto out;
>  	}

This whole topic-at-large is a stylistic fix-up for a case where I
obviously got it wrong, so take this with a double grain of salt.

Re the "What's Cooking" mention of
ds/object-file-unpack-loose-header-fix: I don't mind it being merged
down at all. The below is all small potatoes.

I don't think the rationale ("adds complications for future
developers") makes sense in this case.

Let's leave aside the question of whether we exhaustively check enum
arms as in the pre-image, or check "not ok" as in the post-image.

Surely we can agree that whatever pattern is preferred we're better
off consistently picking one or the other?

I think this proposed change would make more sense and be in line with
its commit message if it also proposed this:
	
	diff --git a/streaming.c b/streaming.c
	index fe54665d86e..bb4ed198463 100644
	--- a/streaming.c
	+++ b/streaming.c
	@@ -230,15 +230,10 @@ static int open_istream_loose(struct git_istream *st, struct repository *r,
	 	st->u.loose.mapped = map_loose_object(r, oid, &st->u.loose.mapsize);
	 	if (!st->u.loose.mapped)
	 		return -1;
	-	switch (unpack_loose_header(&st->z, st->u.loose.mapped,
	-				    st->u.loose.mapsize, st->u.loose.hdr,
	-				    sizeof(st->u.loose.hdr), NULL)) {
	-	case ULHR_OK:
	-		break;
	-	case ULHR_BAD:
	-	case ULHR_TOO_LONG:
	+	if (unpack_loose_header(&st->z, st->u.loose.mapped,
	+				st->u.loose.mapsize, st->u.loose.hdr,
	+				sizeof(st->u.loose.hdr), NULL) != ULHR_OK)
	 		goto error;
	-	}
	 	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || *type < 0)
	 		goto error;

I.e. now we've converted the 2/3 callers of the API that only cared
about "not OK", there's a third one that cares about all the enum arms
currently, so that one remains a "switch".

The reason I think the rationale doesn't make sense is because of this
inconsistency. I.e. if we suppose a developer adds another enum value,
they'll then discover those three callers.

Surely whatever our preference for how to handle those 2/3 callers
it's less complicated if they don't use different patterns for no
obvious reason.

But anyway. Looking a bit deeper at this code again I think these two
patches are where we'd eventually want to head with this API. I.e. I
think the whole business of making this a tri-state return was
premature on my part.

After this RFC unpack_loose_header() is again a function that returns
a negative value on error, and the enum is gone. As noted in 2/2
there's a slight trade-off there, which I think is for the better,
both in terms of API simplicity, and in the new "error" output we'll
omit in these obscure cases. I.e.:

	-				error: header for $bogus_long_sha1 too long, exceeds 32 bytes
	+				error: header too long, exceeds 32 bytes
	+				error: unable to unpack $bogus_long_sha1 header

This whole "switch" complexity was because the old error message
wanted to note the OID in the "header too long" message.

Again, I'm perfectly fine with ds/object-file-unpack-loose-header-fix
advancing to "next", I can rebase this on top, or drop it depending on
the consensus about whether it's worthwile. I did want to un-block
that topic one way or the other, so to the extent that it was waiting
on my feedback...

Ævar Arnfjörð Bjarmason (2):
  object-file API: fix obscure unpack_loose_header() return
  object-file API: have unpack_loose_header() return "int" again

 cache.h             | 25 +++++-------------------
 object-file.c       | 46 +++++++++++++++++----------------------------
 streaming.c         | 11 +++--------
 t/t1006-cat-file.sh |  6 ++++--
 4 files changed, 29 insertions(+), 59 deletions(-)

-- 
2.36.1.957.g2c13267e09b


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC PATCH 1/2] object-file API: fix obscure unpack_loose_header() return
  2022-05-19 20:09         ` [RFC PATCH 0/2] Alternate ab/valgrind-fixes fix-up Ævar Arnfjörð Bjarmason
@ 2022-05-19 20:09           ` Ævar Arnfjörð Bjarmason
  2022-05-19 20:09           ` [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-19 20:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Ævar Arnfjörð Bjarmason

Fix an obscure unpack_loose_header() return value. In my
5848fb11acd (object-file.c: return ULHR_TOO_LONG on "header too long",
2021-10-01) this API learned to return ULHR_TOO_LONG, but should not
have done so in the case of parsing a long header where the
terminating \0 cannot be found. We should return a ULHR_BAD in that
case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h       | 3 +--
 object-file.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 595582becc8..aa24d5a609f 100644
--- a/cache.h
+++ b/cache.h
@@ -1342,8 +1342,7 @@ int git_open_cloexec(const char *name, int flags);
  * "hdrbuf" argument is non-NULL. This is intended for use with
  * OBJECT_INFO_ALLOW_UNKNOWN_TYPE to extract the bad type for (error)
  * reporting. The full header will be extracted to "hdrbuf" for use
- * with parse_loose_header(), ULHR_TOO_LONG will still be returned
- * from this function to indicate that the header was too long.
+ * with parse_loose_header().
  */
 enum unpack_loose_header_result {
 	ULHR_OK,
diff --git a/object-file.c b/object-file.c
index b5d1d12b68a..1babe9791f6 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1301,7 +1301,7 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 		stream->next_out = buffer;
 		stream->avail_out = bufsiz;
 	} while (status != Z_STREAM_END);
-	return ULHR_TOO_LONG;
+	return ULHR_BAD;
 }
 
 static void *unpack_loose_rest(git_zstream *stream,
-- 
2.36.1.957.g2c13267e09b


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again
  2022-05-19 20:09         ` [RFC PATCH 0/2] Alternate ab/valgrind-fixes fix-up Ævar Arnfjörð Bjarmason
  2022-05-19 20:09           ` [RFC PATCH 1/2] object-file API: fix obscure unpack_loose_header() return Ævar Arnfjörð Bjarmason
@ 2022-05-19 20:09           ` Ævar Arnfjörð Bjarmason
  2022-05-20  4:27             ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-19 20:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Ævar Arnfjörð Bjarmason

Revert the API change in my 5848fb11acd (object-file.c: return
ULHR_TOO_LONG on "header too long", 2021-10-01) in favor of having
unpack_loose_header() return a simple negative value on error.

The more complex API was needed to give us flexibility that we didn't
really need. We only used the ULHR_TOO_LONG return case for headers
which "cat-file --allow-unknown-type" is needed for. Let's instead
error() on those unconditionally.

The slight change in the "error" message is that we won't include the
OID in the error in that case, but we will include it in the "unable
to unpack" error emitted by loose_object_info().

As noted in the preceding commit we were mishandling the case where
we'll now error() with "could not find end of corrupt long header".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h             | 22 ++++------------------
 object-file.c       | 46 +++++++++++++++++----------------------------
 streaming.c         | 11 +++--------
 t/t1006-cat-file.sh |  6 ++++--
 4 files changed, 28 insertions(+), 57 deletions(-)

diff --git a/cache.h b/cache.h
index aa24d5a609f..629c4a84b90 100644
--- a/cache.h
+++ b/cache.h
@@ -1330,13 +1330,7 @@ int git_open_cloexec(const char *name, int flags);
 
 /**
  * unpack_loose_header() initializes the data stream needed to unpack
- * a loose object header.
- *
- * Returns:
- *
- * - ULHR_OK on success
- * - ULHR_BAD on error
- * - ULHR_TOO_LONG if the header was too long
+ * a loose object header. A negative value indicates an error.
  *
  * It will only parse up to MAX_HEADER_LEN bytes unless an optional
  * "hdrbuf" argument is non-NULL. This is intended for use with
@@ -1344,17 +1338,9 @@ int git_open_cloexec(const char *name, int flags);
  * reporting. The full header will be extracted to "hdrbuf" for use
  * with parse_loose_header().
  */
-enum unpack_loose_header_result {
-	ULHR_OK,
-	ULHR_BAD,
-	ULHR_TOO_LONG,
-};
-enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
-						    unsigned char *map,
-						    unsigned long mapsize,
-						    void *buffer,
-						    unsigned long bufsiz,
-						    struct strbuf *hdrbuf);
+int unpack_loose_header(git_zstream *stream, unsigned char *map,
+			unsigned long mapsize, void *buffer,
+			unsigned long bufsiz, struct strbuf *header);
 
 /**
  * parse_loose_header() parses the starting "<type> <len>\0" of an
diff --git a/object-file.c b/object-file.c
index 1babe9791f6..1d2901d0e78 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1245,12 +1245,9 @@ void *map_loose_object(struct repository *r,
 	return map_loose_object_1(r, NULL, oid, size);
 }
 
-enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
-						    unsigned char *map,
-						    unsigned long mapsize,
-						    void *buffer,
-						    unsigned long bufsiz,
-						    struct strbuf *header)
+int unpack_loose_header(git_zstream *stream, unsigned char *map,
+			unsigned long mapsize, void *buffer,
+			unsigned long bufsiz, struct strbuf *header)
 {
 	int status;
 
@@ -1266,13 +1263,13 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 	status = git_inflate(stream, 0);
 	obj_read_lock();
 	if (status < Z_OK)
-		return ULHR_BAD;
+		return -1;
 
 	/*
 	 * Check if entire header is unpacked in the first iteration.
 	 */
 	if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
-		return ULHR_OK;
+		return 0;
 
 	/*
 	 * We have a header longer than MAX_HEADER_LEN. The "header"
@@ -1280,7 +1277,8 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 	 * --allow-unknown-type".
 	 */
 	if (!header)
-		return ULHR_TOO_LONG;
+		return error(_("header too long, exceeds %d bytes"),
+			     MAX_HEADER_LEN);
 
 	/*
 	 * buffer[0..bufsiz] was not large enough.  Copy the partial
@@ -1301,7 +1299,7 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
 		stream->next_out = buffer;
 		stream->avail_out = bufsiz;
 	} while (status != Z_STREAM_END);
-	return ULHR_BAD;
+	return error(_("could not find end of corrupt long header"));
 }
 
 static void *unpack_loose_rest(git_zstream *stream,
@@ -1466,32 +1464,26 @@ static int loose_object_info(struct repository *r,
 	if (oi->disk_sizep)
 		*oi->disk_sizep = mapsize;
 
-	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
-				    allow_unknown ? &hdrbuf : NULL)) {
-	case ULHR_OK:
+	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
+				allow_unknown ? &hdrbuf : NULL) < 0) {
+		status = error(_("unable to unpack %s header"),
+			       oid_to_hex(oid));
+	} else {
 		if (parse_loose_header(hdrbuf.len ? hdrbuf.buf : hdr, oi) < 0)
 			status = error(_("unable to parse %s header"), oid_to_hex(oid));
 		else if (!allow_unknown && *oi->typep < 0)
 			die(_("invalid object type"));
 
 		if (!oi->contentp)
-			break;
+			goto inflate_end;
 		*oi->contentp = unpack_loose_rest(&stream, hdr, *oi->sizep, oid);
 		if (*oi->contentp)
 			goto cleanup;
 
 		status = -1;
-		break;
-	case ULHR_BAD:
-		status = error(_("unable to unpack %s header"),
-			       oid_to_hex(oid));
-		break;
-	case ULHR_TOO_LONG:
-		status = error(_("header for %s too long, exceeds %d bytes"),
-			       oid_to_hex(oid), MAX_HEADER_LEN);
-		break;
 	}
 
+inflate_end:
 	git_inflate_end(&stream);
 cleanup:
 	munmap(map, mapsize);
@@ -2623,12 +2615,8 @@ int read_loose_object(const char *path,
 		goto out;
 	}
 
-	switch (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
-				    NULL)) {
-	case ULHR_OK:
-		break;
-	case ULHR_BAD:
-	case ULHR_TOO_LONG:
+	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
+				NULL) < 0) {
 		error(_("unable to unpack header of %s"), path);
 		goto out;
 	}
diff --git a/streaming.c b/streaming.c
index fe54665d86e..4456a60348b 100644
--- a/streaming.c
+++ b/streaming.c
@@ -230,15 +230,10 @@ static int open_istream_loose(struct git_istream *st, struct repository *r,
 	st->u.loose.mapped = map_loose_object(r, oid, &st->u.loose.mapsize);
 	if (!st->u.loose.mapped)
 		return -1;
-	switch (unpack_loose_header(&st->z, st->u.loose.mapped,
-				    st->u.loose.mapsize, st->u.loose.hdr,
-				    sizeof(st->u.loose.hdr), NULL)) {
-	case ULHR_OK:
-		break;
-	case ULHR_BAD:
-	case ULHR_TOO_LONG:
+	if (unpack_loose_header(&st->z, st->u.loose.mapped,
+				st->u.loose.mapsize, st->u.loose.hdr,
+				sizeof(st->u.loose.hdr), NULL) < 0)
 		goto error;
-	}
 	if (parse_loose_header(st->u.loose.hdr, &oi) < 0 || *type < 0)
 		goto error;
 
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index dadf3b14583..d742697d3bf 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -536,12 +536,14 @@ do
 			if test "$arg2" = "-p"
 			then
 				cat >expect <<-EOF
-				error: header for $bogus_long_sha1 too long, exceeds 32 bytes
+				error: header too long, exceeds 32 bytes
+				error: unable to unpack $bogus_long_sha1 header
 				fatal: Not a valid object name $bogus_long_sha1
 				EOF
 			else
 				cat >expect <<-EOF
-				error: header for $bogus_long_sha1 too long, exceeds 32 bytes
+				error: header too long, exceeds 32 bytes
+				error: unable to unpack $bogus_long_sha1 header
 				fatal: git cat-file: could not get object info
 				EOF
 			fi &&
-- 
2.36.1.957.g2c13267e09b


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again
  2022-05-19 20:09           ` [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again Ævar Arnfjörð Bjarmason
@ 2022-05-20  4:27             ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-05-20  4:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Derrick Stolee

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +int unpack_loose_header(git_zstream *stream, unsigned char *map,
> +			unsigned long mapsize, void *buffer,
> +			unsigned long bufsiz, struct strbuf *header);

Simpler is better as long as we don't make it too simple ;-)

>  	if (!header)
> -		return ULHR_TOO_LONG;
> +		return error(_("header too long, exceeds %d bytes"),
> +			     MAX_HEADER_LEN);

OK.

> -	return ULHR_BAD;
> +	return error(_("could not find end of corrupt long header"));
>  }

OK.

> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index dadf3b14583..d742697d3bf 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -536,12 +536,14 @@ do
>  			if test "$arg2" = "-p"
>  			then
>  				cat >expect <<-EOF
> -				error: header for $bogus_long_sha1 too long, exceeds 32 bytes
> +				error: header too long, exceeds 32 bytes
> +				error: unable to unpack $bogus_long_sha1 header
>  				fatal: Not a valid object name $bogus_long_sha1
>  				EOF
>  			else
>  				cat >expect <<-EOF
> -				error: header for $bogus_long_sha1 too long, exceeds 32 bytes
> +				error: header too long, exceeds 32 bytes
> +				error: unable to unpack $bogus_long_sha1 header
>  				fatal: git cat-file: could not get object info
>  				EOF
>  			fi &&

Looking good.


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-05-20  4:27 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-21 20:14 [PATCH 0/4] Fix issues and a regression noted by valgrind Ævar Arnfjörð Bjarmason
2022-04-21 20:14 ` [PATCH 1/4] tests: make RUNTIME_PREFIX compatible with --valgrind Ævar Arnfjörð Bjarmason
2022-04-21 22:22   ` Junio C Hamano
2022-04-21 20:14 ` [PATCH 2/4] log test: skip a failing mkstemp() test under valgrind Ævar Arnfjörð Bjarmason
2022-04-21 20:14 ` [PATCH 3/4] commit-graph.c: don't assume that stat() succeeds Ævar Arnfjörð Bjarmason
2022-04-21 22:29   ` Junio C Hamano
2022-04-21 20:14 ` [PATCH 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Ævar Arnfjörð Bjarmason
2022-04-21 22:39   ` Junio C Hamano
2022-04-22  8:21     ` Ævar Arnfjörð Bjarmason
2022-05-12 22:32 ` [PATCH v2 0/4] test fixes around valgrind Junio C Hamano
2022-05-12 22:32   ` [PATCH v2 1/4] tests: using custom GIT_EXEC_PATH breaks --valgrind tests Junio C Hamano
2022-05-12 22:32   ` [PATCH v2 2/4] log test: skip a failing mkstemp() test under valgrind Junio C Hamano
2022-05-12 22:32   ` [PATCH v2 3/4] commit-graph.c: don't assume that stat() succeeds Junio C Hamano
2022-05-12 22:32   ` [PATCH v2 4/4] object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 Junio C Hamano
2022-05-12 23:39     ` Junio C Hamano
2022-05-16 14:59       ` Derrick Stolee
2022-05-19 20:09         ` [RFC PATCH 0/2] Alternate ab/valgrind-fixes fix-up Ævar Arnfjörð Bjarmason
2022-05-19 20:09           ` [RFC PATCH 1/2] object-file API: fix obscure unpack_loose_header() return Ævar Arnfjörð Bjarmason
2022-05-19 20:09           ` [RFC PATCH 2/2] object-file API: have unpack_loose_header() return "int" again Ævar Arnfjörð Bjarmason
2022-05-20  4:27             ` Junio C Hamano

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).