git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	Taylor Blau <me@ttaylorr.com>,
	Kousik Sanagavarapu <five231003@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] parse_object(): check on-disk type of suspected blob
Date: Fri, 18 Nov 2022 01:36:23 +0100	[thread overview]
Message-ID: <221118.86cz9lgjxu.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <Y3a4jKzsHSooYFqj@coredump.intra.peff.net>


On Thu, Nov 17 2022, Jeff King wrote:

> I reworked the conditional a bit so that instead of:
>
>   if ((suspected_blob && oid_object_info() == OBJ_BLOB)
>       (no_clue && oid_object_info() == OBJ_BLOB)
>
> we have the simpler:
>
>   if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)
> [...]
> @@ -286,8 +286,8 @@ struct object *parse_object_with_flags(struct repository *r,
>  			return &commit->object;
>  	}
>  
> -	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
> -	    (!obj && oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
> +	if ((!obj || (obj && obj->type == OBJ_BLOB)) &&
> +	    oid_object_info(r, oid, NULL) == OBJ_BLOB) {
>  		if (!skip_hash && stream_object_signature(r, repl) < 0) {
>  			error(_("hash mismatch %s"), oid_to_hex(oid));
>  			return NULL;

But why:

	if ((!x || (x && x->m)) && ...)

Instead of:

	if ((!x || x->m)) && ...)

If "!obj" is false then "obj" must be non-NULL, so you don't need to
check it again and can lose the "obj &&".

> [...]
> So this fixes one of the lingering expect_failure cases from 0616617c7e
> (t: introduce tests for unexpected object types, 2019-04-09).  That test
> works by peeling a tag that claims to point to a blob (triggering us to
> create the struct), but really points to something else, which we later
> discover when we call parse_object() as part of the actual traversal).
> Prior to this commit, we'd quietly check the sha1 and mark the blob as
> "parsed". Now we correctly complain about the mismatch.

I applied this on top of "master", and adjusted your test to be this
instead:

	test_expect_success 'traverse unexpected non-blob tag (lone)' '
		cat >expect <<-EOF &&
		error: object $commit is a blob, not a commit
		fatal: bad object $commit
		EOF
		test_must_fail git rev-list --objects $tag >out 2>actual &&
		test_must_be_empty out &&
		test_cmp expect actual
	'

Which passes, showing that we're still not correctly identifying it, but
we are doing it for the purposes of erroring out, but the incorrect type
persists.

Now, this all does seem quite familiar... :) :
https://lore.kernel.org/git/patch-10.11-a84f670ac24-20210328T021238Z-avarab@gmail.com/

I.e. that's the rest of the fix for this issue. I applied this change on
my local branch with that, and they combine nicely. the "test_must_fail"
here works as intended, *and* we'll correctly report & store the type.

> As an aside, I found the "this test is marked as success but testing the
> wrong thing" pattern here confusing to deal with (since I had to dig in
> history to understand what was going on and what the test was _supposed_
> to say).
>
> It comes from cf10c5b4cf (rev-list tests: don't hide abort() in
> "test_expect_failure", 2022-03-07). I'm skeptical that it was worth
> switching those tests for leak detection purposes.

It's not just for leak detection purposes that it was a good idea to
switch it away from test_expect_failure, but also that we've been
ensuring that this didn't turn into a segfault all this time by not
using "test_expect_failure".

> But more importantly, it looks like pw/test-todo would provide us with a
> much nicer pattern there. It seems to be stalled on review, so let's see
> if we can get that moving again.

The "TODO (should fail!)" didn't stand out? But yeah, having a "todo" or
"test_expect_todo" or "test_expect_failure" not suck would be nice.

FWIW I think
https://lore.kernel.org/git/221006.86v8owr986.gmgdl@evledraar.gmail.com/
outlines a good way forward for it that I think should make everyone
happy.

> diff --git a/t/t6102-rev-list-unexpected-objects.sh b/t/t6102-rev-list-unexpected-objects.sh
> index 4a9a4436e2..9350b5fd2c 100755
> --- a/t/t6102-rev-list-unexpected-objects.sh
> +++ b/t/t6102-rev-list-unexpected-objects.sh
> @@ -121,8 +121,8 @@ test_expect_success 'setup unexpected non-blob tag' '
>  	tag=$(git hash-object -w --literally -t tag broken-tag)
>  '
>  
> -test_expect_success 'TODO (should fail!): traverse unexpected non-blob tag (lone)' '
> -	git rev-list --objects $tag
> +test_expect_success 'traverse unexpected non-blob tag (lone)' '
> +	test_must_fail git rev-list --objects $tag
>  '

I think you'll either want the test_cmp I noted above, or to do that in
a subsequent test_expect_failure.

I know that your stance is that you prefer not to "test the bad
behavior" as it were. Personally I thought an all-caps TODO comment
might make it less confusing, but anyway.

In this case your commit message claims you're happy with the end
result, so I think you'd want to test what we actually emit on stderr,
as it's quite ... unintuative.

Or, which I think probably makes more sense, add that as a subsequent
test_expect_failure or whatever. FWIW this somewhat un-idiomatic pattern
will get around the current caveats with it:

	test_expect_success .... '
		... >actual.non-blob-tag
	'
	test_lazy_prereq HAVE_NON_BLOB_TAG 'test -e actual.non-blob-tag'
	test_expect_failure HAVE_NON_BLOB_TAG '...' '
		cat >expect.non-blob-tag <<-\EOF &&
		...
		EOF
		test_cmp expect.non-blob-tag actual.non-blob-tag
	'

I.e. peel off the 'test_cmp" that should have a known-good state from
the already-good status code.

  reply	other threads:[~2022-11-18  0:55 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-16 16:39 [RFC][PATCH] object.c: use has_object() instead of repo_has_object_file() Kousik Sanagavarapu
2022-11-16 18:20 ` Jeff King
2022-11-16 21:14   ` Jonathan Tan
2022-11-17 22:37     ` [PATCH 0/2] fixing parse_object() check for type mismatch Jeff King
2022-11-17 22:37       ` [PATCH 1/2] parse_object(): drop extra "has" check before checking object type Jeff King
2022-11-17 22:41       ` [PATCH 2/2] parse_object(): check on-disk type of suspected blob Jeff King
2022-11-18  0:36         ` Ævar Arnfjörð Bjarmason [this message]
2022-11-21 19:21           ` Jeff King
2022-11-18 11:46       ` [PATCH 0/4] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2022-11-18 11:46         ` [PATCH 1/4] object-file.c: free the "t.tag" in check_tag() Ævar Arnfjörð Bjarmason
2022-11-18 11:46         ` [PATCH 2/4] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2022-11-18 11:46         ` [PATCH 3/4] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2022-11-18 11:46         ` [PATCH 4/4] tag: don't emit potentially incorrect "object is a X, not a Y" Ævar Arnfjörð Bjarmason
2022-12-30  1:52         ` [PATCH v2 0/3] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2022-12-30  1:52           ` [PATCH v2 1/3] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2022-12-30  4:25             ` Junio C Hamano
2022-12-30  1:52           ` [PATCH v2 2/3] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2022-12-30  6:07             ` Junio C Hamano
2022-12-30  1:52           ` [PATCH v2 3/3] tag: don't emit potentially incorrect "object is a X, not a Y" Ævar Arnfjörð Bjarmason
2022-11-18 19:05       ` [PATCH 0/2] fixing parse_object() check for type mismatch Taylor Blau
2022-11-21 19:26         ` Jeff King
2022-11-22  0:05           ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=221118.86cz9lgjxu.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=five231003@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).