From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
Elijah Newren <newren@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors
Date: Wed, 31 Mar 2021 22:46:22 +0200
Message-ID: <87eefvkq5d.fsf@evledraar.gmail.com>
In-Reply-To: <YGTGgFI19fS7Uv6I@coredump.intra.peff.net>
On Wed, Mar 31 2021, Jeff King wrote:
> On Wed, Mar 31, 2021 at 08:31:16PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > Ævar's patch tries to improve the case where we might _know_ which is
>> > correct (because we're actually parsing the object contents), but of
>> > course it covers only a fraction of cases. I'm not really opposed to
>> > that per se, but I probably wouldn't bother myself.
>>
>> What fraction of cases? As far as I can tell it covers all cases where
>> we get this error.
>>
>> If there is a case like what you're describing I haven't found it.
>
> It would happen any time somebody calls lookup_foo() because they saw an
> object referenced, but _doesn't_ parse it. And then somebody later calls
> lookup_bar() in the same way. Neither of them consulted the actual
> object database.
>
> Try this with your patches:
>
> -- >8 --
> git init repo
> cd repo
>
> # just for making things deterministic
> export GIT_COMMITTER_NAME='A U Thor'
> export GIT_COMMITTER_EMAIL='author@example.com'
> export GIT_COMMITTER_DATE='@1234567890 +0000'
>
> blob=$(echo foo | git hash-object -w --stdin)
> git tag -m 'tag of blob' tag-of-blob $blob
> git update-ref refs/tags/tag-of-commit $(
> git cat-file tag tag-of-blob |
> sed s/blob/commit/g |
> git hash-object -w --stdin -t tag
> )
> git update-ref refs/tags/tag-of-tree $(
> git cat-file tag tag-of-blob |
> sed s/blob/tree/g |
> git hash-object -w --stdin -t tag
> )
>
> git fsck
> -- >8 --
>
> That fsck produces (257cc5642 is the blob):
>
> error: object 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 is a blob, not a commit
> error: 257cc5642cb1a054f08cc83f2d943e56fd3ebe99: object could not be parsed: .git/objects/25/7cc5642cb1a054f08cc83f2d943e56fd3ebe99
> error: object 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 is a commit, not a tree
> error: bad tag pointer to 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 in aaff0d42df150e1a734f6a8516878b2ea315ee0a
> error: aaff0d42df150e1a734f6a8516878b2ea315ee0a: object could not be parsed: .git/objects/aa/ff0d42df150e1a734f6a8516878b2ea315ee0a
> error: object 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 is a commit, not a blob
> error: bad tag pointer to 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 in bbd2b7077cd91ee6175cdc0e4c477c25c230cdc7
> error: bbd2b7077cd91ee6175cdc0e4c477c25c230cdc7: object could not be parsed: .git/objects/bb/d2b7077cd91ee6175cdc0e4c477c25c230cdc7
>
> So we claim "is X, not Y" in multiple directions for the same object.
>
> It might just be that there are spots in the fsck code that need to be
> adjusted to use your new function (if they are indeed parsing the
> referred-to object). But there are lots of places that don't actually
> parse the object at the moment they're parsing the tag. E.g.:
>
> $ git for-each-ref --format='%(*objectname)'
> error: object 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 is a commit, not a tree
> error: bad tag pointer to 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 in aaff0d42df150e1a734f6a8516878b2ea315ee0a
> Segmentation fault
>
> Neither of those types is the correct one. And the segfault is just a
> bonus! :)
>
> I'd expect similar cases with parsing commit parents and tree pointers.
> And probably tree entries whose modes are wrong.
The segfault happens without my patches too. What changes is that before
we'd always get it wrong and say "commit, not a tree", whereas now we
get it right some of the time. Patching the relevant object.c code to
emit different messages from the various functions shows that it's the
oid_is_type*() functions that get it right, while object_as_type() is
as wrong as before.
So that's certainly something I missed.
But are there any cases where it makes things worse? Or is it just that
it's not a full fix in all cases, but only a partial one?
>> I.e. it happens when we have an un-parsed "struct object" whose type is
>> inferred, and parse it to find out it's not what we expected.
>>
>> It's not ambiguous at all what the object actually is. It's just that
>> the previous code was leaking the *assumption* about the type at the
>> time of emitting the error, due to an apparent oversight with parsed
>> vs. non-parsed.
>>
>> Or in other words, we're leaking the implementation detail that we
>> pre-allocated an object struct of a given type in anticipation of
>> holding a parsed version of that object soon.
>
> Right. In the case that you are indeed parsing the object later, you can
> say definitively "it is X in the odb, but seen as Y previously". But we
> do not always hit the "is X, not Y" error when parsing the object. It
> might be caused by two of these "pre-allocations" (though really I think
> it is not just an implementation detail; the pre-allocation happened
> because some other object referred to us as a given type, so it really
> is a corruption in the repository. Just not in the object we mention).
Indeed, the goal is to emit a sensible message on-the-fly when we see
that corruption.
>> > @@ -169,10 +169,16 @@ void *object_as_type(struct object *obj, enum object_type type, int quiet)
>> > return obj;
>> > }
>> > else {
>> > - if (!quiet)
>> > - error(_("object %s is a %s, not a %s"),
>> > - oid_to_hex(&obj->oid),
>> > - type_name(obj->type), type_name(type));
>> > + if (!(flags & OBJECT_AS_TYPE_QUIET)) {
>> > + if (flags & OBJECT_AS_TYPE_EXPECT_PARSED)
>> > + error(_("object %s is a %s, but was referred to as a %s"),
>> > + oid_to_hex(&obj->oid), type_name(obj->type),
>> > + type_name(type));
>> > + else
>> > + error(_("object %s referred to as both a %s and a %s"),
>> > + oid_to_hex(&obj->oid),
>> > + type_name(obj->type), type_name(type));
>> > + }
>> > return NULL;
>> > }
>> > }
>>
>> Per the above I don't understand how you think there's any uncertainty
>> here.
>>
>> If I'm right and there isn't then first of all I don't see how we could
>> emit 1/2 of those errors. The whole problem here is that we don't know
>> the type of the un-parsed object (and presumably don't want to eagerly
>> know, it would mean hitting the object store).
>
> Forgetting for a moment how to trigger it with actual Git commands, the
> root of the problem is that:
>
> lookup_tree(&oid);
> lookup_blob(&oid);
>
> is going to produce an error message. But we cannot know which object
> type is wrong and which is right (if any). So we'd want to produce the
> "referred to as both" message.
>
> _If_ the caller happens to know that it has just parsed the object
> contents and got a tree, then it would call lookup_parsed_tree(&oid),
> which would pass along OBJECT_AS_TYPE_EXPECT_PARSED, and produce the
> other message.
>
> In practice, of course those two lookup_foo() calls are not right next
> to each other. But they may be triggered on an identical oid by two
> references from different objects.
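
A minimal sketch of that scenario, outside of git, in case it helps to
have something concrete to poke at. Everything named toy_* is
hypothetical and only stands in for the lookup_foo() /
object_as_type() logic; it is not the real object.c code. The two
message strings are the ones from the quoted diff. The first reference
to an id allocates an entry of the claimed type without reading
anything, and a later reference with a different type can only report
the conflict unless the entry was marked as parsed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; these are not git's real object.h definitions. */
enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB, TOY_TAG };

static const char *toy_type_name(enum toy_type t)
{
	static const char *names[] = { "commit", "tree", "blob", "tag" };
	return names[t];
}

struct toy_object {
	char id[41];        /* 40 hex digits plus NUL */
	enum toy_type type; /* type claimed by the first referrer */
	int parsed;         /* nonzero once the contents were actually read */
};

#define TOY_MAX 16
static struct toy_object toy_table[TOY_MAX];
static int toy_nr;

/*
 * Toy lookup_foo(): the first reference allocates an entry of the claimed
 * type without touching any object database. A later reference with a
 * different type fails and writes a message to "err".
 */
static struct toy_object *toy_lookup(const char *id, enum toy_type want,
				     char *err, size_t errlen)
{
	struct toy_object *o;
	int i;

	err[0] = '\0';
	for (i = 0; i < toy_nr; i++) {
		o = &toy_table[i];
		if (strcmp(o->id, id))
			continue;
		if (o->type == want)
			return o;
		if (o->parsed)
			/* we read the contents, so o->type is authoritative */
			snprintf(err, errlen,
				 "object %s is a %s, but was referred to as a %s",
				 id, toy_type_name(o->type), toy_type_name(want));
		else
			/* two unverified claims; we cannot say which is right */
			snprintf(err, errlen,
				 "object %s referred to as both a %s and a %s",
				 id, toy_type_name(o->type), toy_type_name(want));
		return NULL;
	}
	if (toy_nr == TOY_MAX) {
		snprintf(err, errlen, "toy object table is full");
		return NULL;
	}
	o = &toy_table[toy_nr++];
	snprintf(o->id, sizeof(o->id), "%s", id);
	o->type = want;
	o->parsed = 0;
	return o;
}
```

Calling toy_lookup() on the blob's id first with TOY_TREE and then with
TOY_BLOB returns NULL on the second call with the "referred to as both
a tree and a blob" message, even though the second caller is the one
with the right type.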
[...]
>> But when we do know why would we beat around the bush and say "was
>> referred to as X and Y" once we know what it is.
>>
>> AFAICT there's no more reason to think that parse_object_buffer() will
>> be wrong about the type than "git cat-file -t" will be. They both use
>> the same underlying functions to get that information.
>
> My point is that we are not always coming from parse_object_buffer()
> when we see these error messages.
If my solution of relying on the parsed vs. non-parsed distinction isn't
enough, shouldn't we just devolve to a full object info lookup when
emitting the error? It's more expensive, but we're emitting an error
anyway...
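
A rough sketch of what falling back to a full lookup at error time could
look like, as a standalone toy rather than a patch. The sk_* names and
the one-entry sk_odb table are made up for illustration;
sk_object_info() only plays the role that something like
oid_object_info() would play in git, and the third message wording is
an assumption of mine, not anything in the series.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical types for this sketch only, not git's enum object_type. */
enum sk_type { SK_MISSING = -1, SK_COMMIT, SK_TREE, SK_BLOB, SK_TAG };

static const char *sk_type_name(enum sk_type t)
{
	switch (t) {
	case SK_COMMIT: return "commit";
	case SK_TREE:   return "tree";
	case SK_BLOB:   return "blob";
	case SK_TAG:    return "tag";
	default:        return "unknown";
	}
}

/* Toy object database: the only place that knows the real type. */
struct sk_odb_entry {
	const char *id;
	enum sk_type type;
};

static const struct sk_odb_entry sk_odb[] = {
	{ "257cc5642cb1a054f08cc83f2d943e56fd3ebe99", SK_BLOB },
};

/* Stand-in for the expensive "ask the object store" call. */
static enum sk_type sk_object_info(const char *id)
{
	size_t i;

	for (i = 0; i < sizeof(sk_odb) / sizeof(sk_odb[0]); i++)
		if (!strcmp(sk_odb[i].id, id))
			return sk_odb[i].type;
	return SK_MISSING;
}

/*
 * Called only on the error path: "cached" is the type some earlier
 * referrer assumed, "want" is what the current caller expects.
 */
static void sk_describe_mismatch(const char *id, enum sk_type cached,
				 enum sk_type want, char *buf, size_t len)
{
	enum sk_type real = sk_object_info(id);

	if (real == SK_MISSING)
		/* nothing to check against; report the conflict only */
		snprintf(buf, len, "object %s referred to as both a %s and a %s",
			 id, sk_type_name(cached), sk_type_name(want));
	else if (real == cached)
		snprintf(buf, len, "object %s is a %s, but was referred to as a %s",
			 id, sk_type_name(real), sk_type_name(want));
	else if (real == want)
		snprintf(buf, len, "object %s is a %s, but was referred to as a %s",
			 id, sk_type_name(real), sk_type_name(cached));
	else
		/* both referrers were wrong, as in the fsck output above */
		snprintf(buf, len,
			 "object %s is a %s, but was referred to as both a %s and a %s",
			 id, sk_type_name(real), sk_type_name(cached),
			 sk_type_name(want));
}
```

With the blob from the fsck example, a cached type of commit and a
wanted type of tree would take the last branch and name all three
types, instead of picking one of the two wrong ones.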