From: Jeff King <peff@peff.net>
To: Bryan Turner <bturner@atlassian.com>
Cc: Git Users <git@vger.kernel.org>
Subject: Re: Unexpected cat-file --batch-check output
Date: Tue, 26 Oct 2021 21:28:12 -0400
Message-ID: <YXirLGXsKj01uNGv@coredump.intra.peff.net>
In-Reply-To: <CAGyf7-Gaphb9q=4cyT0BQa7oYGKXQQsU-XfqvoxfDyijehJO3Q@mail.gmail.com>
On Tue, Oct 26, 2021 at 04:58:49PM -0700, Bryan Turner wrote:
> > - what does "git rev-parse refs/heads/develop:path/to/parent/file"
> > say? If it comes up with 4c8d566ed80, then the problem is cat-file
> > specific. If not, then it's a problem in the name resolution
> > routines.
>
> $ /usr/bin/git rev-parse refs/heads/develop
> 28a05ce2e3079afcb32e4f1777b42971d7933a91
> $ /usr/bin/git rev-parse refs/heads/develop:path/to/parent/file
> cc10f4b278086325aab2f95df97c807c7c6cd75e
>
> So it looks like rev-parse and cat-file --batch-check both exhibit the
> same behavior.
OK, that's not too surprising, since they're using the same routines
under the hood. But that does imply that the problem is in the get_oid()
family, which is what's doing that name-to-oid lookup.
I don't recall us ever having a bug of this nature in the history of
Git, nor do I think this code would have changed recently. But of course
there's a first time for everything.
The parser there isn't exactly left-to-right, so perhaps this particular
name is stimulating some corner case. I imagine the answer is "no", or
you'd have said so already, but are there any unusual characters in the
filename path? Colons, curly braces, etc?
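
For what it's worth, a quick way to scan for such characters might look
like the sketch below. It assumes a POSIX shell with tr and grep;
suspicious_paths is a made-up helper name, and the character set is only
a guess at what could confuse the revision parser, not an exhaustive
list.

```shell
# Hypothetical helper (not part of Git): read NUL-separated paths on
# stdin and print those containing characters that carry special
# meaning in revision syntax (colon, braces, caret, tilde, at-sign).
suspicious_paths() {
    # Convert NUL separators to newlines so grep can work line by line.
    # A path with an embedded newline would be split in two; that is
    # accepted here as a limitation of the sketch.
    tr '\0' '\n' | grep -e '[{}:^~@]' || true
}

# Usage, run inside the affected repository:
#   git ls-tree -r -z --name-only refs/heads/develop | suspicious_paths
```

No output would mean none of the paths on that branch contain those
characters.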
> I also had them expand their cat-file --batch-check to include another
> file in the same "path/to/parent" directory:
> $ echo 'refs/heads/develop
> refs/heads/develop:path/to/parent/sibling
> refs/heads/develop:path/to/parent/file' | /usr/bin/git cat-file --batch-check
> 28a05ce2e3079afcb32e4f1777b42971d7933a91 commit 259
> 2bfe7b4b7c7cdeb9653801d99b65dfefe5780dda blob 897
> cc10f4b278086325aab2f95df97c807c7c6cd75e commit 330
>
> So the "sibling" file in the same directory comes out as a "blob", as expected.
Interesting. That again points to there being something funny either
with this filename, or perhaps with the tree that contains it.
> > - likewise, what does "git cat-file -t cc10f4b27808" say? I'd expect
> > it to really be a commit (a bug in batch-check's formatting routines
> > could show the wrong object, but I'd expect the oid to at least
> > match what ls-tree showed).
>
> $ /usr/bin/git cat-file -t cc10f4b278086325aab2f95df97c807c7c6cd75e
> commit
That's not too surprising. I did wonder if refs/replace or something
could be at work here, but I think in that case we'd still report the
expected oid. At any rate, we can probably rule that out as rev-parse is
returning the same unexpected oid, which means the problem is during the
name resolution (and we shouldn't respect refs/replace there at all; we
would respect it while reading the outer tree, but then so would your
ls-tree, etc).
> I've asked them to double-check whether they can provide me with the
> repository, or with an anonymized copy. At this point, it feels like
> there's not a lot more I can do/check without access to data that
> reproduces the issue so I can attach a debugger.
Another possibility, if they would run a custom Git on their end, is to
provide them with a patch that cranks up the debugging output from
get_oid_with_context_1(). Though I feel like it's hard to know where to
sprinkle printf()s until we know where things go wrong. Is it
misinterpreting the name, and not realizing it's a tree:path name? Or is
get_tree_entry() at fault? That kind of thing is much easier to figure
out interactively in a debugger.
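
Short of a debugger, one cheap thing to try without a custom build is to
resolve each leading portion of the path separately and see where the
reported type stops making sense. A rough sketch, assuming a POSIX shell;
rev_prefixes is a hypothetical helper name, not anything in Git:

```shell
# Print "REV:PREFIX" for each leading portion of PATH, shortest first.
# rev_prefixes is a hypothetical helper, not part of Git.
rev_prefixes() {
    rev=$1
    rest=$2
    prefix=
    while [ -n "$rest" ]; do
        # Peel off the next path component.
        component=${rest%%/*}
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest= ;;
        esac
        # Append it to the prefix built so far.
        prefix=${prefix:+$prefix/}$component
        printf '%s:%s\n' "$rev" "$prefix"
    done
}

# Usage, run inside the affected repository:
#   rev_prefixes refs/heads/develop path/to/parent/file |
#       git cat-file --batch-check
```

In a healthy repository I'd expect every line but the last to come back
as "tree" and the last as "blob". If an intermediate component already
resolves to something odd, that would point at the containing trees
rather than at the final entry.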
-Peff