From: Jeff King <peff@peff.net>
To: Bryan Turner <bturner@atlassian.com>
Cc: Git Users <git@vger.kernel.org>
Subject: Re: Unexpected cat-file --batch-check output
Date: Tue, 26 Oct 2021 21:28:12 -0400
Message-ID: <YXirLGXsKj01uNGv@coredump.intra.peff.net>
In-Reply-To: <CAGyf7-Gaphb9q=4cyT0BQa7oYGKXQQsU-XfqvoxfDyijehJO3Q@mail.gmail.com>

On Tue, Oct 26, 2021 at 04:58:49PM -0700, Bryan Turner wrote:

> >   - what does "git rev-parse refs/heads/develop:path/to/parent/file"
> >     say? If it comes up with 4c8d566ed80, then the problem is cat-file
> >     specific. If not, then it's a problem in the name resolution
> >     routines.
> 
> $ /usr/bin/git rev-parse refs/heads/develop
> 28a05ce2e3079afcb32e4f1777b42971d7933a91
> $ /usr/bin/git rev-parse refs/heads/develop:path/to/parent/file
> cc10f4b278086325aab2f95df97c807c7c6cd75e
> 
> So it looks like rev-parse and cat-file --batch-check both exhibit the
> same behavior.

OK, that's not too surprising, since they're using the same routines
under the hood. But that does imply that the problem is in the get_oid()
family, which is what does that name-to-oid lookup.

I don't recall us ever having a bug of this nature in the history of
Git, nor do I think this code would have changed recently. But of course
there's a first time for everything.

The parser there isn't exactly left-to-right, so perhaps this particular
name is triggering some corner case. I imagine the answer is "no", or
you'd have said so already, but are there any unusual characters in the
filename path? Colons, curly braces, etc?
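
One way to look for such characters is to filter the entry names of the
directory. This is a minimal sketch, assuming a POSIX shell and grep.
flag_unusual_names is a hypothetical helper, and the character set is
only a guess at what might matter to the name parser, not an
authoritative list:

```shell
# Hypothetical helper: read one path name per line on stdin and print only
# those containing characters that are special in Git's revision syntax
# (colon, braces, tilde, at-sign, caret), a double quote, or any byte that
# is not printable ASCII (LC_ALL=C makes non-ASCII bytes count as such).
flag_unusual_names() {
    LC_ALL=C grep -E '[{}:~@^"]|[^[:print:]]' || true
}
```

It could be fed with something like
"git ls-tree --name-only refs/heads/develop:path/to/parent | flag_unusual_names".
Without -z, ls-tree wraps names containing unusual bytes in double
quotes, which is why the double quote is in the set.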

> I also had them expand their cat-file --batch-check to include another
> file in the same "path/to/parent" directory:
> $ echo 'refs/heads/develop
> refs/heads/develop:path/to/parent/sibling
> refs/heads/develop:path/to/parent/file' | /usr/bin/git cat-file --batch-check
> 28a05ce2e3079afcb32e4f1777b42971d7933a91 commit 259
> 2bfe7b4b7c7cdeb9653801d99b65dfefe5780dda blob 897
> cc10f4b278086325aab2f95df97c807c7c6cd75e commit 330
> 
> So the "sibling" file in the same directory comes out as a "blob", as expected.

Interesting. That again points to there being something funny either
with this filename, or perhaps with the tree that contains it.

> >   - likewise, what does "git cat-file -t cc10f4b27808" say? I'd expect
> >     it to really be a commit (a bug in batch-check's formatting routines
> >     could show the wrong object, but I'd expect the oid to at least
> >     match what ls-tree showed).
> 
> $ /usr/bin/git cat-file -t cc10f4b278086325aab2f95df97c807c7c6cd75e
> commit

That's not too surprising. I did wonder if refs/replace or something
could be at work here, but I think in that case we'd still report the
expected oid. At any rate, we can probably rule that out as rev-parse is
returning the same unexpected oid, which means the problem is during the
name resolution (and we shouldn't respect refs/replace there at all; we
would respect it while reading the outer tree, but then so would your
ls-tree, etc).
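
The comparison between what the tree records and what name resolution
returns can be scripted, with replace refs disabled on both sides to
take them out of the picture. This is a rough sketch, assuming a POSIX
shell. compare_lookup is a made-up name, not anything that ships with
Git:

```shell
# Hypothetical helper: compare the oid recorded in the tree entry with the
# oid that "<rev>:<path>" resolves to, ignoring refs/replace for both.
# Usage: compare_lookup <rev> <path>
compare_lookup() {
    rev=$1
    path=$2
    # ls-tree prints "<mode> <type> <oid><TAB><path>"; keep the oid column.
    from_tree=$(git --no-replace-objects ls-tree "$rev" "$path" | awk '{print $3}')
    # rev-parse goes through the name resolution code under suspicion.
    from_name=$(git --no-replace-objects rev-parse --verify "$rev:$path")
    if [ "$from_tree" = "$from_name" ]; then
        echo "match $from_name"
    else
        echo "mismatch tree=$from_tree name=$from_name"
    fi
}
```

For the case in this thread it would be run as
"compare_lookup refs/heads/develop path/to/parent/file". A "mismatch"
line would point at name resolution rather than at the tree contents.
Running "git for-each-ref refs/replace/" separately shows whether any
replace refs exist at all.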

> I've asked them to double-check whether they can provide me with the
> repository, or with an anonymized copy. At this point, it feels like
> there's not a lot more I can do/check without access to data that
> reproduces the issue so I can attach a debugger.

Another possibility, if they would run a custom Git on their end, is to
provide them with a patch that cranks up the debugging output from
get_oid_with_context_1(). Though I feel like it's hard to know where to
sprinkle printf()s until we know where things go wrong. Is it
misinterpreting the name, and not realizing it's a tree:path name? Or is
get_tree_entry() at fault? That kind of thing is much easier to figure
out interactively in a debugger.

-Peff
