git@vger.kernel.org mailing list mirror (one of many)
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Derrick Stolee <derrickstolee@github.com>, git@vger.kernel.org
Subject: Re: commit-graph overflow generation chicken and egg
Date: Mon, 4 Jul 2022 12:46:16 +0200
Message-ID: <YsLE+DVa5Hd/NqdD@ncase>
In-Reply-To: <Yr7jY6GjUkOzHNh6@ncase>

On Fri, Jul 01, 2022 at 02:07:03PM +0200, Patrick Steinhardt wrote:
> On Wed, Jun 08, 2022 at 07:17:39PM -0400, Jeff King wrote:
> > On Wed, Jun 08, 2022 at 04:08:03PM -0400, Derrick Stolee wrote:
> > 
> > > I'd love to see the full binary, but for the sake of sharing on the
> > > list, could you give the following output?
> > > 
> > > 	xxd .git/objects/info/commit-graph | head
> > > 
> > > or any other command that shows the first few hex bytes along with
> > > their ASCII equivalents. Here is one that used Git 2.34.0:
> > > [...]
> > 
> > Interesting. My earlier email was a bit misleading. I do in fact have a
> > GDA2 chunk. And looking at the timestamp on the commit-graph file, it's
> > from May 24th. I hadn't been keeping the repo up to date regularly, but
> > I did occasionally pull and rebuild. So I think it was a much more
> > recent version of Git that built the problematic file, though it's
> > possible it was carrying forward bad data.
> > 
> > So 6dbf4b8172ef may be a bit of a red herring, if the file has a GDA2
> > section that was simply ignored before that commit.
> > 
> > Looking at my reflog, my best guess for the version of Git that produced
> > the file is e46751e96fa.
> > 
> > > However, the lack of the large offset chunk could be due to the bug fixed by
> > > 75979d9460 (commit-graph: fix ordering bug in generation numbers,
> > > 2022-03-01). Perhaps that was the thing that was missing from your version?
> > 
> > So I _think_ I would have had that, though there's a good chance that an
> > older version of the commit-graph file was written using a version of
> > Git without it.
> > 
> > > But otherwise, I'm stumped. I'd be very interested to see a repro from a
> > > fresh repository. That is: what situation do we need to be in to write such
> > > an offset without including the large offset chunk?
> > 
> > Not exactly a fresh reproduction, but you can grab my broken file from:
> > 
> >   https://peff.net/tmp/broken-commit-graph
> > 
> > Dropping it into a fresh clone of git.git shows the problem.
> > 
> > I tried a few obvious from-scratch reproductions like building a file
> > with 75979d9460^ (so with the generation number bug), and then jumping
> > forward to e46751e96fa (so bug fixed, but now we write GDA2), but
> > couldn't get it to trigger.
> > 
> > It may not be worth spending too much time on, if this is a weird
> > one-off caused by a mix of buggy unreleased versions of Git. If real
> > users aren't seeing it, and we know the nuclear option is "rm
> > commit-graph", then that may be enough.
> > 
> > -Peff
> 
> I have also repeatedly run into the same problem. I had already
> discussed this with Derrick in the past in [1], but back then we also
> declared bankruptcy and said that this seems to only be caused by some
> weird in-between states of Git versions.
> 
> I have experienced the issue again in git.git now, again without having
> a clue how I arrived at that state. The funny thing is that I explicitly
> tried to reproduce the error in that repo a few days ago, without any
> success at all, by writing commit-graphs with different Git versions.
> Only today when I got back to it completely unsuspecting did Git start
> to complain.
> 
> But more importantly, we started to see the issue in one of our repos in
> our staging systems as well [2], where we're currently running with a
> mixture of Git v2.35.1 and v2.36.1 with a small set of patches on top of
> them. None of the patches are related to commit-graphs though. The repo
> in question is a pooled repository (like last time I reported the bug),
> where the pool itself has a single commit-graph with GDAT chunks and the
> pool member has a single commit-graph with GDA2 chunks.
> 
> I spent a lot of time today trying to come up with a reproducer to get
> to this state from a clean repo, but again with no success so far. Also,
> staring at the code for extended periods of time didn't result in any
> insights.
> 
> This issue continues to puzzle me.
> 
> Patrick
> 
> [1]: http://public-inbox.org/git/Yh3rZX6cJpkHmRZc@ncase/
> [2]: https://gitlab.com/gitlab-org/gitlab/-/issues/365903
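
The GDAT and GDA2 chunk IDs mentioned in the quoted message can be read
from the chunk lookup table at the start of a commit-graph file. A
minimal sketch, assuming the layout described in Git's commit-graph
format documentation (an 8-byte header followed by 12-byte table
entries). The function name is made up for illustration and the sample
bytes are synthetic, not taken from any of the broken files:

```python
import struct

HEADER = struct.Struct(">4sBBBB")  # signature, version, hash version, chunk count, base graph count
ENTRY = struct.Struct(">4sQ")      # chunk ID, byte offset of the chunk in the file

def list_chunk_ids(data):
    """Return the chunk IDs named in the lookup table of a commit-graph file."""
    signature, _version, _hash_version, num_chunks, _num_base = HEADER.unpack_from(data, 0)
    if signature != b"CGPH":
        raise ValueError("missing CGPH signature")
    ids = []
    for i in range(num_chunks):
        chunk_id, _offset = ENTRY.unpack_from(data, HEADER.size + i * ENTRY.size)
        ids.append(chunk_id.decode("ascii"))
    return ids

# Synthetic table of contents (not a real file): two chunks and the zero terminator.
sample = HEADER.pack(b"CGPH", 1, 1, 2, 0)
sample += ENTRY.pack(b"OIDF", 44)
sample += ENTRY.pack(b"GDA2", 1068)
sample += ENTRY.pack(b"\0\0\0\0", 1076)

print(list_chunk_ids(sample))
```

To inspect a real file, pass in the bytes read from
.git/objects/info/commit-graph, or from one of the graph-*.graph files
under .git/objects/info/commit-graphs/ when the graph is split.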

While I still haven't been able to reproduce the error, I did find a
different error. Here's the reproducer, which works with Git v2.37.0 and
older:

```
+ rm -rf /tmp/repo
+ git init /tmp/repo
Initialized empty Git repository in /tmp/repo/.git/
+ cd /tmp/repo/
+ GIT_COMMITTER_DATE='2000-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main (root-commit) 62ebc8d] x
+ git branch other
+ GIT_COMMITTER_DATE='1970-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main c628d6d] x
+ GIT_COMMITTER_DATE='2040-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main 0d73218] x
+ git commit-graph write --reachable --split=replace
+ git switch other
Switched to branch 'other'
+ GIT_COMMITTER_DATE='2000-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[other 7d03e12] x
+ git commit-graph write --reachable --split=replace
+ git commit-graph verify
commit date for commit c628d6dc7292b6d481f0ec4ed39ed2bb4a8cff49 in commit-graph is 17179865584 != 18446744073709548016
Verifying commits in commit graph: 100% (4/4), done.
```
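
The two numbers in the verify output are consistent with the same
timestamp seen at two different widths. The date given to the second
commit, 1970-01-01T00:00:00 +0100, is -3600 seconds relative to the
epoch. Read as an unsigned 64-bit value that is 2^64 - 3600, while the
commit-graph format only has room for 34 bits of commit time. A quick
sketch of the arithmetic follows; the helper name is hypothetical, and
this only illustrates the numbers, not where in the code the mismatch
comes from:

```python
# Illustration only: as_unsigned() is a made-up helper, not a Git function.
COMMIT_DATE = -3600  # 1970-01-01T00:00:00 +0100 in seconds since the epoch

def as_unsigned(value, bits):
    """Reinterpret a possibly negative integer as an unsigned value of the given width."""
    return value & ((1 << bits) - 1)

parsed_date = as_unsigned(COMMIT_DATE, 64)  # the timestamp as a 64-bit unsigned value
graph_date = as_unsigned(COMMIT_DATE, 34)   # what fits into 34 bits of commit time

print(graph_date, "!=", parsed_date)
```

This prints the same pair of values that `git commit-graph verify`
reports for c628d6d above.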

I may look into what's happening here at a later point, but for the time
being I'll continue to hunt down the other bug. I still wanted to
document my finding here.

Patrick


