git@vger.kernel.org mailing list mirror (one of many)
From: Thomas Braun <thomas.braun@virtuell-zuhause.de>
To: Derrick Stolee <stolee@gmail.com>, Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	GIT Mailing-list <git@vger.kernel.org>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: 2.29.0.rc0.windows.1: Duplicate commit id error message when fetching
Date: Fri, 9 Oct 2020 17:29:54 +0200
Message-ID: <267a9f46-cce9-0bd3-f28d-55e71cc8a399@virtuell-zuhause.de>
In-Reply-To: <5bbdaed5-df29-8bfe-01c2-eb2462dcca22@gmail.com>

On 08.10.2020 15:22, Derrick Stolee wrote:
> On 10/8/2020 8:50 AM, Derrick Stolee wrote:
>> On 10/8/2020 8:06 AM, Jeff King wrote:
>>> But regardless, it seems unfriendly that we can't
>>> get out of it while merging the graphs. Doing this obviously makes the
>>> problem go away:
>>>
>>> diff --git a/commit-graph.c b/commit-graph.c
>>> index cb042bdba8..ae1f94ccc4 100644
>>> --- a/commit-graph.c
>>> +++ b/commit-graph.c
>>> @@ -2023,8 +2023,11 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>  
>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>  			  &ctx->commits.list[i]->object.oid)) {
>>> -			die(_("unexpected duplicate commit id %s"),
>>> -			    oid_to_hex(&ctx->commits.list[i]->object.oid));
>>> +			/*
>>> +			 * quietly ignore duplicates; these could come from
>>> +			 * incremental graph files mentioning the same commit.
>>> +			 */
>>> +			continue;
>>>  		} else {
>>>  			unsigned int num_parents;
>>>  
>>>
>>> but it's not clear to me if that's papering over another bug, or
>>> gracefully handling a situation that we ought to be.
>>
>> I think this is a good thing to do, at minimum. As I discussed above,
>> the "input data" of the incremental commit-graph chain with duplicate
>> commits across layers isn't actually _invalid_. It's unexpected based
>> on what Git "should" be doing.
> 
> As I was working on my own version of this, I realized that just
> commenting here still creates duplicate commits in the new layer,
> which is even MORE unexpected. It could cause some confusion with
> the binary search, but likely that is still fine. The only "real"
> issue is that it is wasted data.
> 
> I'll send [1] to the list soon (after build & test validation),
> but it includes copying the pointers to a new "de-duplicated" list.
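
The de-duplication idea quoted above (walk the sorted list and copy only the first entry of each run of equal ids into a new list) can be sketched in shell. This is not the patch referred to as [1], only an illustration of the technique; the helper name dedup_sorted_oids is hypothetical, and the input is assumed to be already sorted, one object id per line.

```shell
# Hypothetical helper: drop adjacent duplicates from an already-sorted
# list of object ids (one per line on stdin), keeping the first of each run.
dedup_sorted_oids() {
	awk 'NR == 1 || $0 != prev { print } { prev = $0 }'
}

# Example input: the same commit id contributed by two graph layers.
printf '%s\n' \
	029341567c24582030592585b395f4438273263f \
	029341567c24582030592585b395f4438273263f \
	1e8d10aec7ca6075f622c447d416071390698124 |
	dedup_sorted_oids
```

Because the input is sorted, comparing each entry only with its predecessor is enough; no hash table or second pass is needed.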

Thanks both for digging into it.

I think I have a starting point for what goes wrong. I found a local
repo with another broken commit graph, and after some fiddling the
following script reproduces it. I tried git/git first, but that does
not seem to trigger the problem.

# rm -rf dummy
mkdir dummy
cd dummy

git init

git remote add origin https://github.com/tango-controls/cppTango
git remote add fork1 https://github.com/bourtemb/cppTango
git remote add fork2 https://github.com/t-b/cppTango
git fetch --all --jobs 12
git commit-graph verify
rm -rf .git/objects/info/commit-graphs/
git commit-graph verify
git fetch --jobs 12
git remote add fork3 git@github.com:t-b/cppTango.git
git commit-graph verify
git remote add fork4 git@github.com:t-b/cppTango.git
git fetch --jobs 12
git commit-graph verify
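
The directory removed in the script, .git/objects/info/commit-graphs/, is where Git keeps the layers of a split commit-graph; the commit-graph-chain file inside it lists one layer hash per line. A minimal sketch for checking how many layers exist between the steps above follows. The helper name list_graph_layers is hypothetical, and it assumes a non-bare repository with the default .git layout.

```shell
# Hypothetical helper: list the layers of a split commit-graph.
# $1 is the path to the repository's .git directory (default: .git).
list_graph_layers() {
	chain="${1:-.git}/objects/info/commit-graphs/commit-graph-chain"
	if test -f "$chain"
	then
		# One layer hash per line.
		cat "$chain"
	else
		echo "no commit-graph chain at $chain" >&2
		return 1
	fi
}

# Run from the top of the work tree; a missing chain is not fatal here.
list_graph_layers .git || true
```

Running it after each fetch would show whether a new layer was added or the chain was rewritten.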

The last verify outputs

commit-graph generation for commit 029341567c24582030592585b395f4438273263f is 1054 != 1
commit-graph generation for commit 1e8d10aec7ca6075f622c447d416071390698124 is 4294967295 != 1171
commit-graph generation for commit 296e93516189c0134843fd56ac4f10d36ccf284f is 1054 != 1
commit-graph generation for commit 4c0a7a3cd369d06b99d867be6b47a96c519efd7f is 1054 != 1
commit-graph has non-zero generation number for commit 4d39849950d3dc02b7426c780ac7991ec7221176, but zero elsewhere
commit-graph has non-zero generation number for commit 4
[....]
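
One value in the output above stands out: 4294967295 is 2^32 - 1, the largest unsigned 32-bit integer. Git of this era presumably uses that value as its "infinite" generation number for commits whose generation is not known from a commit-graph, and 0 for "not computed"; both readings are assumptions here, not something the output confirms. The helper name describe_generation is hypothetical.

```shell
# Hypothetical helper: label a generation number as printed by
# "git commit-graph verify". The meaning attached to the two sentinel
# values is an assumption based on Git's 32-bit generation numbers.
describe_generation() {
	case "$1" in
	0) echo "zero (not computed)" ;;
	4294967295) echo "infinity (0xFFFFFFFF, 2^32 - 1)" ;;
	*) echo "finite" ;;
	esac
}

# Values taken from the verify output above.
for gen in 1054 1 4294967295 1171
do
	printf '%s: %s\n' "$gen" "$(describe_generation "$gen")"
done
```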

Does that reproduce on your end as well?

Thomas

