From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
Date: Wed, 22 Nov 2017 17:36:14 -0500
Message-ID: <20171122223613.GA1405@sigill>
In-Reply-To: <xmqqwp2jxf5l.fsf@gitster.mtv.corp.google.com>

On Wed, Nov 22, 2017 at 10:42:30AM +0900, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I'm not sure what the right behavior is, but I'm pretty sure that's not
> > it. Probably one of:
> >
> >   - skip updating the ref when we see the breakage
> >
> >   - ditto, but terminate the whole operation, since we might be deleting
> >     other refs and in a broken repo we're probably best to make as few
> >     changes as possible
> >
> >   - behave as if it was a non-ff, which would allow "--force" to
> >     overwrite the broken ref. Maybe convenient for fixing things, but
> >     possibly surprising (and it's not that hard to just delete the
> >     broken refs manually before proceeding).
>
> Perhaps the last one would be the ideal endgame, but the second one
> may be a good stopping point in the shorter term.

This turns out to be a lot trickier than I expected. The crux of the
matter is that the case we care about is hidden inside
lookup_commit_reference_gently(), which doesn't distinguish between
corruption and "not a commit".

So there are four cases we care about for this call in fetch:

  1. We fed a real sha1 and got a commit (or peeled to one).

  2. We fed a real sha1 which resolved to a non-commit, and we got NULL.

  3. We fed a real sha1 and the object was missing or corrupted, and we
     got NULL.

  4. We fed a null sha1 and got NULL.

Right now we lump cases 2-4 together as "do not do a fast-forward
check". That's fine for 2 and 4, but probably not for 3. We can easily
catch case 4 ourselves (if we care to), but distinguishing case 3 from
the others is hard. How should lookup_commit_reference_gently() signal
it to us?
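To make the four cases concrete, here is a rough sketch of a lookup that
reports a status instead of a bare NULL. It is not git code: the toy
object store and every name in it (toy_object, toy_lookup_commit,
lookup_status, and so on) are made up for illustration, and it does not
attempt tag peeling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; none of these exist in git. */
enum toy_type { TOY_COMMIT, TOY_BLOB };

struct toy_object {
	const char *id;      /* 40-character hex object name */
	enum toy_type type;
};

enum lookup_status {
	LOOKUP_OK,          /* case 1: we got a commit */
	LOOKUP_NOT_COMMIT,  /* case 2: object exists but is not a commit */
	LOOKUP_BROKEN,      /* case 3: object is missing or corrupted */
	LOOKUP_NULL_ID      /* case 4: caller fed the null id */
};

static const char toy_null_id[] = "0000000000000000000000000000000000000000";

/* A two-object "repository": one commit and one blob. */
static const struct toy_object toy_store[] = {
	{ "1111111111111111111111111111111111111111", TOY_COMMIT },
	{ "2222222222222222222222222222222222222222", TOY_BLOB },
};

/*
 * Look up "id" and report which of the four cases we hit. On LOOKUP_OK,
 * "*out" points at the commit; in every other case it is set to NULL,
 * which is all the current caller gets to see.
 */
enum lookup_status toy_lookup_commit(const char *id,
				     const struct toy_object **out)
{
	size_t i;

	*out = NULL;
	if (!strcmp(id, toy_null_id))
		return LOOKUP_NULL_ID;

	for (i = 0; i < sizeof(toy_store) / sizeof(toy_store[0]); i++) {
		if (strcmp(toy_store[i].id, id))
			continue;
		if (toy_store[i].type != TOY_COMMIT)
			return LOOKUP_NOT_COMMIT;
		*out = &toy_store[i];
		return LOOKUP_OK;
	}
	return LOOKUP_BROKEN;
}
```

With something shaped like this, a caller could skip the fast-forward
check for LOOKUP_NOT_COMMIT and LOOKUP_NULL_ID but treat LOOKUP_BROKEN
as an error. Whether the real function could grow such a return value
without disturbing its other callers is a separate question.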
Or should lookup_commit_reference_gently() die on corruption? That's not
very "gentle", but I think the "gently" here is really about "it might
not be a commit", not "the repo might be corrupted". But I think even
that may be the tip of the iceberg. The next thing we do is feed the
commits to in_merge_bases(), which will happily return "nope" if the old
commit cannot be parsed (because it has only a boolean return value).
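The boolean-return problem can be shown with a toy ancestry walk. This
is only a sketch under loud assumptions: toy_commit, its "broken" flag,
and both functions are invented here, history is a single-parent chain,
and nothing below reflects how in_merge_bases() is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy commit: one parent, plus a "could not be parsed" flag. */
struct toy_commit {
	const struct toy_commit *parent;
	int broken;
};

enum ancestry {
	ANCESTRY_ERROR = -1, /* hit a commit we could not parse */
	ANCESTRY_NO = 0,
	ANCESTRY_YES = 1
};

/* Is "old_tip" reachable from "new_tip" by following parent links? */
enum ancestry toy_is_ancestor(const struct toy_commit *old_tip,
			      const struct toy_commit *new_tip)
{
	const struct toy_commit *c;

	if (!old_tip || old_tip->broken)
		return ANCESTRY_ERROR;

	for (c = new_tip; c; c = c->parent) {
		if (c->broken)
			return ANCESTRY_ERROR;
		if (c == old_tip)
			return ANCESTRY_YES;
	}
	return ANCESTRY_NO;
}

/*
 * A boolean wrapper, to show what gets lost: "not an ancestor" and
 * "could not tell because of corruption" both come back as 0.
 */
int toy_is_ancestor_bool(const struct toy_commit *old_tip,
			 const struct toy_commit *new_tip)
{
	return toy_is_ancestor(old_tip, new_tip) == ANCESTRY_YES;
}
```

A caller of the boolean form sees the same answer for an unrelated
commit and for a broken one, so it would report a plain non-fast-forward
in both cases.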
So I dunno. Maybe it is a losing battle to try to pass this kind of
corruption information up the stack. I'm tempted to say that there
should just be a "paranoid" flag to globally die() whenever we see a
corruption (and you could run with it normally, but relax it whenever
you're investigating a broken repo). But I doubt even that works. Not
having the "old_oid" object at all would be a repo corruption here, but
how are the low-level routines supposed to know when a missing object is
a corruption and when it is not?
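To illustrate that last question: the low-level routine can only call a
missing object "corruption" if the caller tells it what to expect. The
sketch below is purely hypothetical (toy_check_object, the expect enum,
and the toy object list are not git names or APIs), and is meant to show
the shape of the problem rather than propose a fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: what the caller believes about the object it asks for. */
enum expect { MAY_BE_MISSING, MUST_EXIST };

enum verdict { FOUND, ABSENT_OK, ABSENT_CORRUPT };

/* Toy object database holding two made-up object names. */
static const char *toy_objects[] = { "aaaa", "bbbb" };

static int toy_has_object(const char *id)
{
	size_t i;

	for (i = 0; i < sizeof(toy_objects) / sizeof(toy_objects[0]); i++)
		if (!strcmp(toy_objects[i], id))
			return 1;
	return 0;
}

/*
 * The same lookup miss gets a different verdict depending only on the
 * caller's expectation; the routine has no way to decide on its own.
 */
enum verdict toy_check_object(const char *id, enum expect expect)
{
	if (toy_has_object(id))
		return FOUND;
	return expect == MUST_EXIST ? ABSENT_CORRUPT : ABSENT_OK;
}
```

In fetch, looking up the old value of a ref would be a MUST_EXIST sort
of call, while probing for an object the other side merely advertised
would be MAY_BE_MISSING. A global "paranoid" switch alone cannot tell
those two apart.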
-Peff