git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
Date: Wed, 22 Nov 2017 17:36:14 -0500
Message-ID: <20171122223613.GA1405@sigill>
In-Reply-To: <xmqqwp2jxf5l.fsf@gitster.mtv.corp.google.com>

On Wed, Nov 22, 2017 at 10:42:30AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'm not sure what the right behavior is, but I'm pretty sure that's not
> > it. Probably one of:
> >
> >   - skip updating the ref when we see the breakage
> >
> >   - ditto, but terminate the whole operation, since we might be deleting
> >     other refs and in a broken repo we're probably best to make as few
> >     changes as possible
> >
> >   - behave as if it was a non-ff, which would allow "--force" to
> >     overwrite the broken ref. Maybe convenient for fixing things, but
> >     possibly surprising (and it's not that hard to just delete the
> >     broken refs manually before proceeding).
> 
> Perhaps the last one would be the ideal endgame, but the second one
> may be a good stopping point in the shorter term.

This turns out to be a lot trickier than I expected. The crux of the
matter is that the case we care about is hidden inside
lookup_commit_reference_gently(), which doesn't distinguish between
corruption and "not a commit".

So there are four cases we care about for this call in fetch:

  1. We fed a real sha1 and got a commit (or peeled to one).

  2. We fed a real sha1 which resolved to a non-commit, and we got NULL.

  3. We fed a real sha1 and the object was missing or corrupted, and we
     got NULL.

  4. We fed a null sha1 and got NULL.

Right now we lump cases 2-4 together as "do not do a fast-forward
check". That's fine for 2 and 4, but probably not for 3. We can easily
catch case 4 ourselves (if we care to), but distinguishing case 3 from
the others is hard. How should lookup_commit_reference_gently() signal
it to us?
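
One possible shape for such a signal, as a minimal sketch only: the lookup
returns a status code instead of a bare NULL. Everything below is a toy model.
The `toy_*` types, the `lookup_status` enum and `classify_lookup()` are
hypothetical names made up for illustration and do not exist in git, and the
"object store" is just an array.

```c
#include <stddef.h>
#include <string.h>

#define TOY_ID_LEN 20

/* Hypothetical object states in a toy store (not git's object types). */
enum toy_kind { TOY_COMMIT, TOY_BLOB, TOY_CORRUPT };

/* One status per case from the list above (hypothetical enum). */
enum lookup_status {
	LOOKUP_OK,         /* case 1: id resolved to a commit */
	LOOKUP_NOT_COMMIT, /* case 2: id resolved to a non-commit */
	LOOKUP_BROKEN,     /* case 3: object missing or corrupted */
	LOOKUP_NULL_ID     /* case 4: caller passed the null id */
};

struct toy_object {
	unsigned char id[TOY_ID_LEN];
	enum toy_kind kind;
};

static int toy_is_null_id(const unsigned char *id)
{
	static const unsigned char zero[TOY_ID_LEN];
	return !memcmp(id, zero, TOY_ID_LEN);
}

/* Classify a lookup instead of collapsing cases 2-4 into NULL. */
enum lookup_status classify_lookup(const struct toy_object *store, size_t nr,
				   const unsigned char *id)
{
	size_t i;

	if (toy_is_null_id(id))
		return LOOKUP_NULL_ID;

	for (i = 0; i < nr; i++) {
		if (memcmp(store[i].id, id, TOY_ID_LEN))
			continue;
		switch (store[i].kind) {
		case TOY_COMMIT:
			return LOOKUP_OK;
		case TOY_CORRUPT:
			return LOOKUP_BROKEN;
		default:
			return LOOKUP_NOT_COMMIT;
		}
	}
	/* Not found at all: in this toy model that also counts as case 3. */
	return LOOKUP_BROKEN;
}
```

With something like this, a caller could skip the fast-forward check only for
LOOKUP_NOT_COMMIT and LOOKUP_NULL_ID, and treat LOOKUP_BROKEN separately. Note
that the sketch simply assumes a missing object is breakage; whether that
assumption holds in a real lookup is a separate question.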

Or should lookup_commit_reference_gently() die on corruption? That's not
very "gentle", but I think the "gently" here is really about "it might
not be a commit", not "the repo might be corrupted". But I think even
that may be the tip of the iceberg. The next thing we do is feed the
commits to in_merge_bases(), which will happily return "nope" if the old
commit cannot be parsed (because it has only a boolean return value).
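
To illustrate why a boolean return loses information, here is a small sketch
over a toy single-parent history. It is not git's in_merge_bases(); the
`toy_commit` struct, the `ancestry` enum and both functions are hypothetical
names, and `parsed_ok` merely simulates a commit that cannot be parsed.

```c
#include <stddef.h>

/* Hypothetical tri-state result; a plain boolean only has the first two. */
enum ancestry { ANC_NO = 0, ANC_YES = 1, ANC_ERROR = -1 };

struct toy_commit {
	const struct toy_commit *parent; /* single parent keeps the toy simple */
	int parsed_ok;                   /* 0 simulates an unparseable commit */
};

/* Walk from tip toward the root, reporting breakage separately from "no". */
enum ancestry toy_is_ancestor(const struct toy_commit *old,
			      const struct toy_commit *tip)
{
	const struct toy_commit *c;

	if (!old || !old->parsed_ok)
		return ANC_ERROR;

	for (c = tip; c; c = c->parent) {
		if (!c->parsed_ok)
			return ANC_ERROR;
		if (c == old)
			return ANC_YES;
	}
	return ANC_NO;
}

/* A boolean wrapper: ANC_ERROR and ANC_NO become indistinguishable. */
int toy_is_ancestor_bool(const struct toy_commit *old,
			 const struct toy_commit *tip)
{
	return toy_is_ancestor(old, tip) == ANC_YES;
}
```

In the boolean wrapper, a broken old commit and a genuine non-fast-forward
both come back as 0, which is the conflation described above.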

So I dunno. Maybe it is a losing battle to try to pass this kind of
corruption information up the stack.  I'm tempted to say that there
should just be a "paranoid" flag to globally die() whenever we see a
corruption (and you could run with it normally, but relax it whenever
you're investigating a broken repo). But I doubt even that works. Not
having the "old_oid" object at all would be a repo corruption here, but
how are the low-level routines supposed to know when a missing object is
a corruption and when it is not?

-Peff
