git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
Date: Tue, 21 Nov 2017 17:57:44 -0500	[thread overview]
Message-ID: <20171121225744.GA21197@sigill> (raw)
In-Reply-To: <xmqqd14cjr13.fsf@gitster.mtv.corp.google.com>

On Tue, Nov 21, 2017 at 11:37:28AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In an ideal world, we'd simply fix all of the callers to
> > notice the null sha1 and avoid passing it to us. But a
> > simple experiment to catch this with a BUG() shows that
> > there are a large number of code paths.
> 
> Well, we can view this (or the alternative you sent later that does
> the same a bit earlier in the function) as "fixing the caller" but
> has already refactord the common logic to a helper function that all
> of these callers call into ;-).

Yes, I'm definitely tempted by that view. :)

What worries me, though, is that callers who lazily propagate the null
sha1 to the lookup functions cannot reasonably tell the difference
between "this object was corrupt or missing" and "we passed the null
sha1, and a missing object is expected".

For example, look at how fetch.c:update_local_ref() looks up objects.
It feeds the old and new sha1 to lookup_commit_reference_gently(), and
if either is NULL, it skips the fast-forward check. That makes sense if
we expect the null sha1 to get translated into a NULL commit. But it
also triggers for a broken ref, and we'd overwrite it (when the right
thing is probably refusing to update).

Here's a runnable example:

-- >8 --
git init parent
git -C parent commit --allow-empty -m base

git clone parent child
git -C parent commit --allow-empty -m more

cd child
rm -f .git/objects/??/*
git fetch
-- 8< --

That final fetch spews a bunch of errors about broken refs, and then
overwrites the value of origin/master, even though it's broken (in this
case it actually is a fast-forward, but the child repo doesn't even know
that).

I'm not sure what the right behavior is, but I'm pretty sure that's not
it. Probably one of:

  - skip updating the ref when we see the breakage

  - ditto, but terminate the whole operation, since we might be deleting
    other refs and in a broken repo we're probably best to make as few
    changes as possible

  - behave as if it was a non-ff, which would allow "--force" to
    overwrite the broken ref. Maybe convenient for fixing things, but
    possibly surprising (and it's not that hard to just delete the
    broken refs manually before proceeding).

-Peff

  reply	other threads:[~2017-11-21 22:57 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
2017-11-20 23:55   ` Eric Sunshine
2017-11-21 15:58     ` Jeff King
2017-11-22  0:32       ` Stefan Beller
2017-11-22 22:38         ` Jeff King
2017-11-23  2:41           ` Junio C Hamano
2017-11-23  5:02             ` Jeff King
2017-11-20 20:27 ` [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs Jeff King
2017-11-20 20:28 ` [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans Jeff King
2017-11-20 20:29 ` [PATCH 4/5] everything_local: use "quick" object existence check Jeff King
2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
2017-11-20 20:47   ` Stefan Beller
2017-11-20 20:58     ` Jeff King
2017-11-21  2:37   ` Junio C Hamano
2017-11-21 22:57     ` Jeff King [this message]
2017-11-22  1:42       ` Junio C Hamano
2017-11-22 22:36         ` Jeff King
2017-11-23  2:35           ` Junio C Hamano
2017-11-24 17:32             ` Jeff King
2017-11-25  3:20               ` Junio C Hamano
2017-11-21  5:20   ` Junio C Hamano
2017-11-21 23:17     ` Jeff King
2017-11-22  1:49       ` Junio C Hamano
2017-11-22  3:17         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171121225744.GA21197@sigill \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).