From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
Date: Fri, 20 Jul 2018 15:31:50 -0400
Message-ID: <20180720193150.GC26403@sigill.intra.peff.net>
In-Reply-To: <xmqqeffycl00.fsf@gitster-ct.c.googlers.com>
On Fri, Jul 20, 2018 at 02:30:23AM -0700, Junio C Hamano wrote:
> > The entries in the shallow file say that history behind them may not
> > exist in the repository due to its shallowness, but history after
> > them is supposed to be traversable (otherwise we have a repository
> > corruption). It is true that an entry that itself no longer exists
> > in this repository should not be in shallow file, as the presence of
> > that entry breaks that promise the file is making---that commit
> > ought to exist and it is safe to traverse down to it, so keeping the
> > entry in the file is absolutely a wrong thing to do.
> >
> > But that does not automatically mean that just simply removing it
> > makes the resulting repository good, does it? Wouldn't the solution
> > for that corruption be to set a new entry to stop history traversal
> > before reaching that (now-missing) commit?
>
> The above is overly pessimistic and worried about an impossible
> situation, I would think. The reason why a commit that used to be
> in the shallow file is being pruned during a "repack" is because it
> has become unreachable. By definition, no future history traversal
> that wants to enumerate reachable commits needs to be stopped from
> finding that commits that are older than this commit being pruned
> are missing by having this in the shallow list. If there is a ref
> or a reflog entry from which such a problematic traversal starts at,
> we wouldn't be pruning this commit in the first place, because the
> commit has not become unreachable yet.
>
> So a repository does not become corrupt by pruning the commit *and*
> removing it from the shallow file at the same time.
Right. I think a lot of this is rethinking how shallow pruning works,
too, which is not something Dscho is trying to change. The simplest
argument (which I think Dscho has made elsewhere, too) is: this is
necessary in the current shallow code when dropping objects. That is
why we do it from prune, but we miss the case where git-repack is run
on its own, outside of git-gc.

I do still think the gc/prune architecture is a bit muddled, but at this
point in the discussion I feel OK saying that people running "git repack
-ad" would not be upset to have their shallows pruned.

But the patch is still not OK as-is because prune_shallow() requires the
SEEN flag on each reachable object struct, which we have not set in the
repack process (hence the failing test I posted earlier). So we need a
solution for that, which may impact ideas about how the call works.

E.g., some possible solutions are:

  - teach pack-objects to optionally trigger the shallow prune based on
    its internal walk

  - have repack use the just-completed pack as a hint about reachability

  - introduce a mechanism to trigger the shallow prune based on a
    commit-only reachability check, and run that from repack (or from gc
    and document that it must be run if you are using repack as a manual
    gc replacement)

I'm not advocating any particular solution there, but just showing that
there's an array of them (and probably more that I didn't mention).
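
To make the third option a bit more concrete, here is a minimal sketch
of a commit-only reachability check followed by a shallow prune. It is
a toy model in Python, not git's C code: the `parents` map, the
function names reachable_commits() and prune_shallow_entries(), and
the commit names are all hypothetical, and the `seen` set only stands
in for the SEEN flag that prune_shallow() expects to find.

```python
from collections import deque


def reachable_commits(parents, tips, shallow):
    """Walk commits only, starting from the ref tips.

    `parents` maps each commit present in the object store to its
    parent list (hypothetical toy representation). The walk does not
    look behind a shallow entry, since that history may be absent.
    """
    seen = set()
    queue = deque(t for t in tips if t in parents)
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)  # stand-in for setting the SEEN flag
        if commit in shallow:
            continue  # shallow boundary: stop the traversal here
        for parent in parents[commit]:
            if parent in parents:
                queue.append(parent)
    return seen


def prune_shallow_entries(parents, tips, shallow):
    """Keep only the shallow entries that the walk actually reached."""
    seen = reachable_commits(parents, tips, shallow)
    return {commit for commit in shallow if commit in seen}


# Toy repository: two shallow histories, M2 -> M1 and X2 -> X1.
# M0 and X0 lie behind the shallow boundary and are not present.
example_parents = {
    "M2": ["M1"], "M1": ["M0"],
    "X2": ["X1"], "X1": ["X0"],
}
example_shallow = {"M1", "X1"}

# After "fetch --prune" removed the ref pointing at X2, only M2 is a tip,
# so X1 is unreachable and its shallow entry is dropped.
print(sorted(prune_shallow_entries(example_parents, ["M2"], example_shallow)))
```

In this model an entry is dropped only when no remaining tip reaches
it, which is the same condition under which the commit itself would be
pruned, so the shallow list and the object store stay consistent.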
-Peff