git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
Date: Fri, 20 Jul 2018 15:31:50 -0400
Message-ID: <20180720193150.GC26403@sigill.intra.peff.net>
In-Reply-To: <xmqqeffycl00.fsf@gitster-ct.c.googlers.com>

On Fri, Jul 20, 2018 at 02:30:23AM -0700, Junio C Hamano wrote:

> > The entries in shallow file says that history behind them may not
> > exist in the repository due to its shallowness but history after
> > them are supposed to be traversable (otherwise we have a repository
> > corruption).  It is true that an entry that itself no longer exists
> > in this repository should not be in shallow file, as the presence of
> > that entry breaks that promise the file is making---that commit
> > ought to exist and it is safe to traverse down to it, so keeping the
> > entry in the file is absolutely a wrong thing to do.
> >
> > But that does not automatically mean that just simply removing it
> > makes the resulting repository good, does it?  Wouldn't the solution
> > for that corruption be to set a new entry to stop history traversal
> > before reaching that (now-missing) commit?
> 
> The above is overly pessimistic and worried about an impossible
> situation, I would think.  The reason why a commit that used to be
> in the shallow file is being pruned during a "repack" is because it
> has become unreachable.  By definition, no future history traversal
> that wants to enumerate reachable commits needs to be stopped from
> finding that commits that are older than this commit being pruned
> are missing by having this in the shallow list.  If there is a ref
> or a reflog entry from which such a problematic traversal starts at,
> we wouldn't be pruning this commit in the first place, because the
> commit has not become unreachable yet.
> 
> So a repository does not become corrupt by pruning the commit *and*
> removing it from the shallow file at the same time.

Right. I think a lot of this is rethinking how shallow pruning works,
too, which is not something Dscho is trying to change. The simplest
argument (which I think Dscho has made elsewhere, too) is: this is
necessary in the current shallow code when dropping objects. We
therefore do it from prune, but miss the case when git-repack is run by
itself, outside of git-gc.
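
To make the quoted reachability argument concrete, here is a toy model
(not git code; the commit names, the dictionaries and the reachable()
helper are all made up for illustration). It assumes a traversal that
starts at the refs and stops at shallow entries. Commits C and D stand
for history whose ref went away, so they are unreachable and get dropped
by the repack:

```python
def reachable(tips, parents, shallow, present):
    """Walk history from `tips`, not following parents of shallow commits.

    Raises LookupError if the walk needs a commit that is not present.
    """
    seen = set()
    stack = list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        if oid not in present:
            raise LookupError(f"missing commit {oid}")
        seen.add(oid)
        if oid in shallow:
            continue  # history behind a shallow commit may not exist
        stack.extend(parents.get(oid, []))
    return seen


# Hypothetical history: A -> B -> (P, never fetched), C -> D -> (Q, never fetched)
parents = {"A": ["B"], "B": ["P"], "C": ["D"], "D": ["Q"]}
refs = ["A"]  # the ref that pointed at C is gone

# Before: C and D still exist, D is still listed as shallow.
before = reachable(refs, parents, {"B", "D"}, {"A", "B", "C", "D"})

# After: C and D are dropped *and* D is removed from the shallow list.
after = reachable(refs, parents, {"B"}, {"A", "B"})
```

In this model both walks visit the same commits (A and B), because no
walk that starts from a surviving ref ever gets to D in the first place.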

I do still think the gc/prune architecture is a bit muddled, but at this
point in the discussion I feel OK saying that people running "git repack
-ad" would not be upset to have their shallows pruned.

But the patch is still not OK as-is because prune_shallow() requires the
SEEN flag on each reachable object struct, which we have not set in the
repack process (hence the failing test I posted earlier).  So we need a
solution for that, which may impact ideas about how the call works.
E.g., some possible solutions are:

 - teach pack-objects to optionally trigger the shallow prune based on
   its internal walk

 - have repack use the just-completed pack as a hint about reachability

 - introduce a mechanism to trigger the shallow prune based on a
   commit-only reachability check, and run that from repack (or from gc
   and document that it must be run if you are using repack as a manual
   gc replacement)

I'm not advocating any particular solution there, but just showing that
there's an array of them (and probably more that I didn't mention).
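
As a rough illustration of the third option, and of why the missing SEEN
flag matters, here is a sketch in Python rather than C. It is a toy
model under stated assumptions, not the code from the patch series: the
Commit class, mark_reachable() and this prune_shallow() are stand-ins I
made up, and "seen" plays the role of the SEEN flag. The only behavior
taken from the discussion above is that pruning keeps a shallow entry
only when its commit carries the flag:

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    oid: str
    parents: list = field(default_factory=list)
    seen: bool = False  # stands in for the SEEN object flag


def mark_reachable(tips, commits, shallow):
    """Commit-only walk: flag every commit reachable from `tips`.

    Trees and blobs are never looked at, and parents of shallow
    commits are not followed.
    """
    stack = list(tips)
    while stack:
        oid = stack.pop()
        commit = commits.get(oid)
        if commit is None or commit.seen:
            continue
        commit.seen = True
        if oid not in shallow:
            stack.extend(commit.parents)


def prune_shallow(shallow, commits):
    """Keep only shallow entries whose commit exists and is flagged."""
    return {oid for oid in shallow if oid in commits and commits[oid].seen}


def make_repo():
    # Hypothetical state after "repack -ad": only A and B are left;
    # D (a stale shallow entry) has already been dropped.
    return {"A": Commit("A", ["B"]), "B": Commit("B", ["P"])}


shallow = {"B", "D"}

# Pruning without any walk: no flags are set, so every entry is dropped,
# including B, which is still needed.
without_walk = prune_shallow(shallow, make_repo())

# Pruning after a commit-only walk from the refs: only the stale D goes.
repo = make_repo()
mark_reachable(["A"], repo, shallow)
with_walk = prune_shallow(shallow, repo)
```

In this model the call without a prior walk drops the still-needed entry
B along with the stale one, while the call after the walk keeps B and
drops only D.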

-Peff
