git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>,
	"Stefan Beller" <stefanbeller@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Matt McCutchen" <matt@mattmccutchen.net>
Subject: Re: [PATCH 0/4] gc docs: modernize and fix the documentation
Date: Mon, 18 Mar 2019 20:18:29 -0400	[thread overview]
Message-ID: <20190319001829.GL29661@sigill.intra.peff.net> (raw)
In-Reply-To: <87ftrjer8s.fsf@evledraar.gmail.com>

On Mon, Mar 18, 2019 at 11:45:39PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > I don't think the quarantine stuff should impact contention at all. It's
> > only quarantining the objects, which are the least contentious part of
> > Git (because object content is idempotent, so we don't do any locking
> > there, and with two racing processes, one will just "win").
> 
> Without the quarantine, isn't there the race that the NOTES section
> talks about (unless I've misread it).

Ah, OK, I wasn't quite sure which documentation you were talking about.
I see the discussion now in the "NOTES" section of git-gc(1).

> I.e. we have some loose object "ABCD" not referrred to by anything for
> the last 2 weeks, as we're gc-ing a ref update comes in that makes it
> referenced again. We then delete "ABCD" (not used!) at the same time the
> ref update happens, and get corruption.
> 
> Whereas the quarantine might work around since the client will have sent
> ABCD with no reference pointing to it to the server in the temp pack,
> which we then rename in-place and then update the ref, so we don't care
> if "ABCD" goes away.

tl;dr I don't think quarantine impacts this, but if you really want gory
details, read on.

This is a problem with or without the quarantine. It's fundamentally a
race because we do not atomically say "is anybody using X? If not, we
can delete it" and some other process saying "I'd like to use X".

Pushes are actually better off than most operations, because we only
advertise what's reachable, and the client is expected to send
everything else. So with just a normal update-ref call, we could race
like this:

  1. ABCD is ancient.

  2. Process 1 (update-ref) wants to reference ABCD. It sees that we
     have it.

  3. Process 2 (gc/prune) sees that nobody references it. It deletes
     ABCD.

  4. Process 1 writes out the reference.

That doesn't happen with a push, because the server never would have
told the client that it has ABCD in the first place (so process 1 here
is the client). That is true with or without quarantine.

But pushes aren't foolproof either. You said "loose object ABCD not
referred t oby anything for the last 2 weeks". But that's not exactly
how it works. It's "object with an mtime of more than 2 weeks which is
not currently referenced". So imagine a sequence like:

  1. ABCD is ancient.

  2. refs/heads/foo points to ABCD.

  3. Server receive-pack advertises foo pointing to ABCD.

  4. Simultaneous process on the server deletes refs/heads/foo (or
     perhaps somebody force-pushes over it).

  5. Client prepares and sends pack without ABCD.

  6. Server receive-pack checks that yes, we still have ABCD (i.e., the
     usual connectivity check).

  7. Server gc drops ABCD, which is now unreachable (reflogs can help
     here, if you've enabled them; but we do delete reflogs when the
     branch is deleted).

  8. Server receive-pack writes corrupt repo mentioning ABCD.

That's a lot more steps, though they might not be as implausible as you
think (e.g., consider moving "refs/heads/foo" to "refs/heads/bar" in a
single push; that's actually a delete and an update, which is all you
need to race with a simultaneous gc).

I have no idea how often this happens in practice. My subjective
recollection is that most of the race corruptions I've seen were from
local operations on the server. E.g., we compute a tentative merge for
somebody's pull request which shares objects with an older tentative
merge. They click the "merge" button and we reference that commit, which
is recent, but unbeknownst to us, while we were creating our new
tentative merge, a "gc" was deleting the old one.

We're sometimes saved by the "transitive freshness" rules in d3038d22f9
(prune: keep objects reachable from recent objects, 2014-10-15).  But
they're far from perfect:

 - some operations (like the push rename example) aren't writing new
   objects, so the ref write _is_ the moment that gc would find out
   something is reachable

 - the "is it reachable?" and "no, then delete it" steps aren't atomic.
   Unless you want a whole-repo stop-the-world lock, somebody can
   reference the object in between. And since it may take many seconds
   to compute reachability, stop-the-world is not great.

I think there are probably ways to make it better. Perhaps some kind of
lockless delete-but-be-able-to-rollback scheme (but keep in mind this
has to be implemented no top of POSIX filesystem semantics). Or even
just a "compute reachability, mark for deletion, and then hold a
stop-the-world lock briefly to double-check that our reachability is
still up to date".

At least those seem plausible to me. I've never worked out the details,
and our solution was to just stop deleting objects during routine
maintenance (using "repack -adk"). We do still occasionally prune
manually (e.g., when somebody writes to support to remove a confidential
mistake).

Anyway, that was more than you probably wanted to know. The short of it
is that I don't think quarantines help (they may even make things worse
by slightly increasing the length of the race window, though in practice
I doubt it).

> Unless that interacts racily with the receive.unpackLimit, but then I
> have no idea that section is trying to say...

No, I don't think unpackLimit really affects it much either way.

> Also, surely the part where "NOTES" says something to the effect of "you
> are subject to races unless gc.auto=0" is wrong. To the extent that
> there's races it won't matter that you invoke "git gc" or "git gc
> --auto", it's the concurrency that matters. So if there's still races we
> should be saying the repo needs to be locked for writes for the duration
> of the "gc".

Correct. It's the very act of pruning that is problematic. I think the
point is that if you are manually running "git gc", you'd presumably do
it at a time when the repository is not otherwise active.

-Peff

  reply	other threads:[~2019-03-19  0:18 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-18 16:14 [PATCH 0/4] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-03-18 16:14 ` [PATCH 1/4] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:27   ` Jeff King
2019-03-18 22:18     ` Ævar Arnfjörð Bjarmason
2019-03-18 16:15 ` [PATCH 2/4] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:31   ` Jeff King
2019-03-21 22:11     ` Andreas Heiduk
2019-03-19  2:08   ` Duy Nguyen
2019-03-18 16:15 ` [PATCH 3/4] gc docs: de-duplicate "OPTIONS" and "CONFIGURATION" Ævar Arnfjörð Bjarmason
2019-03-18 21:49   ` Jeff King
2019-03-18 22:48     ` Ævar Arnfjörð Bjarmason
2019-03-18 23:42       ` Jeff King
2019-03-18 16:15 ` [PATCH 4/4] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-18 20:28   ` Jonathan Nieder
2019-03-18 21:22     ` Jeff King
2019-03-18 22:13       ` Ævar Arnfjörð Bjarmason
2019-03-18 23:53         ` Jeff King
2019-03-19  6:54   ` Johannes Sixt
2019-03-19  9:28     ` Ævar Arnfjörð Bjarmason
2019-03-18 21:51 ` [PATCH 0/4] gc docs: modernize and fix the documentation Jeff King
2019-03-18 22:45   ` Ævar Arnfjörð Bjarmason
2019-03-19  0:18     ` Jeff King [this message]
2019-05-06  9:44       ` Ævar Arnfjörð Bjarmason
2019-05-07  7:51         ` Jeff King
2019-05-09 23:20           ` Ævar Arnfjörð Bjarmason
2019-07-31  4:26             ` Jeff King
2019-07-31 10:12               ` Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 00/10] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 00/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 01/11] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-30 18:04     ` Todd Zullinger
2019-04-07 19:52     ` [PATCH v4 00/11] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 01/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 01/10] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  6:01   ` Junio C Hamano
2019-03-21 20:50 ` [PATCH v2 02/10] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 03/10] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 04/10] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 05/10] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 06/10] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 07/10] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 08/10] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 09/10] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 10/10] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190319001829.GL29661@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=matt@mattmccutchen.net \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).