git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>,
	"Stefan Beller" <stefanbeller@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Matt McCutchen" <matt@mattmccutchen.net>
Subject: Re: [PATCH 0/4] gc docs: modernize and fix the documentation
Date: Tue, 7 May 2019 03:51:58 -0400	[thread overview]
Message-ID: <20190507075158.GG28060@sigill.intra.peff.net> (raw)
In-Reply-To: <878svjj4t5.fsf@evledraar.gmail.com>

On Mon, May 06, 2019 at 11:44:06AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Maybe there's some case I haven't thought of that makes this stupid, but
> I wonder if something like a "gc quarantine" might be a fix fo both of
> the the issues you noted above.
> 
> I.e. it seems to me that the main issue is that we conflate "mtime 2
> weeks old because it's unreferenced for 2 weeks" v.s. "mtime 2 weeks old
> because we haven't gotten around to a 'gc'".
> 
> So in such a "gc quarantine" mode when we discover an object/pack that's
> unreachable/purely made up of unreachable objects we'd move the relevant
> loose object/"loose" pack to such a quarantine, which would just be
> .git/unreferenced-objects/{??,pack}/ or whatever.
> 
> AFAICT both cases you mentioned above would be mitigated by this because
> we'd no longer conflate "haven't gc'd this yet and it's 2 weeks old"
> v.s. "hasn't been referenced in 2 weeks".

Michael Haggerty and I have (off-list) discussed variations on that, but
it opens up a lot of new issues.  Moving something into quarantine isn't
atomic. So you've still corrupted the repo, but now it's recoverable by
reaching into the quarantine. Who notices that the repo is corrupt, and
how? When do we expire objects from quarantine?

I think the heart of the issue is really the lack of atomicity in the
operations. You need some way to mark "I am using this now" in a way
that cannot race with "looks like nobody is using this, so I'll delete
it".

And ideally without traversing large bits of the graph on the writing
side, and without requiring any stop-the-world locks during pruning.

> I started looking at this initially because I was wondering if the
> --keep-unreachable mode you modified in e26a8c4721 ("repack: extend
> --keep-unreachable to loose objects", 2016-06-13) could be made to write
> out such "unreferenced" objects into their *own* pack, so we could
> delete them all at once as a batch, and wouldn't create the "ref
> explosions" mentioned in [1].
> 
> But of course without an accompanying quarantine described above doing
> that would just make this race condition worse.

I'm not sure it really makes it worse. The pack would have the same
mtime as the loose objects would.

-Peff

  reply	other threads:[~2019-05-07  7:52 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-18 16:14 [PATCH 0/4] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-03-18 16:14 ` [PATCH 1/4] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:27   ` Jeff King
2019-03-18 22:18     ` Ævar Arnfjörð Bjarmason
2019-03-18 16:15 ` [PATCH 2/4] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:31   ` Jeff King
2019-03-21 22:11     ` Andreas Heiduk
2019-03-19  2:08   ` Duy Nguyen
2019-03-18 16:15 ` [PATCH 3/4] gc docs: de-duplicate "OPTIONS" and "CONFIGURATION" Ævar Arnfjörð Bjarmason
2019-03-18 21:49   ` Jeff King
2019-03-18 22:48     ` Ævar Arnfjörð Bjarmason
2019-03-18 23:42       ` Jeff King
2019-03-18 16:15 ` [PATCH 4/4] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-18 20:28   ` Jonathan Nieder
2019-03-18 21:22     ` Jeff King
2019-03-18 22:13       ` Ævar Arnfjörð Bjarmason
2019-03-18 23:53         ` Jeff King
2019-03-19  6:54   ` Johannes Sixt
2019-03-19  9:28     ` Ævar Arnfjörð Bjarmason
2019-03-18 21:51 ` [PATCH 0/4] gc docs: modernize and fix the documentation Jeff King
2019-03-18 22:45   ` Ævar Arnfjörð Bjarmason
2019-03-19  0:18     ` Jeff King
2019-05-06  9:44       ` Ævar Arnfjörð Bjarmason
2019-05-07  7:51         ` Jeff King [this message]
2019-05-09 23:20           ` Ævar Arnfjörð Bjarmason
2019-07-31  4:26             ` Jeff King
2019-07-31 10:12               ` Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 00/10] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 00/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 01/11] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-30 18:04     ` Todd Zullinger
2019-04-07 19:52     ` [PATCH v4 00/11] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 01/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 01/10] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  6:01   ` Junio C Hamano
2019-03-21 20:50 ` [PATCH v2 02/10] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 03/10] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 04/10] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 05/10] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 06/10] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 07/10] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 08/10] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 09/10] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 10/10] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190507075158.GG28060@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=matt@mattmccutchen.net \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).