git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Elijah Newren <newren@gmail.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH] gc: do not warn about too many loose objects
Date: Mon, 16 Jul 2018 16:35:39 -0400
Message-ID: <20180716203539.GD25189@sigill.intra.peff.net>
In-Reply-To: <CABPp-BGN4NeH2QAP0WMyuOdr2GYq+hpt9584wdn2jHHinEAgDg@mail.gmail.com>

On Mon, Jul 16, 2018 at 01:21:40PM -0700, Elijah Newren wrote:

> > My understanding is that exploding the objects is intentional behavior,
> > to avoid a race where objects are newly referenced while they are being
> > pruned.
> >
> > I am not a fan of that behavior.  It's still racy.  But when we've
> > brought it up in the past, the consensus seems to have been that it's
> > better than nothing.  Documentation/technical/hash-function-transition.txt
> > section "Loose objects and unreachable objects" describes a way to
> > eliminate the race.
> 
> Ah, that's good to know and at least makes sense.  It seems somewhat
> odd, though; loose objects that are two weeks old are just as
> susceptible to being referenced anew by new commits, so the default of
> running 'git prune --expire=2.weeks.ago' as gc currently does would
> also be unsafe, wouldn't it?  Why is that any more or less unsafe than
> pruning objects only referenced by reflog entries that are more than
> 90 days old?

The 2-week safety isn't primarily about things which just became
unreferenced.  It's about things which are in the act of being
referenced.

Imagine a "git commit" racing with a "git prune". The commit has to
create an object, and then it will update a ref to point to it. But
between those two actions, prune may racily delete the object!
The mtime grace period is what makes that safe: prune leaves alone any
object whose mtime is newer than the expiration cutoff.
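
As a rough illustration of that grace period, here is a toy model in
Python. It is not Git's actual code: LooseObject, prunable, and the
in-memory fields are hypothetical names chosen for the sketch. The only
point is that an unreachable object becomes a candidate for deletion
only once its mtime is older than the cutoff.

```python
from dataclasses import dataclass

TWO_WEEKS = 14 * 24 * 60 * 60  # seconds; mirrors the 2-week default discussed above


@dataclass
class LooseObject:
    oid: str          # hypothetical object name
    mtime: float      # last-modified time, seconds since the epoch
    reachable: bool   # True if some ref or reflog entry leads to it


def prunable(obj: LooseObject, now: float, grace: float = TWO_WEEKS) -> bool:
    """Toy rule: drop only unreachable objects older than the grace period."""
    cutoff = now - grace
    return (not obj.reachable) and obj.mtime < cutoff


now = 1_000_000_000.0
# Object just written by "git commit"; the ref update has not happened yet.
fresh = LooseObject("new-commit", mtime=now - 1, reachable=False)
# Object abandoned more than two weeks ago.
stale = LooseObject("old-blob", mtime=now - TWO_WEEKS - 60, reachable=False)

print(prunable(fresh, now))  # False
print(prunable(stale, now))  # True
```

The freshly written commit survives even though nothing points to it
yet, which is the window the grace period is meant to cover.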

Using 2 weeks is sort of ridiculous for that. But it also helps with
manual recovery (e.g., imagine a blob added to the index but never
committed; 3 days later you may want to try to recover your old work).

And you're correct that a new git-commit may still reference an old
object (e.g., a blob that's 5 seconds shy of being 2 weeks old that
you're including in a new commit). That's why we retain non-fresh
objects that are referenced from fresh ones (so as long as you made the
new commit recently, it transitively confers freshness on the old blob),
and why we freshen mtimes when we elide a write for an existing object.
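
Those two rules can be sketched with another toy model. Again this is
an assumption-laden illustration, not Git's implementation: ToyStore,
write, and keep_set are hypothetical names, and the commit links
straight to the blob, whereas a real commit reaches a blob through a
tree.

```python
from typing import Dict, List, Set

GRACE = 14 * 24 * 60 * 60  # seconds


class ToyStore:
    """In-memory stand-in for a loose object store (hypothetical)."""

    def __init__(self) -> None:
        self.mtime: Dict[str, float] = {}
        self.links: Dict[str, List[str]] = {}  # oid -> oids it references

    def write(self, oid: str, refs: List[str], now: float) -> None:
        if oid in self.mtime:
            # Write elided: the object already exists, so only freshen it.
            self.mtime[oid] = now
            return
        self.mtime[oid] = now
        self.links[oid] = list(refs)

    def keep_set(self, now: float) -> Set[str]:
        """Fresh objects plus everything reachable from them."""
        cutoff = now - GRACE
        todo = [o for o, t in self.mtime.items() if t >= cutoff]
        keep: Set[str] = set()
        while todo:
            oid = todo.pop()
            if oid in keep:
                continue
            keep.add(oid)
            todo.extend(r for r in self.links.get(oid, []) if r in self.mtime)
        return keep


store = ToyStore()
store.write("old-blob", [], 0.0)
store.write("other-blob", [], 0.0)

now = GRACE + 100.0
store.write("new-commit", ["old-blob"], now)  # fresh commit, old blob
print(sorted(store.keep_set(now)))  # ['new-commit', 'old-blob']

store.write("other-blob", [], now)  # elided write freshens the mtime
print(sorted(store.keep_set(now)))  # ['new-commit', 'old-blob', 'other-blob']
```

In the first call the old blob is kept only because a fresh object
references it; in the second, the unreferenced blob is kept because
re-adding it bumped its mtime.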

That's _still_ not race-proof, because none of these operations is
atomic. git-prune can decide the blob is unfresh at the exact moment
you're creating the commit object.
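
To make that interleaving concrete, here is a deterministic toy
simulation. It assumes, purely for illustration, that prune runs as a
separate "check" step followed by a "delete" step; simulate and the
step names are hypothetical, and none of this mirrors Git's real code
paths.

```python
GRACE = 14 * 24 * 60 * 60  # seconds


def simulate(order):
    """Run the named steps in `order`; return True if the repo ends up intact."""
    now = GRACE + 100.0
    mtime = {"old-blob": 0.0}   # the blob is older than the grace period
    links = {}                  # commit oid -> blob oid it references
    victims = []                # what prune has decided to delete

    def prune_check():
        # Decide which objects are neither fresh nor referenced by a fresh one.
        fresh = {o for o, t in mtime.items() if t >= now - GRACE}
        kept = fresh | {links[c] for c in fresh if c in links}
        victims.extend(o for o in mtime if o not in kept)

    def prune_delete():
        # Act on the earlier decision without re-checking it.
        for oid in victims:
            mtime.pop(oid, None)

    def commit():
        # Create a fresh commit that references the old blob.
        mtime["new-commit"] = now
        links["new-commit"] = "old-blob"

    steps = {"check": prune_check, "delete": prune_delete, "commit": commit}
    for name in order:
        steps[name]()

    # Intact means every referenced blob still exists.
    return all(blob in mtime for blob in links.values())


print(simulate(["commit", "check", "delete"]))  # True
print(simulate(["check", "commit", "delete"]))  # False
```

When the commit lands between prune's decision and its deletion, the
new commit ends up pointing at a blob that no longer exists.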

-Peff

