From: "Ævar Arnfjörð Bjarmason" <firstname.lastname@example.org>
To: Git Mailing List <email@example.com>
Cc: "Junio C Hamano" <firstname.lastname@example.org>,
    "Nguyễn Thái Ngọc Duy" <email@example.com>,
    "Christian Couder" <firstname.lastname@example.org>
Subject: Re: git gc --auto yelling at users where a repo legitimately has >6700 loose objects
Date: Thu, 08 Feb 2018 17:23:47 +0100
Message-ID: <email@example.com>
In-Reply-To: <firstname.lastname@example.org>

On Thu, Jan 11 2018, Ævar Arnfjörð Bjarmason jotted:

> I recently disabled gc.auto=0 and my nightly aggressive repack script on
> our big monorepo across our infra, relying instead on git gc --auto in
> the background to just do its thing.
>
> I didn't want users to wait for git-gc, and I'd written this nightly
> cronjob before git-gc learned to detach to the background.
>
> But now I have git-gc on some servers yelling at users on every pull
> command:
>
>     warning: There are too many unreachable loose objects; run 'git prune' to remove them.
>
> The reason is that I have all the values at git's default settings, and
> there legitimately are >~6700 loose objects that were created in the
> last 2 weeks.
>
> For those rusty on git-gc's defaults, this is what it looks like in this
> scenario:
>
>  1. User runs "git pull"
>  2. git gc --auto is called, there are >6700 loose objects
>  3. it forks into the background, tries to prune and repack, objects
>     older than gc.pruneExpire (2.weeks.ago) are pruned.
>  4. At the end of all this, we check *again* if we have >6700 objects,
>     if we do we print "run 'git prune'" to .git/gc.log, and will just
>     emit that error for the next day before trying again, at which point
>     we unlink the gc.log and retry, see gc.logExpiry.
>
> Right now I've just worked around this by setting gc.pruneExpire to a
> lower value (4.days.ago). But there's a larger issue to be addressed
> here, and I'm not sure how.
>
> When the warning was added in [1] it didn't know to detach to the
> background yet, that came in [2], shortly after came gc.log in [3].
>
> We could add another gc.auto-like limit, which could be set at some
> higher value than gc.auto. "Hey if I have more than 6700 loose objects,
> prune the <2wks old ones, but if at the end there's still >6700 I don't
> want to hear about it unless there's >6700*N".
>
> I thought I'd just add that, but the details of how to pass that message
> around get nasty. With that solution we *also* don't want git gc to
> start churning in the background once we reach >6700 objects, so we need
> something like gc.logExpiry which defers the gc until the next day. We
> might need to create .git/gc-waitabit.marker, ew.
>
> More generally, these hard limits seem contrary to what the user cares
> about. E.g. I suspect that most of these loose objects come from
> branches since deleted in upstream, whose objects could have a different
> retention policy.
>
> Or we could say "I want 2 weeks of objects, but if that runs against the
> 6700 limit just keep the latest 6700/2".
>
> 1. a087cc9819 ("git-gc --auto: protect ourselves from accumulated
>    cruft", 2007-09-17)
> 2. 9f673f9477 ("gc: config option for running --auto in background",
>    2014-02-08)
> 3. 329e6e8794 ("gc: save log from daemonized gc --auto and print it next
>    time", 2015-09-19)

My just-sent "How to produce a loose ref+size explosion via pruning +
git-gc", <email@example.com>
(https://firstname.lastname@example.org/), shows an easy way to
reproduce this. After the steps outlined there git-gc --auto will end up
in a state where it'll start telling the user off for having too many
loose objects.
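
For anyone who wants to poke at the four quoted steps without a real repository, here is a rough toy model in Python. It is only a sketch under stated assumptions, not git's actual code: the function name `gc_auto_outcome` and its return strings are made up for illustration, it treats every loose object as unreachable, and it counts objects exactly, whereas real git, as I understand it, only estimates the count. The 6700 and 2-week defaults are the ones quoted above.

```python
from datetime import datetime, timedelta

GC_AUTO = 6700                      # default gc.auto, as quoted above
PRUNE_EXPIRE = timedelta(weeks=2)   # default gc.pruneExpire (2.weeks.ago)


def gc_auto_outcome(loose_mtimes, now, gc_auto=GC_AUTO,
                    prune_expire=PRUNE_EXPIRE):
    """Toy model of steps 2-4 (hypothetical helper, not a git API).

    loose_mtimes holds the modification time of each loose object;
    all of them are assumed to be unreachable.
    """
    # Step 2: gc --auto only does work above the gc.auto threshold.
    if len(loose_mtimes) <= gc_auto:
        return "nothing to do"

    # Step 3: objects older than the expiry cutoff are pruned.
    cutoff = now - prune_expire
    survivors = [mtime for mtime in loose_mtimes if mtime > cutoff]

    # Step 4: check *again*; if still over the limit, the warning
    # would be written to .git/gc.log.
    if len(survivors) > gc_auto:
        return "warning written to gc.log"
    return "cleaned up"


# 7000 loose objects, all created within the last 13 days.
now = datetime(2018, 2, 8)
mtimes = [now - timedelta(hours=i % 312) for i in range(7000)]

print(gc_auto_outcome(mtimes, now))
# -> warning written to gc.log
print(gc_auto_outcome(mtimes, now, prune_expire=timedelta(days=4)))
# -> cleaned up
```

The second call mirrors the gc.pruneExpire=4.days.ago workaround: with a shorter expiry enough objects fall past the cutoff that the second check passes, which is why the warning goes away without the underlying limit having changed.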