git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [RFC/PATCH] gc: run more pre-detach operations under lock
Date: Thu, 20 Jun 2019 17:26:34 +0700	[thread overview]
Message-ID: <CACsJy8AjXXOpcKrSV4z6kEM=eyFDWSyf==tJZzvDyEN591XdGw@mail.gmail.com> (raw)
In-Reply-To: <87imt18a2r.fsf@evledraar.gmail.com>

On Thu, Jun 20, 2019 at 5:49 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Wed, Jun 19 2019, Jeff King wrote:
>
> > On Wed, Jun 19, 2019 at 08:01:55PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >
> >> > You could sort of avoid the problem here too with
> >> >
> >> > parallel 'git fetch --no-auto-gc {}' ::: $(git remote)
> >> > git gc --auto
> >> >
> >> > It's definitely simpler, but of course we have to manually add
> >> > --no-auto-gc in everywhere we need, so not quite as elegant.
> >> >
> >> > Actually you could already do that with 'git -c gc.auto=false fetch', I guess.
> >>
> >> The point of the 'parallel' example is to show disconnected git
> >> commands, think trying to run 'git' in a terminal while your editor
> >> asynchronously runs a polling 'fetch', or a server with multiple
> >> concurrent clients running 'gc --auto'.
> >>
> >> That's the question my RFC patch raises. As far as I can tell the
> >> approach in your patch is only needed because our locking for gc is
> >> buggy, rather than introduce the caveat that an fetch(N) operation won't
> >> do "gc" until it's finished (we may have hundreds, thousands of remotes,
> >> I use that for some more obscure use-cases) shouldn't we just fix the
> >> locking?
> >
> > I think there may be room for both approaches. Yours fixes the repeated
> > message in the more general case, but Duy's suggestion is the most
> > efficient thing.
> >
> > I agree that the "thousands of remotes" case means we might want to gc
> > in the interim. But we probably ought to do that deterministically
> > rather than hoping that the pattern of lock contention makes sense.
>
> We do it deterministically, when gc.auto thresholds et al are exceeded
> we kick one off without waiting for other stuff, if we can get the lock.
>
> I don't think this desire to just wait a bit until all the fetches are
> complete makes sense as a special-case.
>
> If, as you noted in <20190619190845.GD28145@sigill.intra.peff.net>, the
> desire is to reduce GC CPU use then you're better off just tweaking the
> limits upwards. Then you get that with everything, like when you run
> "commit" in a for-loop, not just this one special case of "fetch".
>
> We have existing potentially long-running operations like "fetch",
> "rebase" and "git svn fetch" that run "gc --auto" for their incremental
> steps, and that's a feature.

gc --auto is added at arbitrary points to help garbage collection. I
don't think it's ever intended to "do gc at this and that exact
moment", just "hey this command has taken a lot of time already (i.e.
no instant response needed) and it may have added a bit more garbage,
let's just check real quick".

> It keeps "gc --auto" dumb enough to avoid a pathological case where
> we'll have a ballooning objects dir because we figure we can run
> something "at the end", when "the end" could be hours away, and we're
> adding a new pack or hundreds of loose objects every second.

Are we optimizing for a rare (large scale) case? Such setup requires
tuning regardless to me.

> So I don't think Duy's patch is a good way to go.

This reminds me of being perfect is the enemy of the good. A normal
user has a couple remotes at most, finishing fast (enough) and in such
case it's a good idea to wait until everything is in before running
gc.

Of course making git-gc more robust wrt. parallel access is great, but
it's hard work. Dealing with locks is always tricky, especially when
new locks can come up any time.

Having said that, I don't mind if my patch gets dropped. It was just a
"hey that multiple gc output looks strange, hah the fix is quite
simple" moment for me.
-- 
Duy

  parent reply	other threads:[~2019-06-20 10:27 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-19  9:46 [PATCH] fetch: only run 'gc' once when fetching multiple remotes Nguyễn Thái Ngọc Duy
2019-06-19 10:26 ` [RFC/PATCH] gc: run more pre-detach operations under lock Ævar Arnfjörð Bjarmason
2019-06-19 12:51   ` Duy Nguyen
2019-06-19 18:01     ` Ævar Arnfjörð Bjarmason
2019-06-19 19:10       ` Jeff King
2019-06-19 22:49         ` Ævar Arnfjörð Bjarmason
2019-06-19 23:30           ` [PATCH 0/6] Change <non-empty?> GIT_TEST_* variables to <boolean> Ævar Arnfjörð Bjarmason
2019-06-20 18:13             ` Junio C Hamano
2019-06-20 21:00               ` Ævar Arnfjörð Bjarmason
2019-06-20 20:03             ` Junio C Hamano
2019-06-20 21:09             ` [PATCH v2 0/8] " Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 1/8] config tests: simplify include cycle test Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 2/8] env--helper: new undocumented builtin wrapping git_env_*() Ævar Arnfjörð Bjarmason
2019-06-21 17:07                 ` Junio C Hamano
2019-06-21 10:18               ` [PATCH v3 3/8] config.c: refactor die_bad_number() to not call gettext() early Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 4/8] t6040 test: stop using global "script" variable Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 5/8] tests: make GIT_TEST_GETTEXT_POISON a boolean Ævar Arnfjörð Bjarmason
2019-06-24 16:47                 ` Junio C Hamano
2019-06-21 10:18               ` [PATCH v3 6/8] tests README: re-flow a previously changed paragraph Ævar Arnfjörð Bjarmason
2019-06-21 10:18               ` [PATCH v3 7/8] tests: replace test_tristate with "git env--helper" Ævar Arnfjörð Bjarmason
2019-09-06 12:13                 ` [PATCH 1/2] t/lib-git-svn.sh: check GIT_TEST_SVN_HTTPD when running SVN HTTP tests SZEDER Gábor
2019-09-06 12:13                   ` [PATCH 2/2] ci: restore running httpd tests SZEDER Gábor
2019-09-06 17:03                     ` Junio C Hamano
2019-09-06 19:13                     ` Jeff King
2019-09-07 10:16                       ` SZEDER Gábor
2019-11-22 13:14                         ` [PATCH 0/2] tests: catch non-bool GIT_TEST_* values SZEDER Gábor
2019-11-22 13:14                           ` [PATCH 1/2] tests: add 'test_bool_env' to " SZEDER Gábor
2019-11-25 13:50                             ` Jeff King
2019-11-22 13:14                           ` [PATCH 2/2] t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool SZEDER Gábor
2019-11-25 13:53                             ` Jeff King
2019-06-21 10:18               ` [PATCH v3 8/8] tests: make GIT_TEST_FAIL_PREREQS a boolean Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 1/8] config tests: simplify include cycle test Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 2/8] env--helper: new undocumented builtin wrapping git_env_*() Ævar Arnfjörð Bjarmason
2019-06-20 22:11               ` Junio C Hamano
2019-06-20 22:21                 ` Junio C Hamano
2019-06-21  8:11                   ` Ævar Arnfjörð Bjarmason
2019-06-21 15:04                     ` Junio C Hamano
2019-06-20 21:09             ` [PATCH v2 3/8] config.c: refactor die_bad_number() to not call gettext() early Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 4/8] t6040 test: stop using global "script" variable Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 5/8] tests: make GIT_TEST_GETTEXT_POISON a boolean Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 6/8] tests README: re-flow a previously changed paragraph Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 7/8] tests: replace test_tristate with "git env--helper" Ævar Arnfjörð Bjarmason
2019-06-20 21:09             ` [PATCH v2 8/8] tests: make GIT_TEST_FAIL_PREREQS a boolean Ævar Arnfjörð Bjarmason
2019-06-19 23:30           ` [PATCH 1/6] env--helper: new undocumented builtin wrapping git_env_*() Ævar Arnfjörð Bjarmason
2019-06-20 19:25             ` Junio C Hamano
2019-06-19 23:30           ` [PATCH 2/6] t6040 test: stop using global "script" variable Ævar Arnfjörð Bjarmason
2019-06-20 19:54             ` Junio C Hamano
2019-06-19 23:30           ` [PATCH 3/6] tests: make GIT_TEST_GETTEXT_POISON a boolean Ævar Arnfjörð Bjarmason
2019-06-20 20:00             ` Junio C Hamano
2019-06-19 23:30           ` [PATCH 4/6] tests README: re-flow a previously changed paragraph Ævar Arnfjörð Bjarmason
2019-06-19 23:30           ` [PATCH 5/6] tests: replace test_tristate with "git env--helper" Ævar Arnfjörð Bjarmason
2019-06-19 23:30           ` [PATCH 6/6] tests: make GIT_TEST_FAIL_PREREQS a boolean Ævar Arnfjörð Bjarmason
2019-06-20 10:26           ` Duy Nguyen [this message]
2019-06-20 21:49             ` [RFC/PATCH] gc: run more pre-detach operations under lock Ævar Arnfjörð Bjarmason
2019-06-20 10:43           ` Jeff King
2019-06-20 18:55         ` Junio C Hamano
2019-06-19 19:08     ` Jeff King
2019-06-19 18:59 ` [PATCH] fetch: only run 'gc' once when fetching multiple remotes Jeff King
2019-06-20 10:11   ` Duy Nguyen
2019-06-20 10:21     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACsJy8AjXXOpcKrSV4z6kEM=eyFDWSyf==tJZzvDyEN591XdGw@mail.gmail.com' \
    --to=pclouds@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).