git@vger.kernel.org mailing list mirror (one of many)
* Auto-gc in the background can take a long time to be put in the background
@ 2019-03-25 23:22 Mike Hommey
  2019-03-25 23:30 ` Jeff King
  2019-03-26 13:43 ` Johannes Schindelin
  0 siblings, 2 replies; 5+ messages in thread
From: Mike Hommey @ 2019-03-25 23:22 UTC
  To: git

Hi,

Recently, I've noticed that whenever the auto-gc message shows up about
being spawned in the background, it still takes a while for git to
return to the shell.

I've finally looked at what it was stuck on, and it's 
`git reflog expire --all` taking more than 30s. I guess the question is
whether there's a reason this shouldn't run in the background? Another
is whether there's something that makes this slower than it should be.

Mike


* Re: Auto-gc in the background can take a long time to be put in the background
  2019-03-25 23:22 Auto-gc in the background can take a long time to be put in the background Mike Hommey
@ 2019-03-25 23:30 ` Jeff King
  2019-03-26  6:50   ` Ævar Arnfjörð Bjarmason
  2019-03-26 13:43 ` Johannes Schindelin
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff King @ 2019-03-25 23:30 UTC
  To: Mike Hommey; +Cc: git

On Tue, Mar 26, 2019 at 08:22:23AM +0900, Mike Hommey wrote:

> Recently, I've noticed that whenever the auto-gc message shows up about
> being spawned in the background, it still takes a while for git to
> return to the shell.
> 
> I've finally looked at what it was stuck on, and it's 
> `git reflog expire --all` taking more than 30s. I guess the question is
> whether there's a reason this shouldn't run in the background? Another
> is whether there's something that makes this slower than it should be.

The reason is that it takes locks which can interfere with other
operations; see 62aad1849f (gc --auto: do not lock refs in the
background, 2014-05-25).

Unfortunately making it faster is hard. To handle expiring unreachable
items, it has to know what's reachable. Which implies walking the commit
graph. I don't recall offhand whether setting unreachable-expiration to
"never" would skip that part. But if not, that should be low-hanging
fruit.

(I also wonder whether there is really much value in keeping
unreachable things for a shorter period of time, and the default should
simply be to just prune everything after 90 days, unreachable or not).

-Peff


* Re: Auto-gc in the background can take a long time to be put in the background
  2019-03-25 23:30 ` Jeff King
@ 2019-03-26  6:50   ` Ævar Arnfjörð Bjarmason
  2019-03-26 13:25     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-26  6:50 UTC
  To: Jeff King; +Cc: Mike Hommey, git


On Tue, Mar 26 2019, Jeff King wrote:

> On Tue, Mar 26, 2019 at 08:22:23AM +0900, Mike Hommey wrote:
>
>> Recently, I've noticed that whenever the auto-gc message shows up about
>> being spawned in the background, it still takes a while for git to
>> return to the shell.
>>
>> I've finally looked at what it was stuck on, and it's
>> `git reflog expire --all` taking more than 30s. I guess the question is
>> whether there's a reason this shouldn't run in the background? Another
>> is whether there's something that makes this slower than it should be.
>
> The reason is that it takes locks which can interfere with other
> operations; see 62aad1849f (gc --auto: do not lock refs in the
> background, 2014-05-25).

Even assuming we can never improve this I think we should make this part
configurable. It's assuming that the contention is otherwise going to be
with yourself in the same terminal, but it doesn't help if the primary
source of contention is going to be e.g. other concurrent processes in
the same repo.

> Unfortunately making it faster is hard. To handle expiring unreachable
> items, it has to know what's reachable. Which implies walking the commit
> graph. I don't recall offhand whether setting unreachable-expiration to
> "never" would skip that part. But if not, that should be low-hanging
> fruit.

I have a recent patch that does this that I need to re-roll:
https://public-inbox.org/git/20190315155959.12390-8-avarab@gmail.com/

> (I also wonder whether there is really much value in keeping
> unreachable things for a shorter period of time, and the default should
> simply be to just prune everything after 90 days, unreachable or not).

Do you mean unify gc.reflogExpire & gc.pruneExpire (and other
variables)? Would that be cheaper somehow?

Or just blindly remove loose objects that are older than some mtime,
assuming that if anyone cared they'd be in a pack already?

The latter of those would be very useful, but if not carefully handled
could lead to corruption.


* Re: Auto-gc in the background can take a long time to be put in the background
  2019-03-26  6:50   ` Ævar Arnfjörð Bjarmason
@ 2019-03-26 13:25     ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2019-03-26 13:25 UTC
  To: Ævar Arnfjörð Bjarmason; +Cc: Mike Hommey, git

On Tue, Mar 26, 2019 at 07:50:28AM +0100, Ævar Arnfjörð Bjarmason wrote:

> > Unfortunately making it faster is hard. To handle expiring unreachable
> > items, it has to know what's reachable. Which implies walking the commit
> > graph. I don't recall offhand whether setting unreachable-expiration to
> > "never" would skip that part. But if not, that should be low-hanging
> > fruit.
> 
> I have a recent patch that does this that I need to re-roll:
> https://public-inbox.org/git/20190315155959.12390-8-avarab@gmail.com/

I think your patch skips calling git-reflog when both are set to
"never". What I mean is that if regular expiration is set to 90 days,
and unreachable expiration is set to 90 days (or greater), then there is
no need for us to bother walking any history. An entry is either expired
based on time or it is not, regardless of reachability.
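
To illustrate the condition (a rough Python sketch, not git's actual C
code; the function name and the use of plain day counts are made up for
illustration, with infinity standing in for "never"):

```python
import math

NEVER = math.inf  # stands in for the config value "never"


def needs_reachability_walk(expire_days, expire_unreachable_days):
    """Return True only when unreachable entries expire sooner than
    reachable ones, i.e. when reachability can change the outcome."""
    return expire_unreachable_days < expire_days


# 90/30 are git's default values for gc.reflogExpire and
# gc.reflogExpireUnreachable: here the walk is still needed.
print(needs_reachability_walk(90, 30))
# Equal (or longer) unreachable expiration: time alone decides.
print(needs_reachability_walk(90, 90))
print(needs_reachability_walk(90, NEVER))
```

Only the first call reports that a walk is needed; in the other two cases
every entry is judged on its timestamp alone.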

> > (I also wonder whether there is really much value in keeping
> > unreachable things for a shorter period of time, and the default should
> > simply be to just prune everything after 90 days, unreachable or not).
> 
> Do you mean unify gc.reflogExpire & gc.pruneExpire (and other
> variables)? Would that be cheaper somehow?

Yes, this. If we're just expiring based on the timestamp in the reflog,
we should be able to accomplish this with just a single pass over the
reflog data, and never opening any objects at all.
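
A single pass of that kind could look roughly like this (a Python sketch
under assumptions, not what git-reflog does today; the function name and
sample entries are hypothetical, and it assumes the usual reflog line
layout of "<old> <new> <name> <email> <timestamp> <tz>" followed by a tab
and the message):

```python
def expire_by_timestamp(reflog_lines, cutoff):
    """Keep only entries whose timestamp is at or after `cutoff`
    (Unix seconds). No object is ever opened."""
    kept = []
    for line in reflog_lines:
        # Everything before the first tab is the header; the rest is
        # the free-form message.
        header, _, _message = line.partition("\t")
        # The header ends with "<timestamp> <timezone>".
        _, timestamp, _tz = header.rsplit(" ", 2)
        if int(timestamp) >= cutoff:
            kept.append(line)
    return kept


# Two made-up entries, one older and one newer than the cutoff.
sample = [
    "0" * 40 + " " + "1" * 40
    + " A U Thor <author@example.com> 1000 +0000\tcommit (initial): first",
    "1" * 40 + " " + "2" * 40
    + " A U Thor <author@example.com> 2000 +0000\tcommit: second",
]

print(len(expire_by_timestamp(sample, 1500)))  # prints 1
```

The cost is linear in the size of the reflog and independent of how much
history the repository has.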

> Or just blindly remove loose objects that are older than some mtime,
> assuming that if anyone cared they'd be in a pack already?

No, definitely not. We're expiring reflogs here, not objects.

-Peff


* Re: Auto-gc in the background can take a long time to be put in the background
  2019-03-25 23:22 Auto-gc in the background can take a long time to be put in the background Mike Hommey
  2019-03-25 23:30 ` Jeff King
@ 2019-03-26 13:43 ` Johannes Schindelin
  1 sibling, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2019-03-26 13:43 UTC
  To: Mike Hommey; +Cc: git

Hi Mike,

On Tue, 26 Mar 2019, Mike Hommey wrote:

> Recently, I've noticed that whenever the auto-gc message shows up about
> being spawned in the background, it still takes a while for git to
> return to the shell.
>
> I've finally looked at what it was stuck on, and it's
> `git reflog expire --all` taking more than 30s. I guess the question is
> whether there's a reason this shouldn't run in the background? Another
> is whether there's something that makes this slower than it should be.

Thanks for tracking this down. I hit this problem yesterday and was too
busy with other things to dig into it.

Thank you!
Dscho

