git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* BUG: Race condition due to reflog expiry in "gc"
@ 2019-03-12 23:28 Ævar Arnfjörð Bjarmason
  2019-03-13  1:43 ` Junio C Hamano
  0 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-12 23:28 UTC (permalink / raw)
  To: Git List
  Cc: Nguyễn Thái Ngọc Duy, Michael Haggerty,
	Stefan Beller, Jeff King, Junio C Hamano, Jonathan Nieder

I'm still working on fixing a race condition I encountered in "gc"
recently, but am not 100% sure of the fix. So I thought I'd send a
braindump of what I have so far in case it jolts any memories.

The problem is that when we "gc" we run "reflog expire --all". This
iterates over the reflogs, and takes a *.lock for each reference.

It'll fail intermittendly in two ways:

 1. If something is concurrently committing to the repo it'll fail
    because we for a tiny amount of time hold a HEAD.lock file, so HEAD
    can't be updated.

 2. On a repository that's just being "git fetch"'d by some concurrent
    process the "gc" will fail, because the ref's SHA1 has changed,
    which we inspect as we aquire the lock.

To reproduce this have a repo getting a bunch of commits and "fetch"
from it in a loop[A]. Then try to "gc" in it (or "reflog expire
--all"). You'll get case #2 above (#1 can also be manually reproduce):

    $ git gc; echo $?
    error: cannot lock ref 'refs/remotes/origin/HEAD': ref 'refs/remotes/origin/HEAD' is at d13c708ae0435c88c8f7580442b70df9a40224d0 but expected 7fb5b7dd9988d040a9d29c0456874472bea2b439
    error: cannot lock ref 'refs/remotes/origin/master': ref 'refs/remotes/origin/master' is at d13c708ae0435c88c8f7580442b70df9a40224d0 but expected 7fb5b7dd9988d040a9d29c0456874472bea2b439
    fatal: failed to run reflog
    128

This behavior is due to a combination of:

 - 62aad1849f ("gc --auto: do not lock refs in the background", 2014-05-25)
 - f3b661f766 ("expire_reflog(): use a lock_file for rewriting the
   reflog file", 2014-12-12)

Discussion around those changes:

    https://public-inbox.org/git/1400978309-25235-1-git-send-email-pclouds@gmail.com/
    https://public-inbox.org/git/CAGZ79kaPzMPcxMqsTeW5LjYNc7LbpjHE5eBPHWKvUkBU3NvGJw@mail.gmail.com/

I haven't yet found a reason for why we'd intrinsically need this lock,
and indeed this doesn't produce corruption AFAICT:

    diff --git a/refs/files-backend.c b/refs/files-backend.c
    index ef053f716c..ffb9fbbf3a 100644
    --- a/refs/files-backend.c
    +++ b/refs/files-backend.c
    @@ -3043,7 +3043,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
     	if (!lock) {
     		error("cannot lock ref '%s': %s", refname, err.buf);
     		strbuf_release(&err);
    -		return -1;
    +		return 0;
     	}
     	if (!refs_reflog_exists(ref_store, refname)) {
     		unlock_ref(lock);

But that just "solves" #2, not #1, and also no-oping the OID check in
verify_lock() still has all tests pass, so we're likely missing the
relevant race condition tests.

A.

    (
        cd /tmp &&
        rm -rf git &&
        git init git &&
        cd git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >out &&
            git add out &&
            git commit -m"out"
        done
    )

    (
        cd /tmp &&
        rm -rf git-clone &&
        git clone file:///tmp/git git-clone &&
        cd git-clone &&
        while true
        do
            git reset --hard &&
            git pull
        done
    )

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: BUG: Race condition due to reflog expiry in "gc"
  2019-03-12 23:28 BUG: Race condition due to reflog expiry in "gc" Ævar Arnfjörð Bjarmason
@ 2019-03-13  1:43 ` Junio C Hamano
  2019-03-13 10:28   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Junio C Hamano @ 2019-03-13  1:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jeff King, Jonathan Nieder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I'm still working on fixing a race condition I encountered in "gc"
> recently, but am not 100% sure of the fix. So I thought I'd send a
> braindump of what I have so far in case it jolts any memories.
>
> The problem is that when we "gc" we run "reflog expire --all". This
> iterates over the reflogs, and takes a *.lock for each reference.
>
> It'll fail intermittendly in two ways:
>
>  1. If something is concurrently committing to the repo it'll fail
>     because we for a tiny amount of time hold a HEAD.lock file, so HEAD
>     can't be updated.
>
>  2. On a repository that's just being "git fetch"'d by some concurrent
>     process the "gc" will fail, because the ref's SHA1 has changed,
>     which we inspect as we aquire the lock.

Both sounds very much expected and expectable outcome.  I am not
sure how they need to be called bugs.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: BUG: Race condition due to reflog expiry in "gc"
  2019-03-13  1:43 ` Junio C Hamano
@ 2019-03-13 10:28   ` Ævar Arnfjörð Bjarmason
  2019-03-13 16:02     ` Jeff King
  0 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 10:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jeff King, Jonathan Nieder


On Wed, Mar 13 2019, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> I'm still working on fixing a race condition I encountered in "gc"
>> recently, but am not 100% sure of the fix. So I thought I'd send a
>> braindump of what I have so far in case it jolts any memories.
>>
>> The problem is that when we "gc" we run "reflog expire --all". This
>> iterates over the reflogs, and takes a *.lock for each reference.
>>
>> It'll fail intermittendly in two ways:
>>
>>  1. If something is concurrently committing to the repo it'll fail
>>     because we for a tiny amount of time hold a HEAD.lock file, so HEAD
>>     can't be updated.
>>
>>  2. On a repository that's just being "git fetch"'d by some concurrent
>>     process the "gc" will fail, because the ref's SHA1 has changed,
>>     which we inspect as we aquire the lock.
>
> Both sounds very much expected and expectable outcome.  I am not
> sure how they need to be called bugs.

Let's leave aside that I started the subject with "BUG:" and let me
rephrase.

I was under the impression that git-gc was supposed to support operating
on a repository that's concurrently being modified, as long as you don't
set the likes of gc.pruneExpire too aggressively.

Running a "gc" in a loop without "git reflog expire --all" and when
watching the repository being GC'd with:

    fswatch -l 0.1 -t -r . 2>&1 | grep lock

We only create .git/MERGE_RR.lock, .git/gc.pid.lock and
git/packed-refs.lock. These are all things that would only cause another
concurrent GC to fail, not a normal git command.

So the only reason a concurrent commit (case #1) fails is because of the
refs being locked during the reflog iteration, and similarly "gc" itself
will fail due to a concurrently updating ref (case #2).

It seems that first of all we need this, I'll submit that as a separate
patch sometime soon:

    diff --git a/builtin/gc.c b/builtin/gc.c
    index 020f725acc..ae488646e1 100644
    --- a/builtin/gc.c
    +++ b/builtin/gc.c
    @@ -127,6 +127,12 @@ static void gc_config(void)
     			pack_refs = git_config_bool("gc.packrefs", value);
     	}

    +	if (!git_config_get_value("gc.reflogexpire", &value) && value &&
    +	    !strcmp(value, "never") &&
    +	    !git_config_get_value("gc.reflogexpireunreachable", &value) && value &&
    +	    !strcmp(value, "never"))
    +		prune_reflogs = 0;
    +
     	git_config_get_int("gc.aggressivewindow", &aggressive_window);
     	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
     	git_config_get_int("gc.auto", &gc_auto_threshold);

I.e. now even if your gc.* config says you don't want the reflogs
touched, we still pointlessly iterate over all of them. The case I'm
running into (a variant of #2) is one solved by that patch, i.e. I'm
fine "gc" just having the reflogs kept forever as a workaround in this
case.

Something like that should have been added back in 62aad1849f ("gc
--auto: do not lock refs in the background", 2014-05-25), i.e. now the
"prune_reflogs" variable is never used, it's just cargo-culted from a
copy/pasting of the "pack_refs" code.

In other "gc" phases in "pack-objects" and "prune" we also look at the
reflogs. This obviously bad patch ignores them entirely:

    diff --git a/builtin/prune.c b/builtin/prune.c
    index 97613eccb5..bccee7813e 100644
    --- a/builtin/prune.c
    +++ b/builtin/prune.c
    @@ -41,7 +41,7 @@ static void perform_reachability_traversal(struct rev_info *revs)

     	if (show_progress)
     		progress = start_delayed_progress(_("Checking connectivity"), 0);
    -	mark_reachable_objects(revs, 1, expire, progress);
    +	mark_reachable_objects(revs, 0, expire, progress);
     	stop_progress(&progress);
     	initialized = 1;
     }
    diff --git a/builtin/repack.c b/builtin/repack.c
    index 67f8978043..618ffbfe0a 100644
    --- a/builtin/repack.c
    +++ b/builtin/repack.c
    @@ -364,7 +364,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
     				 keep_pack_list.items[i].string);
     	argv_array_push(&cmd.args, "--non-empty");
     	argv_array_push(&cmd.args, "--all");
    -	argv_array_push(&cmd.args, "--reflog");
     	argv_array_push(&cmd.args, "--indexed-objects");
     	if (repository_format_partial_clone)
     		argv_array_push(&cmd.args, "--exclude-promisor-objects");

I'm just including that as illustration that add_reflogs_to_pending() in
revision.c during "gc" already iterates over the reflogs without locking
anything, but of course it's just reading them.

So one thing that would mitigate things a lot is if
files_reflog_expire() and its call to expire_reflog_ent() via
refs_for_each_reflog_ent() would lazily aquire the lock on the ref.

Digging a bit further that's actually what we're doing now since
4ff0f01cb7 ("refs: retry acquiring reference locks for 100ms",
2017-08-21).

But this runs into the logic we've had for a long time, or since your
bda3a31cc7 ("reflog-expire: Avoid creating new files in a directory
inside readdir(3) loop", 2008-01-25) where we first loop over all the
refs in the process of finding the reflogs, and then will try to lock
those refs at those expected SHA-1s. If they've changed in the meantime
we error out don't clean up the lockfile.

So just this fixes that:

    diff --git a/refs/files-backend.c b/refs/files-backend.c
    index ef053f716c..b6576f28b9 100644
    --- a/refs/files-backend.c
    +++ b/refs/files-backend.c
    @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
     	 * reference itself, plus we might need to update the
     	 * reference if --updateref was specified:
     	 */
    -	lock = lock_ref_oid_basic(refs, refname, oid,
    +	lock = lock_ref_oid_basic(refs, refname, NULL,
     				  NULL, NULL, REF_NO_DEREF,
     				  &type, &err);
     	if (!lock) {

Which seems sensible to me. We'll still get the lock, we just don't
assert that the refname we use to get the lock must be at that
SHA-1. We'll still use it for the purposes of expiry.

But maybe I've missed some caveat in reflog_expiry_prepare() and friends
and we really do need the reflog at that OID, then this would suck less:

    diff --git a/builtin/reflog.c b/builtin/reflog.c
    index 4d3430900d..4bb272fdc8 100644
    --- a/builtin/reflog.c
    +++ b/builtin/reflog.c
    @@ -625,12 +625,16 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
     		free_worktrees(worktrees);
     		for (i = 0; i < collected.nr; i++) {
     			struct collected_reflog *e = collected.e[i];
    +			int st;
     			set_reflog_expiry_param(&cb.cmd, explicit_expiry, e->reflog);
    -			status |= reflog_expire(e->reflog, &e->oid, flags,
    -						reflog_expiry_prepare,
    -						should_expire_reflog_ent,
    -						reflog_expiry_cleanup,
    -						&cb);
    +			st = reflog_expire(e->reflog, &e->oid, flags,
    +					   reflog_expiry_prepare,
    +					   should_expire_reflog_ent,
    +					   reflog_expiry_cleanup,
    +					   &cb);
    +			if (st == -2)
    +				continue;
    +			status |= st;
     			free(e);
     		}
     		free(collected.e);
    diff --git a/refs/files-backend.c b/refs/files-backend.c
    index ef053f716c..8b0b6b7b85 100644
    --- a/refs/files-backend.c
    +++ b/refs/files-backend.c
    @@ -3041,6 +3041,11 @@ static int files_reflog_expire(struct ref_store *ref_store,
     				  NULL, NULL, REF_NO_DEREF,
     				  &type, &err);
     	if (!lock) {
    +		if (errno == EBUSY) {
    +			warning("cannot lock ref '%s': %s. Skipping!", refname, err.buf);
    +			strbuf_release(&err);
    +			return -2;
    +		}
     		error("cannot lock ref '%s': %s", refname, err.buf);
     		strbuf_release(&err);
     		return -1;

I.e. we just detect the EBUSY that verify_lock() sets if the OID doesn't
match, and don't prune that reflog. As seen above "pack-objects" and
"prune" will still iterate over the same logs later for the purposes of
reachability, so this shouldn't get us into a corrupt state due to
throwing away objects referenced in those logs, we'll just prune fewer
things than we could have.

So I think I'll use the first patch noted above as a hack to solve the
narrow problem I have now, but any comments on the above most
welcome. I'm not very familiar with the ref code in case that wasn't
obvious already.

B.t.w. the mention of f3b661f766 ("expire_reflog(): use a lock_file for
rewriting the reflog file", 2014-12-12) upthread is irrelevant. That's a
commit where we use the lockfile code to write out the *new* reflog,
which is unrelated to all of this.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: BUG: Race condition due to reflog expiry in "gc"
  2019-03-13 10:28   ` Ævar Arnfjörð Bjarmason
@ 2019-03-13 16:02     ` Jeff King
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff King @ 2019-03-13 16:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Git List, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Wed, Mar 13, 2019 at 11:28:39AM +0100, Ævar Arnfjörð Bjarmason wrote:

> I was under the impression that git-gc was supposed to support operating
> on a repository that's concurrently being modified, as long as you don't
> set the likes of gc.pruneExpire too aggressively.

To some degree. If it has to take locks to modify items, then inherently
there's going to be lock contention. But we may be able to work around
it some.

A big one we used to hit at GitHub is that running `git pack-refs` needs
to take the individual ref locks when pruning loose refs that have just
been packed. We dealt with that by adding retry-with-timeout logic to
ref lock acquisition, in 4ff0f01cb7 (refs: retry acquiring reference
locks for 100ms, 2017-08-21). Since then, I can't remember seeing a
single instance of this coming up in production use.

I don't think we use a timeout with the reflog lock. Maybe we ought to.
It might need to be longer than the 100ms default for refs, since I
think we'd do more significant work during the reflog expiration. On the
other hand, I think we hold the lock on the ref itself, too, during that
expiration.

We don't actually expire reflogs regularly at GitHub, so I can't say one
way or the other if there would be lock contention there.

> Running a "gc" in a loop without "git reflog expire --all" and when
> watching the repository being GC'd with:
> 
>     fswatch -l 0.1 -t -r . 2>&1 | grep lock
> 
> We only create .git/MERGE_RR.lock, .git/gc.pid.lock and
> git/packed-refs.lock. These are all things that would only cause another
> concurrent GC to fail, not a normal git command.

The packed-refs.lock can conflict with a normal operation; regular
writers need to update it when they delete a ref.

You'd also be locking regular refs as part of "pack-refs --prune", but
you probably don't see it running gc in a loop, because all of the loose
refs are pruned after the first run.

> I'm just including that as illustration that add_reflogs_to_pending() in
> revision.c during "gc" already iterates over the reflogs without locking
> anything, but of course it's just reading them.

Right. It's always safe to read without locking (refs, packed-refs, and
reflogs).

> So just this fixes that:
> 
>     diff --git a/refs/files-backend.c b/refs/files-backend.c
>     index ef053f716c..b6576f28b9 100644
>     --- a/refs/files-backend.c
>     +++ b/refs/files-backend.c
>     @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
>      	 * reference itself, plus we might need to update the
>      	 * reference if --updateref was specified:
>      	 */
>     -	lock = lock_ref_oid_basic(refs, refname, oid,
>     +	lock = lock_ref_oid_basic(refs, refname, NULL,
>      				  NULL, NULL, REF_NO_DEREF,
>      				  &type, &err);
>      	if (!lock) {
> 
> Which seems sensible to me. We'll still get the lock, we just don't
> assert that the refname we use to get the lock must be at that
> SHA-1. We'll still use it for the purposes of expiry.

I _think_ this should be OK, because we don't open the reflog until
after we hold the lock. So we don't really care what the value was at.
Wherever we got the lock, we'll do the expiration process atomically at
that point.

I don't think this makes all your problems go away, though, because
you'd still have the immediate contention of actually taking the lock
(at any value).

> But maybe I've missed some caveat in reflog_expiry_prepare() and friends
> and we really do need the reflog at that OID, then this would suck less:

If I'm reading the code correctly, we don't call reflog_expiry_prepare()
until we're holding the lock (because we pass it in as a callback to
reflog_expire()). So I think that would be OK.

>      	if (!lock) {
>     +		if (errno == EBUSY) {
>     +			warning("cannot lock ref '%s': %s. Skipping!", refname, err.buf);
>     +			strbuf_release(&err);
>     +			return -2;
>     +		}
>      		error("cannot lock ref '%s': %s", refname, err.buf);
>      		strbuf_release(&err);
>      		return -1;
> 
> I.e. we just detect the EBUSY that verify_lock() sets if the OID doesn't
> match, and don't prune that reflog.

That saves git-gc from failing. But do we have the problem in the other
direction? I.e., that the gc would take a lock, and the actual user
request to update a ref would fail?

That's what the retry-with-timeout is supposed to address, so maybe it
works. But I wouldn't be surprised if it's insufficient in practice,
since the reflog code may walk big parts of the graph under lock,
checking reachability.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: BUG: Race condition due to reflog expiry in "gc"
  2019-03-13 16:02     ` Jeff King
@ 2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                           ` (6 more replies)
  0 siblings, 7 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 16:22 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Git List, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder


On Wed, Mar 13 2019, Jeff King wrote:

> On Wed, Mar 13, 2019 at 11:28:39AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> I was under the impression that git-gc was supposed to support operating
>> on a repository that's concurrently being modified, as long as you don't
>> set the likes of gc.pruneExpire too aggressively.
>
> To some degree. If it has to take locks to modify items, then inherently
> there's going to be lock contention. But we may be able to work around
> it some.
>
> A big one we used to hit at GitHub is that running `git pack-refs` needs
> to take the individual ref locks when pruning loose refs that have just
> been packed. We dealt with that by adding retry-with-timeout logic to
> ref lock acquisition, in 4ff0f01cb7 (refs: retry acquiring reference
> locks for 100ms, 2017-08-21). Since then, I can't remember seeing a
> single instance of this coming up in production use.
>
> I don't think we use a timeout with the reflog lock. Maybe we ought to.
> It might need to be longer than the 100ms default for refs, since I
> think we'd do more significant work during the reflog expiration. On the
> other hand, I think we hold the lock on the ref itself, too, during that
> expiration.
>
> We don't actually expire reflogs regularly at GitHub, so I can't say one
> way or the other if there would be lock contention there.
>
>> Running a "gc" in a loop without "git reflog expire --all" and when
>> watching the repository being GC'd with:
>>
>>     fswatch -l 0.1 -t -r . 2>&1 | grep lock
>>
>> We only create .git/MERGE_RR.lock, .git/gc.pid.lock and
>> git/packed-refs.lock. These are all things that would only cause another
>> concurrent GC to fail, not a normal git command.
>
> The packed-refs.lock can conflict with a normal operation; regular
> writers need to update it when they delete a ref.
>
> You'd also be locking regular refs as part of "pack-refs --prune", but
> you probably don't see it running gc in a loop, because all of the loose
> refs are pruned after the first run.
>
>> I'm just including that as illustration that add_reflogs_to_pending() in
>> revision.c during "gc" already iterates over the reflogs without locking
>> anything, but of course it's just reading them.
>
> Right. It's always safe to read without locking (refs, packed-refs, and
> reflogs).
>
>> So just this fixes that:
>>
>>     diff --git a/refs/files-backend.c b/refs/files-backend.c
>>     index ef053f716c..b6576f28b9 100644
>>     --- a/refs/files-backend.c
>>     +++ b/refs/files-backend.c
>>     @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
>>      	 * reference itself, plus we might need to update the
>>      	 * reference if --updateref was specified:
>>      	 */
>>     -	lock = lock_ref_oid_basic(refs, refname, oid,
>>     +	lock = lock_ref_oid_basic(refs, refname, NULL,
>>      				  NULL, NULL, REF_NO_DEREF,
>>      				  &type, &err);
>>      	if (!lock) {
>>
>> Which seems sensible to me. We'll still get the lock, we just don't
>> assert that the refname we use to get the lock must be at that
>> SHA-1. We'll still use it for the purposes of expiry.
>
> I _think_ this should be OK, because we don't open the reflog until
> after we hold the lock. So we don't really care what the value was at.
> Wherever we got the lock, we'll do the expiration process atomically at
> that point.
>
> I don't think this makes all your problems go away, though, because
> you'd still have the immediate contention of actually taking the lock
> (at any value).

Yeah, but as far as I can tell core.filesRefLockTimeout added in
4ff0f01cb7 (refs: retry acquiring reference locks for 100ms, 2017-08-21)
solves that.

I.e. the problem after that patch isn't the contention, but the call to
lock_ref_oid_basic() in files_reflog_expire() above which *after* we get
the lock effectively asserts that the value must be at the old OID.

It would then hard-die, so git-reflog would leave behind stay *.lock
files that e.g. a concurrent "commit" would die on (still there after
100ms, and beyond). Whereas if we're not picky about the OID the two
will have contention, but core.filesRefLockTimeout will work around it.

I've been running this whole thing in a loop for a while with that
s/oid/NULL/ fix, and it just works. To the extent that it doesn't (due
to load) I can just bump core.filesRefLockTimeout up a bit.

So with the benefit of some sleep & your comment I'm pretty sure that's
the right fix. Ideally it would have a test, do we have some simliar
contention-creating tests I can piggy-back on, or maybe we should just
change this and hope for the best...

>> But maybe I've missed some caveat in reflog_expiry_prepare() and friends
>> and we really do need the reflog at that OID, then this would suck less:
>
> If I'm reading the code correctly, we don't call reflog_expiry_prepare()
> until we're holding the lock (because we pass it in as a callback to
> reflog_expire()). So I think that would be OK.
>
>>      	if (!lock) {
>>     +		if (errno == EBUSY) {
>>     +			warning("cannot lock ref '%s': %s. Skipping!", refname, err.buf);
>>     +			strbuf_release(&err);
>>     +			return -2;
>>     +		}
>>      		error("cannot lock ref '%s': %s", refname, err.buf);
>>      		strbuf_release(&err);
>>      		return -1;
>>
>> I.e. we just detect the EBUSY that verify_lock() sets if the OID doesn't
>> match, and don't prune that reflog.
>
> That saves git-gc from failing. But do we have the problem in the other
> direction? I.e., that the gc would take a lock, and the actual user
> request to update a ref would fail?

No, because as with s/oid/NULL/ above core.filesRefLockTimeout saves the
day.

> That's what the retry-with-timeout is supposed to address, so maybe it
> works. But I wouldn't be surprised if it's insufficient in practice,
> since the reflog code may walk big parts of the graph under lock,
> checking reachability.

I suppose this can happen if the reflog for a given branch is so big
that it takes us more than 100ms to parse it, but e.g. O(n) refs
shouldn't matter, since we only hold the lock on one ref at a time.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 0/5] gc: minor code cleanup + contention fixes
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                             ` (7 more replies)
  2019-03-13 23:54         ` [PATCH 1/5] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
                           ` (5 subsequent siblings)
  6 siblings, 8 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

[45]/5 fix a couple of issues I noted upthread. While I was at it I
noticed a few things I could clean up in [123]/5 that I figured I'd
send alongside this.

Ævar Arnfjörð Bjarmason (5):
  gc: remove redundant check for gc_auto_threshold
  gc: convert to using the_hash_algo
  gc: refactor a "call me once" pattern
  gc: don't run "reflog expire" when keeping reflogs
  reflog expire: don't assert the OID when locking refs

 builtin/gc.c         | 33 +++++++++++++++++++++++++--------
 refs/files-backend.c |  2 +-
 t/t6500-gc.sh        | 19 +++++++++++++++++++
 3 files changed, 45 insertions(+), 9 deletions(-)

-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 1/5] gc: remove redundant check for gc_auto_threshold
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14  0:25           ` Jeff King
  2019-03-13 23:54         ` [PATCH 2/5] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
                           ` (4 subsequent siblings)
  6 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Checking gc_auto_threshold in too_many_loose_objects() was added in
17815501a8 ("git-gc --auto: run "repack -A -d -l" as necessary.",
2007-09-17) when need_to_gc() itself was also reliant on
gc_auto_pack_limit before its early return:

    gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0

When that check was simplified to just checking "gc_auto_threshold <=
0" in b14d255ba8 ("builtin-gc.c: allow disabling all auto-gc'ing by
assigning 0 to gc.auto", 2008-03-19) this unreachable code should have
been removed. We only call too_many_loose_objects() from within
need_to_gc() itself, which will return if this condition holds, and in
cmd_gc() which will return before ever getting to "auto_gc &&
too_many_loose_objects()" if "auto_gc && !need_to_gc()" is true
earlier in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 020f725acc..8c2312681c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -157,9 +157,6 @@ static int too_many_loose_objects(void)
 	int num_loose = 0;
 	int needed = 0;
 
-	if (gc_auto_threshold <= 0)
-		return 0;
-
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
 		return 0;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 2/5] gc: convert to using the_hash_algo
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
  2019-03-13 23:54         ` [PATCH 1/5] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14  0:26           ` Jeff King
  2019-03-13 23:54         ` [PATCH 3/5] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
                           ` (3 subsequent siblings)
  6 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

There's been a lot of changing of the hardcoded "40" values to
the_hash_algo->hexsz, but we've so far missed this one where we
hardcoded 38 for the loose object file length.

This is because a SHA-1 like abcde[...] gets turned into
objects/ab/cde[...]. There's no reason to suppose the same won't be
the case for SHA-256, and reading between the lines in
hash-function-transition.txt the format is planned to be the same.

However, we may want to modify this code for the hash function
transition. There's a potential pathological case here where we'll
only consider the loose objects for the currently active hash, but
objects for that hash will share a directory storage with the other
hash.

Thus we could theoretically have 1k SHA-1 loose objects, and say 1
million SHA-256 objects, and not notice because we're currently using
SHA-1.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 8c2312681c..9c2c63276d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -156,6 +156,8 @@ static int too_many_loose_objects(void)
 	int auto_threshold;
 	int num_loose = 0;
 	int needed = 0;
+	const unsigned hexsz = the_hash_algo->hexsz;
+	const unsigned hexsz_loose = hexsz - 2;
 
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
@@ -163,8 +165,8 @@ static int too_many_loose_objects(void)
 
 	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
-		if (strspn(ent->d_name, "0123456789abcdef") != 38 ||
-		    ent->d_name[38] != '\0')
+		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
+		    ent->d_name[hexsz_loose] != '\0')
 			continue;
 		if (++num_loose > auto_threshold) {
 			needed = 1;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 3/5] gc: refactor a "call me once" pattern
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
                           ` (2 preceding siblings ...)
  2019-03-13 23:54         ` [PATCH 2/5] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14  0:30           ` Jeff King
  2019-03-13 23:54         ` [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
                           ` (2 subsequent siblings)
  6 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change an idiom we're using to ensure that gc_before_repack() only
does work once (see 62aad1849f ("gc --auto: do not lock refs in the
background", 2014-05-25)) to be more obvious.

Nothing except this function cares about the "pack_refs" and
"prune_reflogs" variables, so let's not leave the reader wondering if
they're being zero'd out for later use somewhere else.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 9c2c63276d..425d0fa830 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -490,14 +490,15 @@ static int report_last_gc_error(void)
 
 static void gc_before_repack(void)
 {
+	static int done = 0;
+	if (done++)
+		return;
+
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, pack_refs_cmd.argv[0]);
 
 	if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, reflog.argv[0]);
-
-	pack_refs = 0;
-	prune_reflogs = 0;
 }
 
 int cmd_gc(int argc, const char **argv, const char *prefix)
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
                           ` (3 preceding siblings ...)
  2019-03-13 23:54         ` [PATCH 3/5] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14  0:40           ` Jeff King
  2019-03-13 23:54         ` [PATCH 5/5] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
  2019-03-14  0:25         ` BUG: Race condition due to reflog expiry in "gc" Jeff King
  6 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Don't redundantly run "git reflog expire --all" when gc.reflogExpire
and gc.reflogExpireUnreachable are set to "never".

I'm being careful to not repeat the issue fixed in
8ab5aa4bd8 ("parseopt: handle malformed --expire arguments more
nicely", 2018-04-21). We'll die early if the config variables are set
to invalid values.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c  | 17 +++++++++++++++++
 t/t6500-gc.sh | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 425d0fa830..91b088dbfe 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -116,6 +116,19 @@ static void process_log_file_on_signal(int signo)
 	raise(signo);
 }
 
+static int gc_config_is_timestamp_never(const char *var)
+{
+	const char *value;
+	timestamp_t expire;
+
+	if (!git_config_get_value(var, &value) && value) {
+		if (parse_expiry_date(value, &expire))
+			die(_("failed to parse '%s' value '%s'"), var, value);
+		return expire == 0;
+	}
+	return 0;
+}
+
 static void gc_config(void)
 {
 	const char *value;
@@ -127,6 +140,10 @@ static void gc_config(void)
 			pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
+	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
+	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
+		prune_reflogs = 0;
+
 	git_config_get_int("gc.aggressivewindow", &aggressive_window);
 	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
 	git_config_get_int("gc.auto", &gc_auto_threshold);
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 4684d06552..7411bf7fec 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -120,6 +120,25 @@ test_expect_success 'gc --quiet' '
 	test_must_be_empty stderr
 '
 
+test_expect_success 'gc.reflogExpire{Unreachable,}=never skips "expire" via "gc"' '
+	test_config gc.reflogExpire never &&
+	test_config gc.reflogExpireUnreachable never &&
+
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+
+	# Check that git-pack-refs is run as a sanity check (done via
+	# gc_before_repack()) but that git-expire is not.
+	grep -E "^trace: (built-in|exec|run_command): git pack-refs --" trace.out &&
+	! grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
+test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "expire" via "gc"' '
+	>trace.out &&
+	test_config gc.reflogExpire never &&
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 5/5] reflog expire: don't assert the OID when locking refs
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
                           ` (4 preceding siblings ...)
  2019-03-13 23:54         ` [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
@ 2019-03-13 23:54         ` Ævar Arnfjörð Bjarmason
  2019-03-14  0:44           ` Jeff King
  2019-03-14  0:25         ` BUG: Race condition due to reflog expiry in "gc" Jeff King
  6 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 23:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

The locking protocol for reflogs involves getting a *.lock file on the
loose ref[1] (even if the actual ref is packed). This isn't needed to
expire the reflog, and needlessly results promotes reference update
contention to hard errors in e.g. "gc".

During reflog expiry, the cmd_reflog_expire() function first iterates
over all known references, and then one-by-one acquires the lock for
each one to expire its reflog. By the time it gets around to
re-visiting the references some of the OIDs may have changed.

This has been the case ever since "reflog expire" was initially
implemented in 4264dc15e1 ("git reflog expire", 2006-12-19). As seen
in that simpler initial version of the code (and the same is the case
before this change) we subsequently use the OID to inform the expiry,
but never needed to use it to lock the reference associated with the
reflog.

Thus the verify_lock() function would fail with e.g. "ref '%s' is at
%s but expected %s" if the repository was being updated concurrent to
the "reflog expire".

This made "reflog expire" exit with a non-zero exit status, which in
turn meant that a "gc" command (which expires reflogs before forking
to the background) would encounter a hard error in such a scenario.

If we instead lock the reference without considering what OID we last
saw it at, we won't encounter user-visible contention to the extent
that core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs:
retry acquiring reference locks for 100ms", 2017-08-21).

Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:

    (
        cd /tmp &&
        rm -rf git &&
        git init git &&
        cd git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >out &&
            git add out &&
            git commit -m"out"
        done
    )

    (
        cd /tmp &&
        rm -rf git-clone &&
        git clone file:///tmp/git git-clone &&
        cd git-clone &&
        while git pull
        do
            date
        done
    )

    (
        cd /tmp/git-clone &&
        while git reflog expire --all
        do
            date
        done
    )

Before this change the "reflog expire" would fail really quickly with
a "but expected" error. After this change both the "pull" and "reflog
expire" will run for a while, but eventually fail because I get
unlucky with core.filesRefLockTimeout (the "reflog expire" is in a
really tight loop). That can be resolved by being more generous with
higher values of core.filesRefLockTimeout than the 100ms default.

1. https://public-inbox.org/git/54857871.5090805@alum.mit.edu/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index ef053f716c..4d4d226601 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
 	 * reference itself, plus we might need to update the
 	 * reference if --updateref was specified:
 	 */
-	lock = lock_ref_oid_basic(refs, refname, oid,
+	lock = lock_ref_oid_basic(refs, refname, NULL /* NOT oid! */,
 				  NULL, NULL, REF_NO_DEREF,
 				  &type, &err);
 	if (!lock) {
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: BUG: Race condition due to reflog expiry in "gc"
  2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
                           ` (5 preceding siblings ...)
  2019-03-13 23:54         ` [PATCH 5/5] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:25         ` Jeff King
  6 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Git List, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Wed, Mar 13, 2019 at 05:22:22PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > That's what the retry-with-timeout is supposed to address, so maybe it
> > works. But I wouldn't be surprised if it's insufficient in practice,
> > since the reflog code may walk big parts of the graph under lock,
> > checking reachability.
> 
> I suppose this can happen if the reflog for a given branch is so big
> that it takes us more than 100ms to parse it, but e.g. O(n) refs
> shouldn't matter, since we only hold the lock on one ref at a time.

I think it's more than parsing, though.

It's been a while since I've dug into the reflog expiration code, but my
recollection is that it can actually walk parts of the graph (looking
for what's reachable) while processing the reflog. That can be tens of
seconds in decent-sized repositories.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 1/5] gc: remove redundant check for gc_auto_threshold
  2019-03-13 23:54         ` [PATCH 1/5] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:25           ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 12:54:35AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Checking gc_auto_threshold in too_many_loose_objects() was added in
> 17815501a8 ("git-gc --auto: run "repack -A -d -l" as necessary.",
> 2007-09-17) when need_to_gc() itself was also reliant on
> gc_auto_pack_limit before its early return:
> 
>     gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0
> 
> When that check was simplified to just checking "gc_auto_threshold <=
> 0" in b14d255ba8 ("builtin-gc.c: allow disabling all auto-gc'ing by
> assigning 0 to gc.auto", 2008-03-19) this unreachable code should have
> been removed. We only call too_many_loose_objects() from within
> need_to_gc() itself, which will return if this condition holds, and in
> cmd_gc() which will return before ever getting to "auto_gc &&
> too_many_loose_objects()" if "auto_gc && !need_to_gc()" is true
> earlier in the function.

Yep, this is an obvious good cleanup.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 2/5] gc: convert to using the_hash_algo
  2019-03-13 23:54         ` [PATCH 2/5] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:26           ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 12:54:36AM +0100, Ævar Arnfjörð Bjarmason wrote:

> There's been a lot of changing of the hardcoded "40" values to
> the_hash_algo->hexsz, but we've so far missed this one where we
> hardcoded 38 for the loose object file length.
> 
> This is because a SHA-1 like abcde[...] gets turned into
> objects/ab/cde[...]. There's no reason to suppose the same won't be
> the case for SHA-256, and reading between the lines in
> hash-function-transition.txt the format is planned to be the same.

Yep, makes sense.

> However, we may want to modify this code for the hash function
> transition. There's a potential pathological case here where we'll
> only consider the loose objects for the currently active hash, but
> objects for that hash will share a directory storage with the other
> hash.
> 
> Thus we could theoretically have 1k SHA-1 loose objects, and say 1
> million SHA-256 objects, and not notice because we're currently using
> SHA-1.

I agree that we may end up needing to touch this, but I think this patch
doesn't make anything worse in that respect (and likely makes it better,
since we at least know this "38" is supposed to be a hash).

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 8c2312681c..9c2c63276d 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -156,6 +156,8 @@ static int too_many_loose_objects(void)
>  	int auto_threshold;
>  	int num_loose = 0;
>  	int needed = 0;
> +	const unsigned hexsz = the_hash_algo->hexsz;
> +	const unsigned hexsz_loose = hexsz - 2;

It doesn't look like hexsz gets used anywhere else; is it worth having
the extra variable? (Admittedly this quite a nit).

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/5] gc: refactor a "call me once" pattern
  2019-03-13 23:54         ` [PATCH 3/5] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:30           ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 12:54:37AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Change an idiom we're using to ensure that gc_before_repack() only
> does work once (see 62aad1849f ("gc --auto: do not lock refs in the
> background", 2014-05-25)) to be more obvious.
> 
> Nothing except this function cares about the "pack_refs" and
> "prune_reflogs" variables, so let's not leave the reader wondering if
> they're being zero'd out for later use somewhere else.

I agree the existing code is not very obvious about what it's trying to
do. I think a comment would have helped a lot.

Your rewrite is definitely better than the original, but I think it
might also benefit from a comment. I.e., something like:

>  static void gc_before_repack(void)
>  {
	/*
	 * We may be called twice, as both the pre- and post-daemonized
	 * phases will call us, but running these commands more than
	 * once is pointless and wasteful.
	 */
> +	static int done = 0;
> +	if (done++)
> +		return;

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs
  2019-03-13 23:54         ` [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:40           ` Jeff King
  2019-03-14  4:51             ` Junio C Hamano
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 12:54:38AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Don't redundantly run "git reflog expire --all" when gc.reflogExpire
> and gc.reflogExpireUnreachable are set to "never".
> 
> I'm being careful to not repeat the issue fixed in
> 8ab5aa4bd8 ("parseopt: handle malformed --expire arguments more
> nicely", 2018-04-21). We'll die early if the config variables are set
> to invalid values.

I think the general sentiment here makes sense, that reflog expiration
should be a noop if we're set to "never" anyway.

It does feel a little funny for "gc" to be peeking into the internals of
how "reflog" will make its decision about how to run, though. I doubt
those rules are likely to change, but if they did, there'd be very
little to remind somebody working on reflog that they need to change the
matching rules here.

Is our problem running "git reflog" at all, or is it that "git reflog"
does too much work even when it's been told "never"? If the latter,
could we just have it exit early as soon as it's figured out the prune
expiry dates it's using?

I.e., something like:

diff --git a/builtin/reflog.c b/builtin/reflog.c
index 4d3430900d..aab221f315 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -594,6 +594,15 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			break;
 	}
 
+	/*
+	 * If we're not expiring anything and not dropping stale entries,
+	 * there's no point in even opening the reflogs, since we're guaranteed
+	 * to do nothing.
+	 */
+	if (!cb.cmd.expire_total && !cb.cmd.expire_unreachable &&
+	    !cb.cmd.stalefix)
+		return 0;
+
 	/*
 	 * We can trust the commits and objects reachable from refs
 	 * even in older repository.  We cannot trust what's reachable

Seeing "--stale-fix" again reminded me: that may be the "oops, we can
spend tons of CPU walking history" bit that I mentioned in the other
part of the thread. But I don't think git-gc would ever run with that
option.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 5/5] reflog expire: don't assert the OID when locking refs
  2019-03-13 23:54         ` [PATCH 5/5] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
@ 2019-03-14  0:44           ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14  0:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 12:54:39AM +0100, Ævar Arnfjörð Bjarmason wrote:

> The locking protocol for reflogs involves getting a *.lock file on the
> loose ref[1] (even if the actual ref is packed). This isn't needed to
> expire the reflog, and needlessly results promotes reference update
> contention to hard errors in e.g. "gc".

This first paragraph threw me off. It sounds like you are saying we
don't need to take a lock, but we absolutely do. It's just that we don't
need to care about the lock having some particular value. Which you do
go on to explain, but I think it would be more clear by simply removing
this first paragraph.

> If we instead lock the reference without considering what OID we last
> saw it at, we won't encounter user-visible contention to the extent
> that core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs:
> retry acquiring reference locks for 100ms", 2017-08-21).

I think this part is true. I'd love to get a confirmation from Michael
Haggerty, who has spent way more time thinking about ref and reflog
locking than any mortal should have to. Hopefully he even still
remembers some of it. :)

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs
  2019-03-14  0:40           ` Jeff King
@ 2019-03-14  4:51             ` Junio C Hamano
  2019-03-14 19:26               ` Jeff King
  0 siblings, 1 reply; 60+ messages in thread
From: Junio C Hamano @ 2019-03-14  4:51 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

Jeff King <peff@peff.net> writes:

> Seeing "--stale-fix" again reminded me: that may be the "oops, we can
> spend tons of CPU walking history" bit that I mentioned in the other
> part of the thread. But I don't think git-gc would ever run with that
> option.

The option was a purely transitional measure to recover from a bad
reflog state that could have been left by earlier versions of
"prune" and "repack" that did not pay attention to the reflog.
Perhaps we should plan to deprecate and remove it by now?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 0/7] gc: minor code cleanup + contention fixes
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                               ` (9 more replies)
  2019-03-14 12:34           ` [PATCH v2 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
                             ` (6 subsequent siblings)
  7 siblings, 10 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Addresse Peff's comments to v1. For a couple of patches I've faked up
his SOB where I copy/pasted a comment or code from a v1 comment
verbatim. Will see if he acks that.

The main change is a better commit message in the last patch (now
7/7), and 2x new "reflog" patches to make it early exit in addition to
"gc" when there's nothing to do.

There was the "why do it at all in gc" feedback on 6/7 in v1. I've
adjusted the commit message there to justify it, but we'll see what
Peff things about it this time around...

Ævar Arnfjörð Bjarmason (7):
  gc: remove redundant check for gc_auto_threshold
  gc: convert to using the_hash_algo
  gc: refactor a "call me once" pattern
  reflog tests: make use of "test_config" idiom
  reflog: exit early if there's no work to do
  gc: don't run "reflog expire" when keeping reflogs
  reflog expire: don't assert the OID when locking refs

 builtin/gc.c         | 37 +++++++++++++++++++++++++++++--------
 builtin/reflog.c     |  7 +++++++
 refs/files-backend.c |  2 +-
 t/t1410-reflog.sh    | 17 ++++++++---------
 t/t6500-gc.sh        | 19 +++++++++++++++++++
 5 files changed, 64 insertions(+), 18 deletions(-)

Range-diff:
1:  1635c7fb22 ! 1:  e18433f9c6 gc: convert to using the_hash_algo
    @@ -11,16 +11,26 @@
         the case for SHA-256, and reading between the lines in
         hash-function-transition.txt the format is planned to be the same.
     
    -    However, we may want to modify this code for the hash function
    -    transition. There's a potential pathological case here where we'll
    -    only consider the loose objects for the currently active hash, but
    -    objects for that hash will share a directory storage with the other
    -    hash.
    +    In the future we may want to further modify this code for the hash
    +    function transition. There's a potential pathological case here where
    +    we'll only consider the loose objects for the currently active hash,
    +    but objects for that hash will share a directory storage with the
    +    other hash.
     
         Thus we could theoretically have 1k SHA-1 loose objects, and say 1
         million SHA-256 objects, and not notice because we're currently using
         SHA-1.
     
    +    So assuming that "gc" eventually learns to pack up both SHA-1 and
    +    SHA-256 objects regardless of what the current the_hash_alg is perhaps
    +    this check should be changed to consider all files in objects/17/
    +    matching [0-9a-f] 38 or 62 characters in length (i.e. both SHA-1 and
    +    SHA-256).
    +
    +    But none of that is something we need to worry about now, and
    +    supporting both 38 and 62 characters depending on "the_hash_algo"
    +    removes another case of SHA-1 hardcoding.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      diff --git a/builtin/gc.c b/builtin/gc.c
    @@ -30,8 +40,7 @@
      	int auto_threshold;
      	int num_loose = 0;
      	int needed = 0;
    -+	const unsigned hexsz = the_hash_algo->hexsz;
    -+	const unsigned hexsz_loose = hexsz - 2;
    ++	const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
      
      	dir = opendir(git_path("objects/17"));
      	if (!dir)
2:  ced972826d ! 2:  54e4bce91c gc: refactor a "call me once" pattern
    @@ -10,6 +10,7 @@
         "prune_reflogs" variables, so let's not leave the reader wondering if
         they're being zero'd out for later use somewhere else.
     
    +    Signed-off-by: Jeff King <peff@peff.net>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      diff --git a/builtin/gc.c b/builtin/gc.c
    @@ -19,6 +20,11 @@
      
      static void gc_before_repack(void)
      {
    ++	/*
    ++	 * We may be called twice, as both the pre- and
    ++	 * post-daemonized phases will call us, but running these
    ++	 * commands more than once is pointless and wasteful.
    ++	 */
     +	static int done = 0;
     +	if (done++)
     +		return;
-:  ---------- > 3:  82f87db134 reflog tests: make use of "test_config" idiom
-:  ---------- > 4:  c79608dbbb reflog: exit early if there's no work to do
3:  599772f2bd ! 5:  c47dedab58 gc: don't run "reflog expire" when keeping reflogs
    @@ -5,10 +5,18 @@
         Don't redundantly run "git reflog expire --all" when gc.reflogExpire
         and gc.reflogExpireUnreachable are set to "never".
     
    -    I'm being careful to not repeat the issue fixed in
    -    8ab5aa4bd8 ("parseopt: handle malformed --expire arguments more
    -    nicely", 2018-04-21). We'll die early if the config variables are set
    -    to invalid values.
    +    An earlier change taught "git reflog expire" itself to exit early
    +    under this scenario, so in some sense this isn't strictly
    +    necessary. Reasons to also do it here:
    +
    +     1) Similar to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    +        arguments more nicely", 2018-04-21). We'll die early if the config
    +        variables are set to invalid values. We run "pack-refs" before
    +        "reflog expire", which can take a while, only to then die on an
    +        invalid gc.reflogExpire{Unreachable,} configuration.
    +
    +     2) Not invoking the command at all means it won't show up in trace
    +        output, which makes what's going on more obvious.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
4:  c127265828 ! 6:  55dd203a04 reflog expire: don't assert the OID when locking refs
    @@ -2,35 +2,39 @@
     
         reflog expire: don't assert the OID when locking refs
     
    -    The locking protocol for reflogs involves getting a *.lock file on the
    -    loose ref[1] (even if the actual ref is packed). This isn't needed to
    -    expire the reflog, and needlessly results promotes reference update
    -    contention to hard errors in e.g. "gc".
    -
         During reflog expiry, the cmd_reflog_expire() function first iterates
    -    over all known references, and then one-by-one acquires the lock for
    -    each one to expire its reflog. By the time it gets around to
    -    re-visiting the references some of the OIDs may have changed.
    +    over all reflogs in logs/*, and then one-by-one acquires the lock for
    +    each one to expire its reflog by getting a *.lock file on the
    +    corresponding loose ref[1] (even if the actual ref is packed).
    +
    +    This lock is needed, but what isn't needed is locking the loose ref as
    +    a function of the OID we found from that first iteration. By the time
    +    we get around to re-visiting the reference some of the OIDs may have
    +    changed.
     
    -    This has been the case ever since "reflog expire" was initially
    -    implemented in 4264dc15e1 ("git reflog expire", 2006-12-19). As seen
    -    in that simpler initial version of the code (and the same is the case
    -    before this change) we subsequently use the OID to inform the expiry,
    -    but never needed to use it to lock the reference associated with the
    -    reflog.
    +    Thus the verify_lock() function called by the lock_ref_oid_basic()
    +    function being changed here would fail with e.g. "ref '%s' is at %s
    +    but expected %s" if the repository was being updated concurrent to the
    +    "reflog expire".
     
    -    Thus the verify_lock() function would fail with e.g. "ref '%s' is at
    -    %s but expected %s" if the repository was being updated concurrent to
    -    the "reflog expire".
    +    By not passing the OID to it we'll try to lock the reference
    +    regardless of it last known OID. Locking as a function of the OID
    +    would make "reflog expire" exit with a non-zero exit status under such
    +    contention, which in turn meant that a "gc" command (which expires
    +    reflogs before forking to the background) would encounter a hard
    +    error.
     
    -    This made "reflog expire" exit with a non-zero exit status, which in
    -    turn meant that a "gc" command (which expires reflogs before forking
    -    to the background) would encounter a hard error in such a scenario.
    +    This behavior of considering the OID when locking has been here ever
    +    since "reflog expire" was initially implemented in 4264dc15e1 ("git
    +    reflog expire", 2006-12-19). As seen in that simpler initial version
    +    of the code we subsequently use the OID to inform the expiry (and
    +    still do), but never needed to use it to lock the reference associated
    +    with the reflog.
     
    -    If we instead lock the reference without considering what OID we last
    -    saw it at, we won't encounter user-visible contention to the extent
    -    that core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs:
    -    retry acquiring reference locks for 100ms", 2017-08-21).
    +    By locking the reference without considering what OID we last saw it
    +    at, we won't encounter user-visible contention to the extent that
    +    core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs: retry
    +    acquiring reference locks for 100ms", 2017-08-21).
     
         Unfortunately this sort of probabilistic contention is hard to turn
         into a test. I've tested this by running the following three subshells
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 1/7] gc: remove redundant check for gc_auto_threshold
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
                             ` (5 subsequent siblings)
  7 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Checking gc_auto_threshold in too_many_loose_objects() was added in
17815501a8 ("git-gc --auto: run "repack -A -d -l" as necessary.",
2007-09-17) when need_to_gc() itself was also reliant on
gc_auto_pack_limit before its early return:

    gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0

When that check was simplified to just checking "gc_auto_threshold <=
0" in b14d255ba8 ("builtin-gc.c: allow disabling all auto-gc'ing by
assigning 0 to gc.auto", 2008-03-19) this unreachable code should have
been removed. We only call too_many_loose_objects() from within
need_to_gc() itself, which will return if this condition holds, and in
cmd_gc() which will return before ever getting to "auto_gc &&
too_many_loose_objects()" if "auto_gc && !need_to_gc()" is true
earlier in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 020f725acc..8c2312681c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -157,9 +157,6 @@ static int too_many_loose_objects(void)
 	int num_loose = 0;
 	int needed = 0;
 
-	if (gc_auto_threshold <= 0)
-		return 0;
-
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
 		return 0;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 2/7] gc: convert to using the_hash_algo
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-15  9:51             ` Duy Nguyen
  2019-03-14 12:34           ` [PATCH v2 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
                             ` (4 subsequent siblings)
  7 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

There's been a lot of changing of the hardcoded "40" values to
the_hash_algo->hexsz, but we've so far missed this one where we
hardcoded 38 for the loose object file length.

This is because a SHA-1 like abcde[...] gets turned into
objects/ab/cde[...]. There's no reason to suppose the same won't be
the case for SHA-256, and reading between the lines in
hash-function-transition.txt the format is planned to be the same.

In the future we may want to further modify this code for the hash
function transition. There's a potential pathological case here where
we'll only consider the loose objects for the currently active hash,
but objects for that hash will share a directory storage with the
other hash.

Thus we could theoretically have 1k SHA-1 loose objects, and say 1
million SHA-256 objects, and not notice because we're currently using
SHA-1.

So assuming that "gc" eventually learns to pack up both SHA-1 and
SHA-256 objects regardless of what the current the_hash_alg is perhaps
this check should be changed to consider all files in objects/17/
matching [0-9a-f] 38 or 62 characters in length (i.e. both SHA-1 and
SHA-256).

But none of that is something we need to worry about now, and
supporting both 38 and 62 characters depending on "the_hash_algo"
removes another case of SHA-1 hardcoding.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 8c2312681c..733bd7bdf4 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
 	int auto_threshold;
 	int num_loose = 0;
 	int needed = 0;
+	const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
 
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
@@ -163,8 +164,8 @@ static int too_many_loose_objects(void)
 
 	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
-		if (strspn(ent->d_name, "0123456789abcdef") != 38 ||
-		    ent->d_name[38] != '\0')
+		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
+		    ent->d_name[hexsz_loose] != '\0')
 			continue;
 		if (++num_loose > auto_threshold) {
 			needed = 1;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 3/7] gc: refactor a "call me once" pattern
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                             ` (2 preceding siblings ...)
  2019-03-14 12:34           ` [PATCH v2 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
                             ` (3 subsequent siblings)
  7 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change an idiom we're using to ensure that gc_before_repack() only
does work once (see 62aad1849f ("gc --auto: do not lock refs in the
background", 2014-05-25)) to be more obvious.

Nothing except this function cares about the "pack_refs" and
"prune_reflogs" variables, so let's not leave the reader wondering if
they're being zero'd out for later use somewhere else.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 733bd7bdf4..ae716a00d4 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -489,14 +489,20 @@ static int report_last_gc_error(void)
 
 static void gc_before_repack(void)
 {
+	/*
+	 * We may be called twice, as both the pre- and
+	 * post-daemonized phases will call us, but running these
+	 * commands more than once is pointless and wasteful.
+	 */
+	static int done = 0;
+	if (done++)
+		return;
+
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, pack_refs_cmd.argv[0]);
 
 	if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, reflog.argv[0]);
-
-	pack_refs = 0;
-	prune_reflogs = 0;
 }
 
 int cmd_gc(int argc, const char **argv, const char *prefix)
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 4/7] reflog tests: make use of "test_config" idiom
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                             ` (3 preceding siblings ...)
  2019-03-14 12:34           ` [PATCH v2 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 5/7] reflog: exit early if there's no work to do Ævar Arnfjörð Bjarmason
                             ` (2 subsequent siblings)
  7 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change a couple of tests that weren't using the helper to use it. This
makes the trailing "--unset" unnecessary.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index ae8a448e34..42f5ac9ed9 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -232,24 +232,21 @@ test_expect_success '--expire=never' '
 '
 
 test_expect_success 'gc.reflogexpire=never' '
+	test_config gc.reflogexpire never &&
+	test_config gc.reflogexpireunreachable never &&
 
-	git config gc.reflogexpire never &&
-	git config gc.reflogexpireunreachable never &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
 
 test_expect_success 'gc.reflogexpire=false' '
+	test_config gc.reflogexpire false &&
+	test_config gc.reflogexpireunreachable false &&
 
-	git config gc.reflogexpire false &&
-	git config gc.reflogexpireunreachable false &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
-	test_line_count = 4 output &&
-
-	git config --unset gc.reflogexpire &&
-	git config --unset gc.reflogexpireunreachable
+	test_line_count = 4 output
 
 '
 
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 5/7] reflog: exit early if there's no work to do
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                             ` (4 preceding siblings ...)
  2019-03-14 12:34           ` [PATCH v2 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-15 10:01             ` Duy Nguyen
  2019-03-14 12:34           ` [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
  2019-03-14 12:34           ` [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
  7 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
and --stale-fix isn't in effect (covered by the first part of the "if"
statement being modified here) we can exit early without pointlessly
looping over all the reflogs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/reflog.c  | 7 +++++++
 t/t1410-reflog.sh | 4 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index 4d3430900d..d95c77ca0e 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -606,6 +606,13 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL);
 		if (flags & EXPIRE_REFLOGS_VERBOSE)
 			putchar('\n');
+	} else if (!cb.cmd.expire_total && !cb.cmd.expire_unreachable) {
+		/*
+		 * If we're not expiring anything and not dropping
+		 * stale entries, there's no point in even opening the
+		 * reflogs, since we're guaranteed to do nothing.
+		 */
+		return 0;
 	}
 
 	if (do_all) {
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index 42f5ac9ed9..b98827f082 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -235,7 +235,9 @@ test_expect_success 'gc.reflogexpire=never' '
 	test_config gc.reflogexpire never &&
 	test_config gc.reflogexpireunreachable never &&
 
-	git reflog expire --verbose --all &&
+	git reflog expire --verbose --all >output &&
+	test_line_count = 0 output &&
+
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                             ` (5 preceding siblings ...)
  2019-03-14 12:34           ` [PATCH v2 5/7] reflog: exit early if there's no work to do Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-15 10:05             ` Duy Nguyen
  2019-03-14 12:34           ` [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
  7 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Don't redundantly run "git reflog expire --all" when gc.reflogExpire
and gc.reflogExpireUnreachable are set to "never".

An earlier change taught "git reflog expire" itself to exit early
under this scenario, so in some sense this isn't strictly
necessary. Reasons to also do it here:

 1) Similar to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    arguments more nicely", 2018-04-21). We'll die early if the config
    variables are set to invalid values. We run "pack-refs" before
    "reflog expire", which can take a while, only to then die on an
    invalid gc.reflogExpire{Unreachable,} configuration.

 2) Not invoking the command at all means it won't show up in trace
    output, which makes what's going on more obvious.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c  | 17 +++++++++++++++++
 t/t6500-gc.sh | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index ae716a00d4..8943bcc300 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -116,6 +116,19 @@ static void process_log_file_on_signal(int signo)
 	raise(signo);
 }
 
+static int gc_config_is_timestamp_never(const char *var)
+{
+	const char *value;
+	timestamp_t expire;
+
+	if (!git_config_get_value(var, &value) && value) {
+		if (parse_expiry_date(value, &expire))
+			die(_("failed to parse '%s' value '%s'"), var, value);
+		return expire == 0;
+	}
+	return 0;
+}
+
 static void gc_config(void)
 {
 	const char *value;
@@ -127,6 +140,10 @@ static void gc_config(void)
 			pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
+	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
+	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
+		prune_reflogs = 0;
+
 	git_config_get_int("gc.aggressivewindow", &aggressive_window);
 	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
 	git_config_get_int("gc.auto", &gc_auto_threshold);
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 4684d06552..7411bf7fec 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -120,6 +120,25 @@ test_expect_success 'gc --quiet' '
 	test_must_be_empty stderr
 '
 
+test_expect_success 'gc.reflogExpire{Unreachable,}=never skips "expire" via "gc"' '
+	test_config gc.reflogExpire never &&
+	test_config gc.reflogExpireUnreachable never &&
+
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+
+	# Check that git-pack-refs is run as a sanity check (done via
+	# gc_before_repack()) but that git-expire is not.
+	grep -E "^trace: (built-in|exec|run_command): git pack-refs --" trace.out &&
+	! grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
+test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "expire" via "gc"' '
+	>trace.out &&
+	test_config gc.reflogExpire never &&
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs
  2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
                             ` (6 preceding siblings ...)
  2019-03-14 12:34           ` [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
@ 2019-03-14 12:34           ` Ævar Arnfjörð Bjarmason
  2019-03-15 11:10             ` Duy Nguyen
  7 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-14 12:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one to expire its reflog by getting a *.lock file on the
corresponding loose ref[1] (even if the actual ref is packed).

This lock is needed, but what isn't needed is locking the loose ref as
a function of the OID we found from that first iteration. By the time
we get around to re-visiting the reference some of the OIDs may have
changed.

Thus the verify_lock() function called by the lock_ref_oid_basic()
function being changed here would fail with e.g. "ref '%s' is at %s
but expected %s" if the repository was being updated concurrent to the
"reflog expire".

By not passing the OID to it we'll try to lock the reference
regardless of it last known OID. Locking as a function of the OID
would make "reflog expire" exit with a non-zero exit status under such
contention, which in turn meant that a "gc" command (which expires
reflogs before forking to the background) would encounter a hard
error.

This behavior of considering the OID when locking has been here ever
since "reflog expire" was initially implemented in 4264dc15e1 ("git
reflog expire", 2006-12-19). As seen in that simpler initial version
of the code we subsequently use the OID to inform the expiry (and
still do), but never needed to use it to lock the reference associated
with the reflog.

By locking the reference without considering what OID we last saw it
at, we won't encounter user-visible contention to the extent that
core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21).

Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:

    (
        cd /tmp &&
        rm -rf git &&
        git init git &&
        cd git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >out &&
            git add out &&
            git commit -m"out"
        done
    )

    (
        cd /tmp &&
        rm -rf git-clone &&
        git clone file:///tmp/git git-clone &&
        cd git-clone &&
        while git pull
        do
            date
        done
    )

    (
        cd /tmp/git-clone &&
        while git reflog expire --all
        do
            date
        done
    )

Before this change the "reflog expire" would fail really quickly with
a "but expected" error. After this change both the "pull" and "reflog
expire" will run for a while, but eventually fail because I get
unlucky with core.filesRefLockTimeout (the "reflog expire" is in a
really tight loop). That can be resolved by being more generous with
higher values of core.filesRefLockTimeout than the 100ms default.

1. https://public-inbox.org/git/54857871.5090805@alum.mit.edu/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index ef053f716c..4d4d226601 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
 	 * reference itself, plus we might need to update the
 	 * reference if --updateref was specified:
 	 */
-	lock = lock_ref_oid_basic(refs, refname, oid,
+	lock = lock_ref_oid_basic(refs, refname, NULL /* NOT oid! */,
 				  NULL, NULL, REF_NO_DEREF,
 				  &type, &err);
 	if (!lock) {
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs
  2019-03-14  4:51             ` Junio C Hamano
@ 2019-03-14 19:26               ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-14 19:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 01:51:35PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Seeing "--stale-fix" again reminded me: that may be the "oops, we can
> > spend tons of CPU walking history" bit that I mentioned in the other
> > part of the thread. But I don't think git-gc would ever run with that
> > option.
> 
> The option was a purely transitional measure to recover from a bad
> reflog state that could have been left by earlier versions of
> "prune" and "repack" that did not pay attention to the reflog.
> Perhaps we should plan to deprecate and remove it by now?

True, though I have definitely used it over the years to clear out
broken reflog entries from corrupted repositories. In most cases we
should be able to do that much more simply these days, though. Since we
try to keep whole segments of history reachable from an otherwise
unreachable object, you should in general be able to just prune entries
for which we are missing the actual object mentioned in the log.

Of course when you are talking about corruption, all rules go out the
window. So it's possible you'd still need --stale-fix to cover really
broken cases.

I think these days I'd more often just delete the whole reflog in such a
case (once upon a time GitHub tried to use never-expiring reflogs as a
sort of audit trail, but it had all sorts of complications; these days
we log to a separate file).

So I wouldn't be terribly sad to see --stale-fix go away.

All that said, I do think --expire-unreachable still has to traverse to
find out what's reachable. And I think it does it under lock. If I
instrument the tempfile code like the patch below and run:

  GIT_TRACE_TEMPFILE=1 git reflog expire --expire-unreachable=now HEAD

on a copy of my git.git repo, I get:

  15:22:12.163725 tempfile.c:127          activating tempfile '/home/peff/compile/foo/.git/HEAD.lock'
  15:22:12.163769 tempfile.c:127          activating tempfile '/home/peff/compile/foo/.git/logs/HEAD.lock'
  15:22:13.053404 tempfile.c:312          renaming tempfile '/home/peff/compile/foo/.git/logs/HEAD.lock' to '/home/peff/compile/foo/.git/logs/HEAD'
  15:22:13.053416 tempfile.c:327          deleting tempfile '/home/peff/compile/foo/.git/HEAD.lock'

We hold HEAD.lock for almost an entire second.

(Actually, it just occurred to me that "strace -tt git ... 2>&1 | grep
HEAD.lock" would produce basically the same results, no patch needed).

-Peff

---
diff --git a/tempfile.c b/tempfile.c
index d43ad8c191..f7e999d3ca 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -53,6 +53,9 @@
 #include "cache.h"
 #include "tempfile.h"
 #include "sigchain.h"
+#include "trace.h"
+
+static struct trace_key trace_tempfile = TRACE_KEY_INIT(TEMPFILE);
 
 static VOLATILE_LIST_HEAD(tempfile_list);
 
@@ -119,6 +122,9 @@ static void activate_tempfile(struct tempfile *tempfile)
 	volatile_list_add(&tempfile->list, &tempfile_list);
 	tempfile->owner = getpid();
 	tempfile->active = 1;
+
+	trace_printf_key(&trace_tempfile, "activating tempfile '%s'",
+			 tempfile->filename.buf);
 }
 
 static void deactivate_tempfile(struct tempfile *tempfile)
@@ -302,6 +308,9 @@ int rename_tempfile(struct tempfile **tempfile_p, const char *path)
 		return -1;
 	}
 
+	trace_printf_key(&trace_tempfile, "renaming tempfile '%s' to '%s'",
+			 tempfile->filename.buf, path);
+
 	deactivate_tempfile(tempfile);
 	*tempfile_p = NULL;
 	return 0;
@@ -314,6 +323,9 @@ void delete_tempfile(struct tempfile **tempfile_p)
 	if (!is_tempfile_active(tempfile))
 		return;
 
+	trace_printf_key(&trace_tempfile, "deleting tempfile '%s'",
+			 tempfile->filename.buf);
+
 	close_tempfile_gently(tempfile);
 	unlink_or_warn(tempfile->filename.buf);
 	deactivate_tempfile(tempfile);

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 2/7] gc: convert to using the_hash_algo
  2019-03-14 12:34           ` [PATCH v2 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-15  9:51             ` Duy Nguyen
  2019-03-15 10:42               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Duy Nguyen @ 2019-03-15  9:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 7:34 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> There's been a lot of changing of the hardcoded "40" values to
> the_hash_algo->hexsz, but we've so far missed this one where we
> hardcoded 38 for the loose object file length.

Wow. Good catch.

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 8c2312681c..733bd7bdf4 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
>         int auto_threshold;
>         int num_loose = 0;
>         int needed = 0;
> +       const unsigned hexsz_loose = the_hash_algo->hexsz - 2;

Since you're removing hard coded numbers. Another option is a
combination of strlen(basename()) and
loose_object_path(the_repository, , null_oid). They should also give
the same 38. Then if loose object format is updated to use 3+ chars
for the directory part, this code will not need update anymore.

The downside of course is a lot more code. Or you can just introduce a
new function to return this "hexsz - 2", keep that function close to
fill_loose_path() with a comment there that the two functions should
be aligned.

>
>         dir = opendir(git_path("objects/17"));
>         if (!dir)
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 5/7] reflog: exit early if there's no work to do
  2019-03-14 12:34           ` [PATCH v2 5/7] reflog: exit early if there's no work to do Ævar Arnfjörð Bjarmason
@ 2019-03-15 10:01             ` Duy Nguyen
  2019-03-15 15:51               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Duy Nguyen @ 2019-03-15 10:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
> and --stale-fix isn't in effect (covered by the first part of the "if"
> statement being modified here) we can exit early without pointlessly
> looping over all the reflogs.
>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/reflog.c  | 7 +++++++
>  t/t1410-reflog.sh | 4 +++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 4d3430900d..d95c77ca0e 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -606,6 +606,13 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>                 mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL);
>                 if (flags & EXPIRE_REFLOGS_VERBOSE)
>                         putchar('\n');
> +       } else if (!cb.cmd.expire_total && !cb.cmd.expire_unreachable) {
> +               /*
> +                * If we're not expiring anything and not dropping
> +                * stale entries, there's no point in even opening the
> +                * reflogs, since we're guaranteed to do nothing.
> +                */

I'm checking should_expire_reflog_ent(). With both of these being
zero, we skip most of the "return 1;" except the last one
cb->cmd.recno, added in 552cecc214 (Teach "git reflog" a subcommand to
delete single entries, 2007-10-17). Will this shortcut affect that use
case (I haven't spent time understanding that commit yet, gotta run
soon)?


> +               return 0;
>         }
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-14 12:34           ` [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
@ 2019-03-15 10:05             ` Duy Nguyen
  2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Duy Nguyen @ 2019-03-15 10:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> @@ -127,6 +140,10 @@ static void gc_config(void)
>                         pack_refs = git_config_bool("gc.packrefs", value);
>         }
>
> +       if (gc_config_is_timestamp_never("gc.reflogexpire") &&
> +           gc_config_is_timestamp_never("gc.reflogexpireunreachable"))

Nit. configset api normalizes the key internally, so we can safely
write gc.reflogExpireUnreachable here, which is a bit easier to read.
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-15 10:05             ` Duy Nguyen
@ 2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
  2019-03-15 10:32                 ` Duy Nguyen
                                   ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 10:24 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder


On Fri, Mar 15 2019, Duy Nguyen wrote:

> On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> @@ -127,6 +140,10 @@ static void gc_config(void)
>>                         pack_refs = git_config_bool("gc.packrefs", value);
>>         }
>>
>> +       if (gc_config_is_timestamp_never("gc.reflogexpire") &&
>> +           gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
>
> Nit. configset api normalizes the key internally, so we can safely
> write gc.reflogExpireUnreachable here, which is a bit easier to read.

I didn't know that, do we want to do that?

I'd think that as a matter of coding style always sticking to lower case
in the C code made more sense, because e.g. you might #define a config
key, and then use it both in git_config_*() as well as via strcmp() in
some callback. Mixing the two would lead to confusion and possible bugs
as we'd refactor the former of those patterns to the latter.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
@ 2019-03-15 10:32                 ` Duy Nguyen
  2019-03-15 15:51                 ` Johannes Schindelin
  2019-03-18  6:04                 ` Junio C Hamano
  2 siblings, 0 replies; 60+ messages in thread
From: Duy Nguyen @ 2019-03-15 10:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Fri, Mar 15, 2019 at 5:24 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Mar 15 2019, Duy Nguyen wrote:
>
> > On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >> @@ -127,6 +140,10 @@ static void gc_config(void)
> >>                         pack_refs = git_config_bool("gc.packrefs", value);
> >>         }
> >>
> >> +       if (gc_config_is_timestamp_never("gc.reflogexpire") &&
> >> +           gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
> >
> > Nit. configset api normalizes the key internally, so we can safely
> > write gc.reflogExpireUnreachable here, which is a bit easier to read.
>
> I didn't know that, do we want to do that?
>
> I'd think that as a matter of coding style always sticking to lower case
> in the C code made more sense, because e.g. you might #define a config
> key, and then use it both in git_config_*() as well as via strcmp() in
> some callback. Mixing the two would lead to confusion and possible bugs
> as we'd refactor the former of those patterns to the latter.

I did a quick git grep on "git_config_get_". The majority is still in
lower case. And given that this feature is not documented anywhere (I
had a quick look at config.h and api-config.txt and ended up checking
the code before the previous mail) yeah it's probably just safer to go
with lowercase. At least until the old callback style is gone.
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 2/7] gc: convert to using the_hash_algo
  2019-03-15  9:51             ` Duy Nguyen
@ 2019-03-15 10:42               ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:49                 ` Johannes Schindelin
  0 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 10:42 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder


On Fri, Mar 15 2019, Duy Nguyen wrote:

> On Thu, Mar 14, 2019 at 7:34 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> There's been a lot of changing of the hardcoded "40" values to
>> the_hash_algo->hexsz, but we've so far missed this one where we
>> hardcoded 38 for the loose object file length.
>
> Wow. Good catch.
>
>> diff --git a/builtin/gc.c b/builtin/gc.c
>> index 8c2312681c..733bd7bdf4 100644
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
>>         int auto_threshold;
>>         int num_loose = 0;
>>         int needed = 0;
>> +       const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
>
> Since you're removing hard coded numbers. Another option is a
> combination of strlen(basename()) and
> loose_object_path(the_repository, , null_oid). They should also give
> the same 38. Then if loose object format is updated to use 3+ chars
> for the directory part, this code will not need update anymore.
>
> The downside of course is a lot more code. Or you can just introduce a
> new function to return this "hexsz - 2", keep that function close to
> fill_loose_path() with a comment there that the two functions should
> be aligned.

I think it's better to just keep hardcoding "2". We're very unlikely to
ever change objects/?? in favor of e.g. objects/???, and if we were that
would be a huge "hash function transition" of its own.

>>
>>         dir = opendir(git_path("objects/17"));
>>         if (!dir)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs
  2019-03-14 12:34           ` [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
@ 2019-03-15 11:10             ` Duy Nguyen
  2019-03-15 15:52               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 60+ messages in thread
From: Duy Nguyen @ 2019-03-15 11:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> During reflog expiry, the cmd_reflog_expire() function first iterates
> over all reflogs in logs/*, and then one-by-one acquires the lock for
> each one to expire its reflog by getting a *.lock file on the
> corresponding loose ref[1] (even if the actual ref is packed).
>
> This lock is needed, but what isn't needed is locking the loose ref as
> a function of the OID we found from that first iteration. By the time
> we get around to re-visiting the reference some of the OIDs may have
> changed.
>
> Thus the verify_lock() function called by the lock_ref_oid_basic()
> function being changed here would fail with e.g. "ref '%s' is at %s
> but expected %s" if the repository was being updated concurrent to the
> "reflog expire".
>
> By not passing the OID to it we'll try to lock the reference
> regardless of it last known OID. Locking as a function of the OID
> would make "reflog expire" exit with a non-zero exit status under such
> contention, which in turn meant that a "gc" command (which expires
> reflogs before forking to the background) would encounter a hard
> error.

Passing NULL oid has another side(?) effect, which I don't know if it
matters at all. Before, the mustexist flag in lock_ref_oid_basic() is
true. Now it's false. This affects refs_resolve_ref_unsafe() calls in
there. But that's where I'm stuck.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index ef053f716c..4d4d226601 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
>          * reference itself, plus we might need to update the
>          * reference if --updateref was specified:
>          */
> -       lock = lock_ref_oid_basic(refs, refname, oid,
> +       lock = lock_ref_oid_basic(refs, refname, NULL /* NOT oid! */,

Maybe mention this not oid in the comment block above. Or just drop
it. Reading this comment without the commit message does not really
help answer "why not oid?". Or perhaps /* don't verify oid */

>                                   NULL, NULL, REF_NO_DEREF,
>                                   &type, &err);
>         if (!lock) {
> --
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 2/7] gc: convert to using the_hash_algo
  2019-03-15 10:42               ` Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:49                 ` Johannes Schindelin
  0 siblings, 0 replies; 60+ messages in thread
From: Johannes Schindelin @ 2019-03-15 15:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Duy Nguyen, Git Mailing List, Junio C Hamano, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

[-- Attachment #1: Type: text/plain, Size: 1738 bytes --]

Hi Ævar,

On Fri, 15 Mar 2019, Ævar Arnfjörð Bjarmason wrote:

> 
> On Fri, Mar 15 2019, Duy Nguyen wrote:
> 
> > On Thu, Mar 14, 2019 at 7:34 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> There's been a lot of changing of the hardcoded "40" values to
> >> the_hash_algo->hexsz, but we've so far missed this one where we
> >> hardcoded 38 for the loose object file length.
> >
> > Wow. Good catch.
> >
> >> diff --git a/builtin/gc.c b/builtin/gc.c
> >> index 8c2312681c..733bd7bdf4 100644
> >> --- a/builtin/gc.c
> >> +++ b/builtin/gc.c
> >> @@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
> >>         int auto_threshold;
> >>         int num_loose = 0;
> >>         int needed = 0;
> >> +       const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
> >
> > Since you're removing hard coded numbers. Another option is a
> > combination of strlen(basename()) and
> > loose_object_path(the_repository, , null_oid). They should also give
> > the same 38. Then if loose object format is updated to use 3+ chars
> > for the directory part, this code will not need update anymore.
> >
> > The downside of course is a lot more code. Or you can just introduce a
> > new function to return this "hexsz - 2", keep that function close to
> > fill_loose_path() with a comment there that the two functions should
> > be aligned.
> 
> I think it's better to just keep hardcoding "2". We're very unlikely to
> ever change objects/?? in favor of e.g. objects/???, and if we were that
> would be a huge "hash function transition" of its own.

Of course, since it is *such* a common use case, we could also simply add
another field in `the_hash_algo`. That would also save on repeated magic
constants.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
  2019-03-15 10:32                 ` Duy Nguyen
@ 2019-03-15 15:51                 ` Johannes Schindelin
  2019-03-18  6:04                 ` Junio C Hamano
  2 siblings, 0 replies; 60+ messages in thread
From: Johannes Schindelin @ 2019-03-15 15:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Duy Nguyen, Git Mailing List, Junio C Hamano, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

[-- Attachment #1: Type: text/plain, Size: 1146 bytes --]

Hi Ævar,

On Fri, 15 Mar 2019, Ævar Arnfjörð Bjarmason wrote:

> 
> On Fri, Mar 15 2019, Duy Nguyen wrote:
> 
> > On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >> @@ -127,6 +140,10 @@ static void gc_config(void)
> >>                         pack_refs = git_config_bool("gc.packrefs", value);
> >>         }
> >>
> >> +       if (gc_config_is_timestamp_never("gc.reflogexpire") &&
> >> +           gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
> >
> > Nit. configset api normalizes the key internally, so we can safely
> > write gc.reflogExpireUnreachable here, which is a bit easier to read.
> 
> I didn't know that, do we want to do that?
> 
> I'd think that as a matter of coding style always sticking to lower case
> in the C code made more sense, because e.g. you might #define a config
> key, and then use it both in git_config_*() as well as via strcmp() in
> some callback. Mixing the two would lead to confusion and possible bugs
> as we'd refactor the former of those patterns to the latter.

It also saves on developer and reviewer time to stick to lower case.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 5/7] reflog: exit early if there's no work to do
  2019-03-15 10:01             ` Duy Nguyen
@ 2019-03-15 15:51               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:51 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder


On Fri, Mar 15 2019, Duy Nguyen wrote:

> On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
>> and --stale-fix isn't in effect (covered by the first part of the "if"
>> statement being modified here) we can exit early without pointlessly
>> looping over all the reflogs.
>>
>> Signed-off-by: Jeff King <peff@peff.net>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/reflog.c  | 7 +++++++
>>  t/t1410-reflog.sh | 4 +++-
>>  2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/builtin/reflog.c b/builtin/reflog.c
>> index 4d3430900d..d95c77ca0e 100644
>> --- a/builtin/reflog.c
>> +++ b/builtin/reflog.c
>> @@ -606,6 +606,13 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>>                 mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL);
>>                 if (flags & EXPIRE_REFLOGS_VERBOSE)
>>                         putchar('\n');
>> +       } else if (!cb.cmd.expire_total && !cb.cmd.expire_unreachable) {
>> +               /*
>> +                * If we're not expiring anything and not dropping
>> +                * stale entries, there's no point in even opening the
>> +                * reflogs, since we're guaranteed to do nothing.
>> +                */
>
> I'm checking should_expire_reflog_ent(). With both of these being
> zero, we skip most of the "return 1;" except the last one
> cb->cmd.recno, added in 552cecc214 (Teach "git reflog" a subcommand to
> delete single entries, 2007-10-17). Will this shortcut affect that use
> case (I haven't spent time understanding that commit yet, gotta run
> soon)?

There was a bug related to this. Fixed in v3.

>
>> +               return 0;
>>         }

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs
  2019-03-15 11:10             ` Duy Nguyen
@ 2019-03-15 15:52               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:52 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Junio C Hamano, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder


On Fri, Mar 15 2019, Duy Nguyen wrote:

> On Thu, Mar 14, 2019 at 7:35 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> During reflog expiry, the cmd_reflog_expire() function first iterates
>> over all reflogs in logs/*, and then one-by-one acquires the lock for
>> each one to expire its reflog by getting a *.lock file on the
>> corresponding loose ref[1] (even if the actual ref is packed).
>>
>> This lock is needed, but what isn't needed is locking the loose ref as
>> a function of the OID we found from that first iteration. By the time
>> we get around to re-visiting the reference some of the OIDs may have
>> changed.
>>
>> Thus the verify_lock() function called by the lock_ref_oid_basic()
>> function being changed here would fail with e.g. "ref '%s' is at %s
>> but expected %s" if the repository was being updated concurrent to the
>> "reflog expire".
>>
>> By not passing the OID to it we'll try to lock the reference
>> regardless of it last known OID. Locking as a function of the OID
>> would make "reflog expire" exit with a non-zero exit status under such
>> contention, which in turn meant that a "gc" command (which expires
>> reflogs before forking to the background) would encounter a hard
>> error.
>
> Passing NULL oid has another side(?) effect, which I don't know if it
> matters at all. Before, the mustexist flag in lock_ref_oid_basic() is
> true. Now it's false. This affects refs_resolve_ref_unsafe() calls in
> there. But that's where I'm stuck.

This case was tricky, I ended up doing the same thing in v3, but now
very carefully explain the rationale since none of it is obvious.

>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index ef053f716c..4d4d226601 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -3037,7 +3037,7 @@ static int files_reflog_expire(struct ref_store *ref_store,
>>          * reference itself, plus we might need to update the
>>          * reference if --updateref was specified:
>>          */
>> -       lock = lock_ref_oid_basic(refs, refname, oid,
>> +       lock = lock_ref_oid_basic(refs, refname, NULL /* NOT oid! */,
>
> Maybe mention this not oid in the comment block above. Or just drop
> it. Reading this comment without the commit message does not really
> help answer "why not oid?". Or perhaps /* don't verify oid */

Also fixed.

>>                                   NULL, NULL, REF_NO_DEREF,
>>                                   &type, &err);
>>         if (!lock) {
>> --

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 0/8] gc: minor code cleanup + contention fixes
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-18  6:13               ` Junio C Hamano
                                 ` (8 more replies)
  2019-03-15 15:59             ` [PATCH v3 1/8] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
                               ` (8 subsequent siblings)
  9 siblings, 9 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Changes since v2:

 * The "let's detect that we don't need work in reflog expire" patch
   is gone, as noted in a new set of tests/commit messages that was
   tricker than expected, and not worth it.

 * Minor rewording fixes to commit messages.

 * Further paraphrased rationale in commit messages from replies to v2
   feedback (thanks all!)

 * The "no OID" and "no mustexist" case was confusingly conflated, now
   we still do that in 8/8, but with a commit message & commit
   explaining why and how that works.

Ævar Arnfjörð Bjarmason (8):
  gc: remove redundant check for gc_auto_threshold
  gc: convert to using the_hash_algo
  gc: refactor a "call me once" pattern
  reflog tests: make use of "test_config" idiom
  reflog tests: test for the "points nowhere" warning
  reflog tests: assert lack of early exit with expiry="never"
  gc: handle & check gc.reflogExpire config
  reflog expire: don't assert the OID when locking refs

 builtin/gc.c         | 37 +++++++++++++++++++++++++++++--------
 refs/files-backend.c | 15 ++++++++++++++-
 t/t1410-reflog.sh    | 25 +++++++++++++++++--------
 t/t6500-gc.sh        | 19 +++++++++++++++++++
 4 files changed, 79 insertions(+), 17 deletions(-)

Range-diff:
1:  f11699d8e77 = 1:  81694c82130 gc: remove redundant check for gc_auto_threshold
2:  e18433f9c6b ! 2:  4bdcf1d0be2 gc: convert to using the_hash_algo
    @@ -17,20 +17,33 @@
         but objects for that hash will share a directory storage with the
         other hash.
     
    -    Thus we could theoretically have 1k SHA-1 loose objects, and say 1
    -    million SHA-256 objects, and not notice because we're currently using
    -    SHA-1.
    +    Thus we could theoretically have e.g. 1k SHA-1 loose objects, and 1
    +    million SHA-256 objects. Then not notice that we need to pack them
    +    because we're currently using SHA-1, even though our FS may be
    +    straining under the stress of such humongous directories.
     
         So assuming that "gc" eventually learns to pack up both SHA-1 and
    -    SHA-256 objects regardless of what the current the_hash_alg is perhaps
    -    this check should be changed to consider all files in objects/17/
    -    matching [0-9a-f] 38 or 62 characters in length (i.e. both SHA-1 and
    -    SHA-256).
    +    SHA-256 objects regardless of what the current the_hash_algo is,
    +    perhaps this check should be changed to consider all files in
    +    objects/17/ matching [0-9a-f] 38 or 62 characters in length (i.e. both
    +    SHA-1 and SHA-256).
     
         But none of that is something we need to worry about now, and
         supporting both 38 and 62 characters depending on "the_hash_algo"
         removes another case of SHA-1 hardcoding.
     
    +    As noted in [1] I'm making no effort to somehow remove the hardcoding
    +    for "2" as in "use the first two hexdigits for the directory
    +    name". There's no indication that that'll ever change, and somehow
    +    generalizing it here would be a drop in the ocean, so there's no point
    +    in doing that. It also couldn't be done without coming up with some
    +    generalized version of the magical "objects/17" directory. See [2] for
    +    a discussion of that directory.
    +
    +    1. https://public-inbox.org/git/874l84ber7.fsf@evledraar.gmail.com/
    +
    +    2. https://public-inbox.org/git/87k1mta9x5.fsf@evledraar.gmail.com/
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      diff --git a/builtin/gc.c b/builtin/gc.c
3:  54e4bce91c3 = 3:  9444a1233af gc: refactor a "call me once" pattern
4:  82f87db1348 = 4:  60a06ae6185 reflog tests: make use of "test_config" idiom
-:  ----------- > 5:  52838fdc449 reflog tests: test for the "points nowhere" warning
5:  c79608dbbb3 ! 6:  6063429f108 reflog: exit early if there's no work to do
    @@ -1,32 +1,25 @@
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    -    reflog: exit early if there's no work to do
    +    reflog tests: assert lack of early exit with expiry="never"
     
         When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
         and --stale-fix isn't in effect (covered by the first part of the "if"
    -    statement being modified here) we can exit early without pointlessly
    -    looping over all the reflogs.
    +    statement being modified here) we *could* exit early without
    +    pointlessly looping over all the reflogs.
     
    -    Signed-off-by: Jeff King <peff@peff.net>
    -    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    However, as an earlier change to add a test for the "points nowhere"
    +    warning shows even in such a mode we might want to print out a
    +    warning.
     
    - diff --git a/builtin/reflog.c b/builtin/reflog.c
    - --- a/builtin/reflog.c
    - +++ b/builtin/reflog.c
    -@@
    - 		mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL);
    - 		if (flags & EXPIRE_REFLOGS_VERBOSE)
    - 			putchar('\n');
    -+	} else if (!cb.cmd.expire_total && !cb.cmd.expire_unreachable) {
    -+		/*
    -+		 * If we're not expiring anything and not dropping
    -+		 * stale entries, there's no point in even opening the
    -+		 * reflogs, since we're guaranteed to do nothing.
    -+		 */
    -+		return 0;
    - 	}
    - 
    - 	if (do_all) {
    +    So while it's conceivable to implement this, I don't think it's worth
    +    it. It's going to be too easy to inadvertently add some flag that'll
    +    make the expiry happen anyway, and even with "never" we'd like to see
    +    all the lines we're going to keep.
    +
    +    So let's assert that we're going to loop over all the references even
    +    when this configuration is in effect.
    +
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
      --- a/t/t1410-reflog.sh
    @@ -37,7 +30,7 @@
      
     -	git reflog expire --verbose --all &&
     +	git reflog expire --verbose --all >output &&
    -+	test_line_count = 0 output &&
    ++	test_line_count = 9 output &&
     +
      	git reflog refs/heads/master >output &&
      	test_line_count = 4 output
6:  c47dedab58d ! 7:  6693d1d84da gc: don't run "reflog expire" when keeping reflogs
    @@ -1,22 +1,38 @@
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    -    gc: don't run "reflog expire" when keeping reflogs
    +    gc: handle & check gc.reflogExpire config
     
         Don't redundantly run "git reflog expire --all" when gc.reflogExpire
    -    and gc.reflogExpireUnreachable are set to "never".
    +    and gc.reflogExpireUnreachable are set to "never", and die immediately
    +    if those configuration valuer are bad.
     
    -    An earlier change taught "git reflog expire" itself to exit early
    -    under this scenario, so in some sense this isn't strictly
    -    necessary. Reasons to also do it here:
    +    As an earlier "assert lack of early exit" change to the tests for "git
    +    reflog expire" shows, an early check of gc.reflogExpire{Unreachable,}
    +    isn't wanted in general for "git reflog expire", but it makes sense
    +    for "gc" because:
     
    -     1) Similar to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    -        arguments more nicely", 2018-04-21). We'll die early if the config
    -        variables are set to invalid values. We run "pack-refs" before
    -        "reflog expire", which can take a while, only to then die on an
    -        invalid gc.reflogExpire{Unreachable,} configuration.
    +     1) Similarly to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    +        arguments more nicely", 2018-04-21) we'll now die early if the
    +        config variables are set to invalid values.
    +
    +        We run "pack-refs" before "reflog expire", which can take a while,
    +        only to then die on an invalid gc.reflogExpire{Unreachable,}
    +        configuration.
     
          2) Not invoking the command at all means it won't show up in trace
    -        output, which makes what's going on more obvious.
    +        output, which makes what's going on more obvious when the two are
    +        set to "never".
    +
    +     3) As a later change documents we lock the refs when looping over the
    +        refs to expire, even in cases where we end up doing nothing due to
    +        this config.
    +
    +        For the reasons noted in the earlier "assert lack of early exit"
    +        change I don't think it's worth it to bend over backwards in "git
    +        reflog expire" itself to carefully detect if we'll really do
    +        nothing given the combination of all its possible options and skip
    +        that locking, but that's easy to detect here in "gc" where we'll
    +        only run "reflog expire" in a relatively simple mode.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
7:  55dd203a042 ! 8:  e0814569aba reflog expire: don't assert the OID when locking refs
    @@ -79,6 +79,25 @@
         really tight loop). That can be resolved by being more generous with
         higher values of core.filesRefLockTimeout than the 100ms default.
     
    +    As noted in the commentary being added here we also need to handle the
    +    case of references being racily deleted, that can be tested by adding
    +    this to the above:
    +
    +        (
    +            cd /tmp/git-clone &&
    +            while git branch topic master && git branch -D topic
    +            do
    +                date
    +            done
    +        )
    +
    +    We could change lock_ref_oid_basic() to always pass down
    +    RESOLVE_REF_READING to refs_resolve_ref_unsafe() and then
    +    files_reflog_expire() to detect the "is it deleted?" state. But let's
    +    not bother, in the event of such a race we're going to redundantly
    +    create a lock on the deleted reference, and shortly afterwards handle
    +    that case and others with the refs_reflog_exists() check.
    +
         1. https://public-inbox.org/git/54857871.5090805@alum.mit.edu/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    @@ -87,11 +106,32 @@
      --- a/refs/files-backend.c
      +++ b/refs/files-backend.c
     @@
    + 	 * The reflog file is locked by holding the lock on the
      	 * reference itself, plus we might need to update the
      	 * reference if --updateref was specified:
    ++	 *
    ++	 * We don't pass down the oid here because we'd like to be
    ++	 * tolerant to the OID of the ref having changed, and to
    ++	 * gracefully handle the case where it's been deleted (see oid
    ++	 * -> mustexist -> RESOLVE_REF_READING in
    ++	 * lock_ref_oid_basic()) ...
      	 */
     -	lock = lock_ref_oid_basic(refs, refname, oid,
    -+	lock = lock_ref_oid_basic(refs, refname, NULL /* NOT oid! */,
    ++	lock = lock_ref_oid_basic(refs, refname, NULL,
      				  NULL, NULL, REF_NO_DEREF,
      				  &type, &err);
      	if (!lock) {
    +@@
    + 		strbuf_release(&err);
    + 		return -1;
    + 	}
    ++	/*
    ++	 * When refs are deleted their reflog is deleted before the
    ++	 * loose ref is deleted. This catches that case, i.e. when
    ++	 * racing against a ref deletion lock_ref_oid_basic() will
    ++	 * have acquired a lock on the now-deleted ref, but here's
    ++	 * where we find out it has no reflog anymore.
    ++	 */
    + 	if (!refs_reflog_exists(ref_store, refname)) {
    + 		unlock_ref(lock);
    + 		return 0;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 1/8] gc: remove redundant check for gc_auto_threshold
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 2/8] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
                               ` (7 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Checking gc_auto_threshold in too_many_loose_objects() was added in
17815501a8 ("git-gc --auto: run "repack -A -d -l" as necessary.",
2007-09-17) when need_to_gc() itself was also reliant on
gc_auto_pack_limit before its early return:

    gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0

When that check was simplified to just checking "gc_auto_threshold <=
0" in b14d255ba8 ("builtin-gc.c: allow disabling all auto-gc'ing by
assigning 0 to gc.auto", 2008-03-19) this unreachable code should have
been removed. We only call too_many_loose_objects() from within
need_to_gc() itself, which will return if this condition holds, and in
cmd_gc() which will return before ever getting to "auto_gc &&
too_many_loose_objects()" if "auto_gc && !need_to_gc()" is true
earlier in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 020f725acc4..8c2312681ce 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -157,9 +157,6 @@ static int too_many_loose_objects(void)
 	int num_loose = 0;
 	int needed = 0;
 
-	if (gc_auto_threshold <= 0)
-		return 0;
-
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
 		return 0;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 2/8] gc: convert to using the_hash_algo
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 1/8] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 3/8] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
                               ` (6 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

There's been a lot of changing of the hardcoded "40" values to
the_hash_algo->hexsz, but we've so far missed this one where we
hardcoded 38 for the loose object file length.

This is because a SHA-1 like abcde[...] gets turned into
objects/ab/cde[...]. There's no reason to suppose the same won't be
the case for SHA-256, and reading between the lines in
hash-function-transition.txt the format is planned to be the same.

In the future we may want to further modify this code for the hash
function transition. There's a potential pathological case here where
we'll only consider the loose objects for the currently active hash,
but objects for that hash will share a directory storage with the
other hash.

Thus we could theoretically have e.g. 1k SHA-1 loose objects, and 1
million SHA-256 objects. Then not notice that we need to pack them
because we're currently using SHA-1, even though our FS may be
straining under the stress of such humongous directories.

So assuming that "gc" eventually learns to pack up both SHA-1 and
SHA-256 objects regardless of what the current the_hash_algo is,
perhaps this check should be changed to consider all files in
objects/17/ matching [0-9a-f] 38 or 62 characters in length (i.e. both
SHA-1 and SHA-256).

But none of that is something we need to worry about now, and
supporting both 38 and 62 characters depending on "the_hash_algo"
removes another case of SHA-1 hardcoding.

As noted in [1] I'm making no effort to somehow remove the hardcoding
for "2" as in "use the first two hexdigits for the directory
name". There's no indication that that'll ever change, and somehow
generalizing it here would be a drop in the ocean, so there's no point
in doing that. It also couldn't be done without coming up with some
generalized version of the magical "objects/17" directory. See [2] for
a discussion of that directory.

1. https://public-inbox.org/git/874l84ber7.fsf@evledraar.gmail.com/

2. https://public-inbox.org/git/87k1mta9x5.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 8c2312681ce..733bd7bdf46 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
 	int auto_threshold;
 	int num_loose = 0;
 	int needed = 0;
+	const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
 
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
@@ -163,8 +164,8 @@ static int too_many_loose_objects(void)
 
 	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
-		if (strspn(ent->d_name, "0123456789abcdef") != 38 ||
-		    ent->d_name[38] != '\0')
+		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
+		    ent->d_name[hexsz_loose] != '\0')
 			continue;
 		if (++num_loose > auto_threshold) {
 			needed = 1;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 3/8] gc: refactor a "call me once" pattern
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 2/8] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 4/8] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
                               ` (5 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change an idiom we're using to ensure that gc_before_repack() only
does work once (see 62aad1849f ("gc --auto: do not lock refs in the
background", 2014-05-25)) to be more obvious.

Nothing except this function cares about the "pack_refs" and
"prune_reflogs" variables, so let's not leave the reader wondering if
they're being zero'd out for later use somewhere else.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 733bd7bdf46..ae716a00d4a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -489,14 +489,20 @@ static int report_last_gc_error(void)
 
 static void gc_before_repack(void)
 {
+	/*
+	 * We may be called twice, as both the pre- and
+	 * post-daemonized phases will call us, but running these
+	 * commands more than once is pointless and wasteful.
+	 */
+	static int done = 0;
+	if (done++)
+		return;
+
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, pack_refs_cmd.argv[0]);
 
 	if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, reflog.argv[0]);
-
-	pack_refs = 0;
-	prune_reflogs = 0;
 }
 
 int cmd_gc(int argc, const char **argv, const char *prefix)
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 4/8] reflog tests: make use of "test_config" idiom
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 3/8] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 5/8] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
                               ` (4 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change a couple of tests that weren't using the helper to use it. This
makes the trailing "--unset" unnecessary.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index ae8a448e343..42f5ac9ed95 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -232,24 +232,21 @@ test_expect_success '--expire=never' '
 '
 
 test_expect_success 'gc.reflogexpire=never' '
+	test_config gc.reflogexpire never &&
+	test_config gc.reflogexpireunreachable never &&
 
-	git config gc.reflogexpire never &&
-	git config gc.reflogexpireunreachable never &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
 
 test_expect_success 'gc.reflogexpire=false' '
+	test_config gc.reflogexpire false &&
+	test_config gc.reflogexpireunreachable false &&
 
-	git config gc.reflogexpire false &&
-	git config gc.reflogexpireunreachable false &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
-	test_line_count = 4 output &&
-
-	git config --unset gc.reflogexpire &&
-	git config --unset gc.reflogexpireunreachable
+	test_line_count = 4 output
 
 '
 
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 5/8] reflog tests: test for the "points nowhere" warning
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 4/8] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
                               ` (3 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

The "git reflog expire" command when given an unknown reference has
since 4264dc15e1 ("git reflog expire", 2006-12-19) when this command
was implemented emit an error, but this has never been tested for.

Let's test for it, also under gc.reflogExpire{Unreachable,}=never in
case a future change is tempted to take shortcuts in the presence of
such config.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index 42f5ac9ed95..e8f8ac97856 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -250,6 +250,16 @@ test_expect_success 'gc.reflogexpire=false' '
 
 '
 
+test_expect_success 'git reflog expire unknown reference' '
+	test_config gc.reflogexpire never &&
+	test_config gc.reflogexpireunreachable never &&
+
+	test_must_fail git reflog expire master@{123} 2>stderr &&
+	test_i18ngrep "points nowhere" stderr &&
+	test_must_fail git reflog expire does-not-exist 2>stderr &&
+	test_i18ngrep "points nowhere" stderr
+'
+
 test_expect_success 'checkout should not delete log for packed ref' '
 	test $(git reflog master | wc -l) = 4 &&
 	git branch foo &&
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never"
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (5 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 5/8] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-19  6:20               ` Jeff King
  2019-03-15 15:59             ` [PATCH v3 7/8] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  9 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
and --stale-fix isn't in effect (covered by the first part of the "if"
statement being modified here) we *could* exit early without
pointlessly looping over all the reflogs.

However, as an earlier change to add a test for the "points nowhere"
warning shows even in such a mode we might want to print out a
warning.

So while it's conceivable to implement this, I don't think it's worth
it. It's going to be too easy to inadvertently add some flag that'll
make the expiry happen anyway, and even with "never" we'd like to see
all the lines we're going to keep.

So let's assert that we're going to loop over all the references even
when this configuration is in effect.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index e8f8ac97856..79f731db37c 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -235,7 +235,9 @@ test_expect_success 'gc.reflogexpire=never' '
 	test_config gc.reflogexpire never &&
 	test_config gc.reflogexpireunreachable never &&
 
-	git reflog expire --verbose --all &&
+	git reflog expire --verbose --all >output &&
+	test_line_count = 9 output &&
+
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 7/8] gc: handle & check gc.reflogExpire config
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (6 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
  2019-03-15 15:59             ` [PATCH v3 8/8] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
  2019-03-19  6:27             ` [PATCH v2 0/7] gc: minor code cleanup + contention fixes Jeff King
  9 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Don't redundantly run "git reflog expire --all" when gc.reflogExpire
and gc.reflogExpireUnreachable are set to "never", and die immediately
if those configuration valuer are bad.

As an earlier "assert lack of early exit" change to the tests for "git
reflog expire" shows, an early check of gc.reflogExpire{Unreachable,}
isn't wanted in general for "git reflog expire", but it makes sense
for "gc" because:

 1) Similarly to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    arguments more nicely", 2018-04-21) we'll now die early if the
    config variables are set to invalid values.

    We run "pack-refs" before "reflog expire", which can take a while,
    only to then die on an invalid gc.reflogExpire{Unreachable,}
    configuration.

 2) Not invoking the command at all means it won't show up in trace
    output, which makes what's going on more obvious when the two are
    set to "never".

 3) As a later change documents we lock the refs when looping over the
    refs to expire, even in cases where we end up doing nothing due to
    this config.

    For the reasons noted in the earlier "assert lack of early exit"
    change I don't think it's worth it to bend over backwards in "git
    reflog expire" itself to carefully detect if we'll really do
    nothing given the combination of all its possible options and skip
    that locking, but that's easy to detect here in "gc" where we'll
    only run "reflog expire" in a relatively simple mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c  | 17 +++++++++++++++++
 t/t6500-gc.sh | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index ae716a00d4a..8943bcc300d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -116,6 +116,19 @@ static void process_log_file_on_signal(int signo)
 	raise(signo);
 }
 
+static int gc_config_is_timestamp_never(const char *var)
+{
+	const char *value;
+	timestamp_t expire;
+
+	if (!git_config_get_value(var, &value) && value) {
+		if (parse_expiry_date(value, &expire))
+			die(_("failed to parse '%s' value '%s'"), var, value);
+		return expire == 0;
+	}
+	return 0;
+}
+
 static void gc_config(void)
 {
 	const char *value;
@@ -127,6 +140,10 @@ static void gc_config(void)
 			pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
+	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
+	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
+		prune_reflogs = 0;
+
 	git_config_get_int("gc.aggressivewindow", &aggressive_window);
 	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
 	git_config_get_int("gc.auto", &gc_auto_threshold);
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 4684d06552a..7411bf7fecd 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -120,6 +120,25 @@ test_expect_success 'gc --quiet' '
 	test_must_be_empty stderr
 '
 
+test_expect_success 'gc.reflogExpire{Unreachable,}=never skips "expire" via "gc"' '
+	test_config gc.reflogExpire never &&
+	test_config gc.reflogExpireUnreachable never &&
+
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+
+	# Check that git-pack-refs is run as a sanity check (done via
+	# gc_before_repack()) but that git-expire is not.
+	grep -E "^trace: (built-in|exec|run_command): git pack-refs --" trace.out &&
+	! grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
+test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "expire" via "gc"' '
+	>trace.out &&
+	test_config gc.reflogExpire never &&
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v3 8/8] reflog expire: don't assert the OID when locking refs
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (7 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 7/8] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
@ 2019-03-15 15:59             ` Ævar Arnfjörð Bjarmason
       [not found]               ` <b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu>
  2019-03-19  6:27             ` [PATCH v2 0/7] gc: minor code cleanup + contention fixes Jeff King
  9 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-15 15:59 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one to expire its reflog by getting a *.lock file on the
corresponding loose ref[1] (even if the actual ref is packed).

This lock is needed, but what isn't needed is locking the loose ref as
a function of the OID we found from that first iteration. By the time
we get around to re-visiting the reference some of the OIDs may have
changed.

Thus the verify_lock() function called by the lock_ref_oid_basic()
function being changed here would fail with e.g. "ref '%s' is at %s
but expected %s" if the repository was being updated concurrent to the
"reflog expire".

By not passing the OID to it we'll try to lock the reference
regardless of it last known OID. Locking as a function of the OID
would make "reflog expire" exit with a non-zero exit status under such
contention, which in turn meant that a "gc" command (which expires
reflogs before forking to the background) would encounter a hard
error.

This behavior of considering the OID when locking has been here ever
since "reflog expire" was initially implemented in 4264dc15e1 ("git
reflog expire", 2006-12-19). As seen in that simpler initial version
of the code we subsequently use the OID to inform the expiry (and
still do), but never needed to use it to lock the reference associated
with the reflog.

By locking the reference without considering what OID we last saw it
at, we won't encounter user-visible contention to the extent that
core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21).

Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:

    (
        cd /tmp &&
        rm -rf git &&
        git init git &&
        cd git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >out &&
            git add out &&
            git commit -m"out"
        done
    )

    (
        cd /tmp &&
        rm -rf git-clone &&
        git clone file:///tmp/git git-clone &&
        cd git-clone &&
        while git pull
        do
            date
        done
    )

    (
        cd /tmp/git-clone &&
        while git reflog expire --all
        do
            date
        done
    )

Before this change the "reflog expire" would fail really quickly with
a "but expected" error. After this change both the "pull" and "reflog
expire" will run for a while, but eventually fail because I get
unlucky with core.filesRefLockTimeout (the "reflog expire" is in a
really tight loop). That can be resolved by being more generous with
higher values of core.filesRefLockTimeout than the 100ms default.

As noted in the commentary being added here we also need to handle the
case of references being racily deleted, that can be tested by adding
this to the above:

    (
        cd /tmp/git-clone &&
        while git branch topic master && git branch -D topic
        do
            date
        done
    )

We could change lock_ref_oid_basic() to always pass down
RESOLVE_REF_READING to refs_resolve_ref_unsafe() and then
files_reflog_expire() to detect the "is it deleted?" state. But let's
not bother, in the event of such a race we're going to redundantly
create a lock on the deleted reference, and shortly afterwards handle
that case and others with the refs_reflog_exists() check.

1. https://public-inbox.org/git/54857871.5090805@alum.mit.edu/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 refs/files-backend.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index ef053f716c3..c7ed1792b3b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3036,8 +3036,14 @@ static int files_reflog_expire(struct ref_store *ref_store,
 	 * The reflog file is locked by holding the lock on the
 	 * reference itself, plus we might need to update the
 	 * reference if --updateref was specified:
+	 *
+	 * We don't pass down the oid here because we'd like to be
+	 * tolerant to the OID of the ref having changed, and to
+	 * gracefully handle the case where it's been deleted (see oid
+	 * -> mustexist -> RESOLVE_REF_READING in
+	 * lock_ref_oid_basic()) ...
 	 */
-	lock = lock_ref_oid_basic(refs, refname, oid,
+	lock = lock_ref_oid_basic(refs, refname, NULL,
 				  NULL, NULL, REF_NO_DEREF,
 				  &type, &err);
 	if (!lock) {
@@ -3045,6 +3051,13 @@ static int files_reflog_expire(struct ref_store *ref_store,
 		strbuf_release(&err);
 		return -1;
 	}
+	/*
+	 * When refs are deleted their reflog is deleted before the
+	 * loose ref is deleted. This catches that case, i.e. when
+	 * racing against a ref deletion lock_ref_oid_basic() will
+	 * have acquired a lock on the now-deleted ref, but here's
+	 * where we find out it has no reflog anymore.
+	 */
 	if (!refs_reflog_exists(ref_store, refname)) {
 		unlock_ref(lock);
 		return 0;
-- 
2.21.0.360.g471c308f928


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs
  2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
  2019-03-15 10:32                 ` Duy Nguyen
  2019-03-15 15:51                 ` Johannes Schindelin
@ 2019-03-18  6:04                 ` Junio C Hamano
  2 siblings, 0 replies; 60+ messages in thread
From: Junio C Hamano @ 2019-03-18  6:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Duy Nguyen, Git Mailing List, Jeff King, Michael Haggerty,
	Stefan Beller, Jonathan Nieder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>> +       if (gc_config_is_timestamp_never("gc.reflogexpire") &&
>>> +           gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
>>
>> Nit. configset api normalizes the key internally, so we can safely
>> write gc.reflogExpireUnreachable here, which is a bit easier to read.
>
> I didn't know that, do we want to do that?
>
> I'd think that as a matter of coding style always sticking to lower case
> in the C code made more sense, because e.g. you might #define a config
> key, and then use it both in git_config_*() as well as via strcmp() in
> some callback. Mixing the two would lead to confusion and possible bugs
> as we'd refactor the former of those patterns to the latter.

When parsing end-user supplied value in a "const char *var", with

    #define REU "gc.reflogexpireunreacahble"

strcmp(var, REU) is wrong.  You must use strcasecmp(var, REU)
because the end-user supplied value may be in camelCase.

And once the code uses strcasecmp(var, REU), using camelCase

    #define REU "gc.reflogExpireUnreacahble"

to compare against would not break it, either.  The only difference
is that being able to see the word boundary may make it easier to
read.

So even when git_config-*() and str*cmp() are used in a mixed way, I
do not think it is a good justification to make the string constant
all lowercase (assuming that camel is easier than all lowercase to
read, that is).

Parsing three-level names is tricky.  Without breaking it down into
components first, you cannot use strcmp nor strcasecmp; comparing
the whole variable name with either function is wrong.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 0/8] gc: minor code cleanup + contention fixes
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
@ 2019-03-18  6:13               ` Junio C Hamano
  2019-03-28 16:14               ` [PATCH v4 0/7] gc: tests and handle reflog expire config Ævar Arnfjörð Bjarmason
                                 ` (7 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Junio C Hamano @ 2019-03-18  6:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Changes since v2:
>
>  * The "let's detect that we don't need work in reflog expire" patch
>    is gone, as noted in a new set of tests/commit messages that was
>    tricker than expected, and not worth it.
>
>  * Minor rewording fixes to commit messages.
>
>  * Further paraphrased rationale in commit messages from replies to v2
>    feedback (thanks all!)
>
>  * The "no OID" and "no mustexist" case was confusingly conflated, now
>    we still do that in 8/8, but with a commit message & commit
>    explaining why and how that works.

All made sense (admittedly from a cursory look).  Will queue.

Thanks, all.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never"
  2019-03-15 15:59             ` [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
@ 2019-03-19  6:20               ` Jeff King
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-19  6:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Fri, Mar 15, 2019 at 04:59:57PM +0100, Ævar Arnfjörð Bjarmason wrote:

> When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
> and --stale-fix isn't in effect (covered by the first part of the "if"
> statement being modified here) we *could* exit early without
> pointlessly looping over all the reflogs.

Er, which "if" statement are we modifying here? :)

I assume this is leftover from the earlier attempt which actually did
modify it.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 0/7] gc: minor code cleanup + contention fixes
  2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                               ` (8 preceding siblings ...)
  2019-03-15 15:59             ` [PATCH v3 8/8] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
@ 2019-03-19  6:27             ` Jeff King
  9 siblings, 0 replies; 60+ messages in thread
From: Jeff King @ 2019-03-19  6:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Michael Haggerty, Stefan Beller, Jonathan Nieder

On Thu, Mar 14, 2019 at 01:34:32PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Addresse Peff's comments to v1. For a couple of patches I've faked up
> his SOB where I copy/pasted a comment or code from a v1 comment
> verbatim. Will see if he acks that.

Yep, for the record, those are fine (actually, I guess one of them got
axed in v3).

> The main change is a better commit message in the last patch (now
> 7/7), and 2x new "reflog" patches to make it early exit in addition to
> "gc" when there's nothing to do.
> 
> There was the "why do it at all in gc" feedback on 6/7 in v1. I've
> adjusted the commit message there to justify it, but we'll see what
> Peff things about it this time around...

Even after reading your v3 reasoning, I still think it would probably be
OK for git-reflog to treat the "never" case as a noop. But frankly it's
not worth even spending any more brain cycles on. I can't imagine why
somebody would bother to run "reflog expire" without any expiration,
outside of git-gc.

The improved commit messages all made sense to me (including the NULL
oid bits in the final one), with the exception of one nit I pointed out
in one of the v3 messages.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v3 8/8] reflog expire: don't assert the OID when locking refs
       [not found]               ` <b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu>
@ 2019-03-21 14:10                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-21 14:10 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: git, Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Jeff King, Stefan Beller, Jonathan Nieder


On Tue, Mar 19 2019, Michael Haggerty wrote:

> Thanks for your work and for your thorough explanation of the change!

Hi. Yes, thanks a lot for the feedback. Just hadn't gotten around to
looping back to this yet & digging into the issue you raised.

> On 3/15/19 4:59 PM, Ævar Arnfjörð Bjarmason wrote:
>> During reflog expiry, the cmd_reflog_expire() function first iterates
>> over all reflogs in logs/*, and then one-by-one acquires the lock for
>> each one to expire its reflog by getting a *.lock file on the
>> corresponding loose ref[1] (even if the actual ref is packed).
>>
>> This lock is needed, but what isn't needed is locking the loose ref as
>> a function of the OID we found from that first iteration. By the time
>> we get around to re-visiting the reference some of the OIDs may have
>> changed.
>
> Instead of "what isn't needed is locking the loose ref as a function of
> the OID we found from that first iteration", I suggest "what isn't
> needed is to insist that the reference still has the OID that we found
> in that first iteration".
>
>> Thus the verify_lock() function called by the lock_ref_oid_basic()
>> function being changed here would fail with e.g. "ref '%s' is at %s
>> but expected %s" if the repository was being updated concurrent to the
>> "reflog expire".
>>
>> By not passing the OID to it we'll try to lock the reference
>> regardless of it last known OID. Locking as a function of the OID
>
> s/it/its/
>
>> would make "reflog expire" exit with a non-zero exit status under such
>> contention, which in turn meant that a "gc" command (which expires
>> reflogs before forking to the background) would encounter a hard
>> error.
>
> The last sentence seems mostly redundant with the previous paragraph.
>
>> This behavior of considering the OID when locking has been here ever
>> since "reflog expire" was initially implemented in 4264dc15e1 ("git
>> reflog expire", 2006-12-19). As seen in that simpler initial version
>> of the code we subsequently use the OID to inform the expiry (and
>> still do), but never needed to use it to lock the reference associated
>> with the reflog.
>>
>> By locking the reference without considering what OID we last saw it
>> at, we won't encounter user-visible contention to the extent that
>> core.filesRefLockTimeout mitigates it. See 4ff0f01cb7 ("refs: retry
>> acquiring reference locks for 100ms", 2017-08-21).
>>
>> Unfortunately this sort of probabilistic contention is hard to turn
>> into a test. I've tested this by running the following three subshells
>> in concurrent terminals:
>>
>>     (
>>         cd /tmp &&
>>         rm -rf git &&
>>         git init git &&
>>         cd git &&
>>         while true
>>         do
>>             head -c 10 /dev/urandom | hexdump >out &&
>>             git add out &&
>>             git commit -m"out"
>>         done
>>     )
>>
>>     (
>>         cd /tmp &&
>>         rm -rf git-clone &&
>>         git clone file:///tmp/git git-clone &&
>>         cd git-clone &&
>>         while git pull
>>         do
>>             date
>>         done
>>     )
>>
>>     (
>>         cd /tmp/git-clone &&
>>         while git reflog expire --all
>>         do
>>             date
>>         done
>>     )
>>
>> Before this change the "reflog expire" would fail really quickly with
>> a "but expected" error. After this change both the "pull" and "reflog
>> expire" will run for a while, but eventually fail because I get
>> unlucky with core.filesRefLockTimeout (the "reflog expire" is in a
>> really tight loop). That can be resolved by being more generous with
>> higher values of core.filesRefLockTimeout than the 100ms default.
>>
>> As noted in the commentary being added here we also need to handle the
>> case of references being racily deleted, that can be tested by adding
>> this to the above:
>>
>>     (
>>         cd /tmp/git-clone &&
>>         while git branch topic master && git branch -D topic
>>         do
>>             date
>>         done
>>     )
>>
>> We could change lock_ref_oid_basic() to always pass down
>> RESOLVE_REF_READING to refs_resolve_ref_unsafe() and then
>> files_reflog_expire() to detect the "is it deleted?" state. But let's
>> not bother, in the event of such a race we're going to redundantly
>> create a lock on the deleted reference, and shortly afterwards handle
>> that case and others with the refs_reflog_exists() check.
>>
>> 1. https://public-inbox.org/git/54857871.5090805@alum.mit.edu/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  refs/files-backend.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index ef053f716c3..c7ed1792b3b 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -3036,8 +3036,14 @@ static int files_reflog_expire(struct ref_store *ref_store,
>>  	 * The reflog file is locked by holding the lock on the
>>  	 * reference itself, plus we might need to update the
>>  	 * reference if --updateref was specified:
>> +	 *
>> +	 * We don't pass down the oid here because we'd like to be
>> +	 * tolerant to the OID of the ref having changed, and to
>> +	 * gracefully handle the case where it's been deleted (see oid
>> +	 * -> mustexist -> RESOLVE_REF_READING in
>> +	 * lock_ref_oid_basic()) ...
>>  	 */
>> -	lock = lock_ref_oid_basic(refs, refname, oid,
>> +	lock = lock_ref_oid_basic(refs, refname, NULL,
>>  				  NULL, NULL, REF_NO_DEREF,
>>  				  &type, &err);
>
> This seems totally reasonable. But then later, where `oid` is passed to
> `(*prepare_fn)()`, I think you must pass `&(lock->old_oid)` instead,
> since we no longer have a guarantee that `oid` reflects the correct
> state of the reference. And after that, there is no need for this
> function to take an `oid` parameter at all (which also makes sense from
> an abstract point of view). Which means that the signatures of
> `refs_reflog_expire()`, `reflog_expire()`, `packed_reflog_expire()`, and
> `reflog_expire_fn` can also be changed, along with callers.
>
> I haven't had time yet to inspect those callers to see whether they
> might actually care that the `oid` that they used to pass to
> `reflog_expire()` isn't necessarily the one that gets passed back to
> their callbacks, but following the trail that I just outlined should
> make it possible to determine that.
>
>>  	if (!lock) {
>> @@ -3045,6 +3051,13 @@ static int files_reflog_expire(struct ref_store *ref_store,
>>  		strbuf_release(&err);
>>  		return -1;
>>  	}
>> +	/*
>> +	 * When refs are deleted their reflog is deleted before the
>> +	 * loose ref is deleted. This catches that case, i.e. when
>> +	 * racing against a ref deletion lock_ref_oid_basic() will
>> +	 * have acquired a lock on the now-deleted ref, but here's
>> +	 * where we find out it has no reflog anymore.
>> +	 */
>>  	if (!refs_reflog_exists(ref_store, refname)) {
>>  		unlock_ref(lock);
>>  		return 0;
>>
>
> Cheers,
> Michael

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 0/7] gc: tests and handle reflog expire config
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
  2019-03-18  6:13               ` Junio C Hamano
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
                                 ` (6 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

It seems the list software dislikes Michael Haggerty for some reason,
but as seen in my reply to his message my v4 has some unaddressed
issues in the previous 8/8 that I need to get to:
https://public-inbox.org/git/87pnqkco8v.fsf@evledraar.gmail.com/

I'll have limited time next week to get to that, so in the meantime
here's a re-send without that patch, but just the unrelated
cleanup/tests & gc "expire" fix up to v3's 7/8. The only other change
is fixing commit message nonsense (from an earlier version), which
Peff pointed out.

I'll then get to the issue mhaggerty noted & submit that independently
later, but this should be ready for queuing & moving down to next,
since (unlike the previous 8/8) none of it's tricky code we need to be
really careful with.

Ævar Arnfjörð Bjarmason (7):
  gc: remove redundant check for gc_auto_threshold
  gc: convert to using the_hash_algo
  gc: refactor a "call me once" pattern
  reflog tests: make use of "test_config" idiom
  reflog tests: test for the "points nowhere" warning
  reflog tests: assert lack of early exit with expiry="never"
  gc: handle & check gc.reflogExpire config

 builtin/gc.c      | 37 +++++++++++++++++++++++++++++--------
 t/t1410-reflog.sh | 25 +++++++++++++++++--------
 t/t6500-gc.sh     | 19 +++++++++++++++++++
 3 files changed, 65 insertions(+), 16 deletions(-)

Range-diff:
1:  81694c8213 = 1:  be889156db gc: remove redundant check for gc_auto_threshold
2:  4bdcf1d0be = 2:  764c9a7380 gc: convert to using the_hash_algo
3:  9444a1233a = 3:  d521c22103 gc: refactor a "call me once" pattern
4:  60a06ae618 = 4:  768aba9889 reflog tests: make use of "test_config" idiom
5:  52838fdc44 = 5:  2ddbee93a1 reflog tests: test for the "points nowhere" warning
6:  6063429f10 ! 6:  97e3d74371 reflog tests: assert lack of early exit with expiry="never"
    @@ -3,8 +3,7 @@
         reflog tests: assert lack of early exit with expiry="never"
     
         When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
    -    and --stale-fix isn't in effect (covered by the first part of the "if"
    -    statement being modified here) we *could* exit early without
    +    and --stale-fix isn't in effect we *could* exit early without
         pointlessly looping over all the reflogs.
     
         However, as an earlier change to add a test for the "points nowhere"
7:  6693d1d84d = 7:  48e5c234ae gc: handle & check gc.reflogExpire config
8:  e0814569ab < -:  ---------- reflog expire: don't assert the OID when locking refs
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 1/7] gc: remove redundant check for gc_auto_threshold
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
  2019-03-18  6:13               ` Junio C Hamano
  2019-03-28 16:14               ` [PATCH v4 0/7] gc: tests and handle reflog expire config Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
                                 ` (5 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Checking gc_auto_threshold in too_many_loose_objects() was added in
17815501a8 ("git-gc --auto: run "repack -A -d -l" as necessary.",
2007-09-17) when need_to_gc() itself was also reliant on
gc_auto_pack_limit before its early return:

    gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0

When that check was simplified to just checking "gc_auto_threshold <=
0" in b14d255ba8 ("builtin-gc.c: allow disabling all auto-gc'ing by
assigning 0 to gc.auto", 2008-03-19) this unreachable code should have
been removed. We only call too_many_loose_objects() from within
need_to_gc() itself, which will return if this condition holds, and in
cmd_gc() which will return before ever getting to "auto_gc &&
too_many_loose_objects()" if "auto_gc && !need_to_gc()" is true
earlier in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 020f725acc..8c2312681c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -157,9 +157,6 @@ static int too_many_loose_objects(void)
 	int num_loose = 0;
 	int needed = 0;
 
-	if (gc_auto_threshold <= 0)
-		return 0;
-
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
 		return 0;
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 2/7] gc: convert to using the_hash_algo
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (2 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
                                 ` (4 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

There's been a lot of changing of the hardcoded "40" values to
the_hash_algo->hexsz, but we've so far missed this one where we
hardcoded 38 for the loose object file length.

This is because a SHA-1 like abcde[...] gets turned into
objects/ab/cde[...]. There's no reason to suppose the same won't be
the case for SHA-256, and reading between the lines in
hash-function-transition.txt the format is planned to be the same.

In the future we may want to further modify this code for the hash
function transition. There's a potential pathological case here where
we'll only consider the loose objects for the currently active hash,
but objects for that hash will share a directory storage with the
other hash.

Thus we could theoretically have e.g. 1k SHA-1 loose objects, and 1
million SHA-256 objects. Then not notice that we need to pack them
because we're currently using SHA-1, even though our FS may be
straining under the stress of such humongous directories.

So assuming that "gc" eventually learns to pack up both SHA-1 and
SHA-256 objects regardless of what the current the_hash_algo is,
perhaps this check should be changed to consider all files in
objects/17/ matching [0-9a-f] 38 or 62 characters in length (i.e. both
SHA-1 and SHA-256).

But none of that is something we need to worry about now, and
supporting both 38 and 62 characters depending on "the_hash_algo"
removes another case of SHA-1 hardcoding.

As noted in [1] I'm making no effort to somehow remove the hardcoding
for "2" as in "use the first two hexdigits for the directory
name". There's no indication that that'll ever change, and somehow
generalizing it here would be a drop in the ocean, so there's no point
in doing that. It also couldn't be done without coming up with some
generalized version of the magical "objects/17" directory. See [2] for
a discussion of that directory.

1. https://public-inbox.org/git/874l84ber7.fsf@evledraar.gmail.com/

2. https://public-inbox.org/git/87k1mta9x5.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 8c2312681c..733bd7bdf4 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -156,6 +156,7 @@ static int too_many_loose_objects(void)
 	int auto_threshold;
 	int num_loose = 0;
 	int needed = 0;
+	const unsigned hexsz_loose = the_hash_algo->hexsz - 2;
 
 	dir = opendir(git_path("objects/17"));
 	if (!dir)
@@ -163,8 +164,8 @@ static int too_many_loose_objects(void)
 
 	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
-		if (strspn(ent->d_name, "0123456789abcdef") != 38 ||
-		    ent->d_name[38] != '\0')
+		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
+		    ent->d_name[hexsz_loose] != '\0')
 			continue;
 		if (++num_loose > auto_threshold) {
 			needed = 1;
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 3/7] gc: refactor a "call me once" pattern
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (3 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
                                 ` (3 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change an idiom we're using to ensure that gc_before_repack() only
does work once (see 62aad1849f ("gc --auto: do not lock refs in the
background", 2014-05-25)) to be more obvious.

Nothing except this function cares about the "pack_refs" and
"prune_reflogs" variables, so let's not leave the reader wondering if
they're being zero'd out for later use somewhere else.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 733bd7bdf4..ae716a00d4 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -489,14 +489,20 @@ static int report_last_gc_error(void)
 
 static void gc_before_repack(void)
 {
+	/*
+	 * We may be called twice, as both the pre- and
+	 * post-daemonized phases will call us, but running these
+	 * commands more than once is pointless and wasteful.
+	 */
+	static int done = 0;
+	if (done++)
+		return;
+
 	if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, pack_refs_cmd.argv[0]);
 
 	if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD))
 		die(FAILED_RUN, reflog.argv[0]);
-
-	pack_refs = 0;
-	prune_reflogs = 0;
 }
 
 int cmd_gc(int argc, const char **argv, const char *prefix)
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 4/7] reflog tests: make use of "test_config" idiom
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (4 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 5/7] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
                                 ` (2 subsequent siblings)
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Change a couple of tests that weren't using the helper to use it. This
makes the trailing "--unset" unnecessary.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index ae8a448e34..42f5ac9ed9 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -232,24 +232,21 @@ test_expect_success '--expire=never' '
 '
 
 test_expect_success 'gc.reflogexpire=never' '
+	test_config gc.reflogexpire never &&
+	test_config gc.reflogexpireunreachable never &&
 
-	git config gc.reflogexpire never &&
-	git config gc.reflogexpireunreachable never &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
 
 test_expect_success 'gc.reflogexpire=false' '
+	test_config gc.reflogexpire false &&
+	test_config gc.reflogexpireunreachable false &&
 
-	git config gc.reflogexpire false &&
-	git config gc.reflogexpireunreachable false &&
 	git reflog expire --verbose --all &&
 	git reflog refs/heads/master >output &&
-	test_line_count = 4 output &&
-
-	git config --unset gc.reflogexpire &&
-	git config --unset gc.reflogexpireunreachable
+	test_line_count = 4 output
 
 '
 
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 5/7] reflog tests: test for the "points nowhere" warning
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (5 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 6/7] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 7/7] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

The "git reflog expire" command when given an unknown reference has
since 4264dc15e1 ("git reflog expire", 2006-12-19) when this command
was implemented emit an error, but this has never been tested for.

Let's test for it, also under gc.reflogExpire{Unreachable,}=never in
case a future change is tempted to take shortcuts in the presence of
such config.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index 42f5ac9ed9..e8f8ac9785 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -250,6 +250,16 @@ test_expect_success 'gc.reflogexpire=false' '
 
 '
 
+test_expect_success 'git reflog expire unknown reference' '
+	test_config gc.reflogexpire never &&
+	test_config gc.reflogexpireunreachable never &&
+
+	test_must_fail git reflog expire master@{123} 2>stderr &&
+	test_i18ngrep "points nowhere" stderr &&
+	test_must_fail git reflog expire does-not-exist 2>stderr &&
+	test_i18ngrep "points nowhere" stderr
+'
+
 test_expect_success 'checkout should not delete log for packed ref' '
 	test $(git reflog master | wc -l) = 4 &&
 	git branch foo &&
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 6/7] reflog tests: assert lack of early exit with expiry="never"
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (6 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 5/7] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  2019-03-28 16:14               ` [PATCH v4 7/7] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

When gc.reflogExpire and gc.reflogExpireUnreachable are set to "never"
and --stale-fix isn't in effect we *could* exit early without
pointlessly looping over all the reflogs.

However, as an earlier change to add a test for the "points nowhere"
warning shows even in such a mode we might want to print out a
warning.

So while it's conceivable to implement this, I don't think it's worth
it. It's going to be too easy to inadvertently add some flag that'll
make the expiry happen anyway, and even with "never" we'd like to see
all the lines we're going to keep.

So let's assert that we're going to loop over all the references even
when this configuration is in effect.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1410-reflog.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index e8f8ac9785..79f731db37 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -235,7 +235,9 @@ test_expect_success 'gc.reflogexpire=never' '
 	test_config gc.reflogexpire never &&
 	test_config gc.reflogexpireunreachable never &&
 
-	git reflog expire --verbose --all &&
+	git reflog expire --verbose --all >output &&
+	test_line_count = 9 output &&
+
 	git reflog refs/heads/master >output &&
 	test_line_count = 4 output
 '
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v4 7/7] gc: handle & check gc.reflogExpire config
  2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
                                 ` (7 preceding siblings ...)
  2019-03-28 16:14               ` [PATCH v4 6/7] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
@ 2019-03-28 16:14               ` Ævar Arnfjörð Bjarmason
  8 siblings, 0 replies; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-28 16:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Michael Haggerty, Stefan Beller, Jonathan Nieder,
	Ævar Arnfjörð Bjarmason

Don't redundantly run "git reflog expire --all" when gc.reflogExpire
and gc.reflogExpireUnreachable are set to "never", and die immediately
if those configuration valuer are bad.

As an earlier "assert lack of early exit" change to the tests for "git
reflog expire" shows, an early check of gc.reflogExpire{Unreachable,}
isn't wanted in general for "git reflog expire", but it makes sense
for "gc" because:

 1) Similarly to 8ab5aa4bd8 ("parseopt: handle malformed --expire
    arguments more nicely", 2018-04-21) we'll now die early if the
    config variables are set to invalid values.

    We run "pack-refs" before "reflog expire", which can take a while,
    only to then die on an invalid gc.reflogExpire{Unreachable,}
    configuration.

 2) Not invoking the command at all means it won't show up in trace
    output, which makes what's going on more obvious when the two are
    set to "never".

 3) As a later change documents we lock the refs when looping over the
    refs to expire, even in cases where we end up doing nothing due to
    this config.

    For the reasons noted in the earlier "assert lack of early exit"
    change I don't think it's worth it to bend over backwards in "git
    reflog expire" itself to carefully detect if we'll really do
    nothing given the combination of all its possible options and skip
    that locking, but that's easy to detect here in "gc" where we'll
    only run "reflog expire" in a relatively simple mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c  | 17 +++++++++++++++++
 t/t6500-gc.sh | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index ae716a00d4..8943bcc300 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -116,6 +116,19 @@ static void process_log_file_on_signal(int signo)
 	raise(signo);
 }
 
+static int gc_config_is_timestamp_never(const char *var)
+{
+	const char *value;
+	timestamp_t expire;
+
+	if (!git_config_get_value(var, &value) && value) {
+		if (parse_expiry_date(value, &expire))
+			die(_("failed to parse '%s' value '%s'"), var, value);
+		return expire == 0;
+	}
+	return 0;
+}
+
 static void gc_config(void)
 {
 	const char *value;
@@ -127,6 +140,10 @@ static void gc_config(void)
 			pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
+	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
+	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
+		prune_reflogs = 0;
+
 	git_config_get_int("gc.aggressivewindow", &aggressive_window);
 	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
 	git_config_get_int("gc.auto", &gc_auto_threshold);
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 4684d06552..7411bf7fec 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -120,6 +120,25 @@ test_expect_success 'gc --quiet' '
 	test_must_be_empty stderr
 '
 
+test_expect_success 'gc.reflogExpire{Unreachable,}=never skips "expire" via "gc"' '
+	test_config gc.reflogExpire never &&
+	test_config gc.reflogExpireUnreachable never &&
+
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+
+	# Check that git-pack-refs is run as a sanity check (done via
+	# gc_before_repack()) but that git-expire is not.
+	grep -E "^trace: (built-in|exec|run_command): git pack-refs --" trace.out &&
+	! grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
+test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "expire" via "gc"' '
+	>trace.out &&
+	test_config gc.reflogExpire never &&
+	GIT_TRACE=$(pwd)/trace.out git gc &&
+	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.21.0.392.gf8f6787159e


^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2019-03-28 16:14 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-12 23:28 BUG: Race condition due to reflog expiry in "gc" Ævar Arnfjörð Bjarmason
2019-03-13  1:43 ` Junio C Hamano
2019-03-13 10:28   ` Ævar Arnfjörð Bjarmason
2019-03-13 16:02     ` Jeff King
2019-03-13 16:22       ` Ævar Arnfjörð Bjarmason
2019-03-13 23:54         ` [PATCH 0/5] gc: minor code cleanup + contention fixes Ævar Arnfjörð Bjarmason
2019-03-14 12:34           ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
2019-03-18  6:13               ` Junio C Hamano
2019-03-28 16:14               ` [PATCH v4 0/7] gc: tests and handle reflog expire config Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 5/7] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 6/7] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
2019-03-28 16:14               ` [PATCH v4 7/7] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 1/8] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 2/8] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 3/8] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 4/8] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 5/8] reflog tests: test for the "points nowhere" warning Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 6/8] reflog tests: assert lack of early exit with expiry="never" Ævar Arnfjörð Bjarmason
2019-03-19  6:20               ` Jeff King
2019-03-15 15:59             ` [PATCH v3 7/8] gc: handle & check gc.reflogExpire config Ævar Arnfjörð Bjarmason
2019-03-15 15:59             ` [PATCH v3 8/8] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
     [not found]               ` <b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu>
2019-03-21 14:10                 ` Ævar Arnfjörð Bjarmason
2019-03-19  6:27             ` [PATCH v2 0/7] gc: minor code cleanup + contention fixes Jeff King
2019-03-14 12:34           ` [PATCH v2 1/7] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
2019-03-14 12:34           ` [PATCH v2 2/7] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
2019-03-15  9:51             ` Duy Nguyen
2019-03-15 10:42               ` Ævar Arnfjörð Bjarmason
2019-03-15 15:49                 ` Johannes Schindelin
2019-03-14 12:34           ` [PATCH v2 3/7] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
2019-03-14 12:34           ` [PATCH v2 4/7] reflog tests: make use of "test_config" idiom Ævar Arnfjörð Bjarmason
2019-03-14 12:34           ` [PATCH v2 5/7] reflog: exit early if there's no work to do Ævar Arnfjörð Bjarmason
2019-03-15 10:01             ` Duy Nguyen
2019-03-15 15:51               ` Ævar Arnfjörð Bjarmason
2019-03-14 12:34           ` [PATCH v2 6/7] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
2019-03-15 10:05             ` Duy Nguyen
2019-03-15 10:24               ` Ævar Arnfjörð Bjarmason
2019-03-15 10:32                 ` Duy Nguyen
2019-03-15 15:51                 ` Johannes Schindelin
2019-03-18  6:04                 ` Junio C Hamano
2019-03-14 12:34           ` [PATCH v2 7/7] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
2019-03-15 11:10             ` Duy Nguyen
2019-03-15 15:52               ` Ævar Arnfjörð Bjarmason
2019-03-13 23:54         ` [PATCH 1/5] gc: remove redundant check for gc_auto_threshold Ævar Arnfjörð Bjarmason
2019-03-14  0:25           ` Jeff King
2019-03-13 23:54         ` [PATCH 2/5] gc: convert to using the_hash_algo Ævar Arnfjörð Bjarmason
2019-03-14  0:26           ` Jeff King
2019-03-13 23:54         ` [PATCH 3/5] gc: refactor a "call me once" pattern Ævar Arnfjörð Bjarmason
2019-03-14  0:30           ` Jeff King
2019-03-13 23:54         ` [PATCH 4/5] gc: don't run "reflog expire" when keeping reflogs Ævar Arnfjörð Bjarmason
2019-03-14  0:40           ` Jeff King
2019-03-14  4:51             ` Junio C Hamano
2019-03-14 19:26               ` Jeff King
2019-03-13 23:54         ` [PATCH 5/5] reflog expire: don't assert the OID when locking refs Ævar Arnfjörð Bjarmason
2019-03-14  0:44           ` Jeff King
2019-03-14  0:25         ` BUG: Race condition due to reflog expiry in "gc" Jeff King

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).