git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] refs: do not use cached refs in repack_without_ref
Date: Sat, 29 Dec 2012 02:16:30 -0500	[thread overview]
Message-ID: <20121229071630.GA15408@sigill.intra.peff.net> (raw)
In-Reply-To: <50DAB447.8000101@alum.mit.edu>

On Wed, Dec 26, 2012 at 09:24:39AM +0100, Michael Haggerty wrote:

> I'm sorry to take so long to respond to this patch.  Thank you for
> tracking down this bug and for your careful analysis.
> 
> I think your patch is correct and should fix the first race condition
> that you described.

Thanks for checking it over. I almost cc'd you, as I know you have been
working on ref caching. But I think that the problem is much older, as
we always cached the packed-refs list in memory.

> But I think the continued existence of the other race conditions is an
> indication of a fundamental problem with the reference locking
> policy--independent of the in-RAM reference cache.
> 
> The tacit assumption of the current locking policy is that changes to
> the packed-refs file are not critical for correctness, because loose
> references take precedence over it anyway.  This is true for adding and
> modifying references.  But it is not true for deleting references,
> because there is no way for a deletion to be represented as a loose
> reference in a way that takes precedence over the packed-refs file
> (i.e., there is no way for a loose reference to say "I am deleted,
> regardless of what packed-refs says").  Thus the race conditions for
> deleting references, whether via delete_ref() or via pack_refs() with
> --prune.

Yeah. It would be much nicer if we could just write the null sha1 or a
similar sentinel into the loose ref for "I am deleted". The problem
(besides backwards compatibility) is the usual D/F conflict. I cannot
delete "refs/heads/foo" and then create "refs/heads/foo/bar" if the old
ref file is in the way.

> There is a problem if two processes try to delete a reference at the
> same time, or if one process tries to delete a reference at the same
> time as another process is trying to pack the references.  The reason is
> that there is no "transaction" that spans both the rewriting of the
> packed-refs file and also the deletion of the loose-ref files, and
> therefore it is possible for conflicting changes to be made in the two
> locations.

Just to be clear, are you talking about the races I wrote about in my
other email? Or are there other races? I didn't (and still don't) see
any actual on-disk data loss races. Just ones where a reader may get an
old, packed value (which is still bad, but is less bad than one where a
write is lost).

> I think that all of the problems would be fixed if a lock would be held
> on the packed-refs file during the whole process of deleting any
> reference; i.e., change the algorithms to:

Yeah, I looked at that, too. In fact, before I had correctly analyzed
the problem, I thought that doing so would solve the problem I was
seeing (which I noticed was wrong when it did not pass my tests :) ).

>From your description, I think you realize this, but I want to point out
for other readers: just updating the pack_refs side to call prune_refs
under the lock would create a deadlock with a simultaneous delete (which
takes the individual ref lock first, then the packed-refs lock). Of
course, I do not think git is capable of deadlock, as we typically just
die() instead of blocking on a lock. So maybe it does not matter so
much. :)

> * Delete reference foo:
> 
>   1. Acquire the lock $GIT_DIR/packed-refs.lock (regardless of whether
>      "foo" is a packed ref)

This suffers from the same problem I mentioned in my earlier email: we
create lock contention on packed-refs.lock when there are two
simultaneous deletes, even when the refs aren't packed. That might be an
acceptable tradeoff, though. It's only for deletion, not for update,
which is presumably rare. And it has to happen anyway when both refs are
packed.

>   2. Write to $GIT_DIR/packed-refs.new a version of the packed-refs
>      file that omits "foo"
> [...]

All of the further steps make sense. By deleting from packed-refs first,
we eliminate the read race-condition I mentioned in my earlier email.
The only downside is the possible increased lock contention on
packed-refs.lock.  I'm very tempted to go this route. It's better to be
correct and sometimes die on lock contention than to sometimes give the
wrong answer.

> I would appreciate a critique of my analysis.  Even if you agree, I
> think it would be OK to apply Peff's patch to fix up the most pressing
> problem, then implement the more complete solution later.

I think your analysis is correct, modulo the issue I mentioned. And I
agree that my patch is a good interim, as it fixes the worst case
(losing writes).

-Peff

  parent reply	other threads:[~2012-12-29  7:17 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
2013-01-04 18:01           ` Martin Fick
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` Jeff King [this message]
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` [PATCH] refs: do not use cached refs in repack_without_ref Martin Fick

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121229071630.GA15408@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).