git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Martin Fick <mfick@codeaurora.org>
To: Jeff King <peff@peff.net>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] refs: do not use cached refs in repack_without_ref
Date: Mon, 7 Jan 2013 11:14:22 -0700	[thread overview]
Message-ID: <201301071114.22526.mfick@codeaurora.org> (raw)
In-Reply-To: <201301071109.12086.mfick@codeaurora.org>

...[Sorry about the previous HTML reposts]

Jeff King <peff@peff.net> wrote:
>On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick 
wrote:
>
>> The general approach is to setup a transaction and
>> either commit or abort it. A transaction can be setup
>> by renaming an appropriately setup directory to the
>> "ref.lock" name. If the rename succeeds, the
>> transaction is begun. Any actor can abort the
>> transaction (up until it is committed) by simply
>> deleting the "ref.lock" directory, so it is not at
>> risk of going stale.
>
> Deleting a directory is not atomic, as you first have
> to remove the contents, putting it into a potentially
> inconsistent state. I'll assume you deal with that
> later...

Right, these simple single file transactions have at 
best 1 important file/directory in them, once deleted 
the transaction is aborted (can no longer complete).  
However to support multi file transactions, a better 
approach is likely to rename the uuid directory to have 
a .delete extension before deleting stuff in it.


> > One important piece of the transaction is the use
> > of uuids. The uuids provide a mechanism to tie the
> > atomic commit pieces to the transactions and thus to
> > prevent long sleeping process from inadvertently
> > performing actions which could be out of date when
> > they wake finally up.
> >
>Has this been a problem for you in practice?

No, but as you say, we don't currently hold locks for 
very long. I anticipate it being a problem in a 
clustered environment when transactions start spanning 
repos from java processes, with insane amounts of RAM, 
which can sometimes have unpredictable indeterminately 
long java GC cycles at inopportune times.. It would seem 
short sighted if Gerrit at least did not assume this 
will be a problem.

But, deletes today in git are not so short and Michael's 
fixes may make things worse? But, as you point out, that 
should perhaps be solved a different way.


> Avoiding this is one of the reasons that git does not
> take out long locks; instead, it takes the lock only
> at the moment it is ready to write, and aborts if it
> has been updated since the longer-term operation
> began. This has its own problems (you might do a lot
> of work only to have your operation aborted), but I
> am not sure that your proposal improves on that.

It does not, it might increase this.


> Git typically holds ref locks for a few syscalls. If
> you are conservative about leaving potentially stale
> locks in place (e.g., give them a few minutes to
> complete before assuming they are now bogus), you will
> not run into that problem.

In a distributed environment even a few minutes might 
not be enough, processes could be on a remote server 
with a temporarily split network, that could cause 
delays longer than your typical local expectations.

But there is also the other piece of this problem, how 
do you detect stale locks? How long will it be stale 
until a user figures it out and reports it? How many 
other users will simply have failed pushes and wonder 
why without reporting them?


> > In each case, the atomic commit piece is the
> > renaming of a file. For the create and update
> > pieces, a file is renamed from the "ref.lock" 
> > dir to the "ref" file resulting in an update to 
> > the sha for the ref.
>
> I think we've had problems with cross-directory
> renames on some filesystems, but I don't recall the
> details. I know that Coda does not like 
> cross-directory links, but cross-directory renames 
> are OK (and in fact we fall back to the latter when
> the former does not work).
>
> Ah, here we go: 5723fe7 (Avoid cross-directory renames
> and linking on object creation, 2008-06-14). Looks
> like NFS is the culprit.

If the renames fail we can fall back to regular file 
locking, the hard part to detect and deal with would be 
if the renames don't fail but become copies/mkdirs.


>> In the case of a delete, the actor may verify that
>> "ref" currently contains the sha to "prune" if it
>> needs to, and then renames the "ref" file to
>> "ref.lock/uuid/delete". On success, the ref was
>> deleted.
>>
>> Whether successful or not, the actor may now simply
>> delete the "ref.lock" directory, clearing the way for
>> a new transaction. Any other actor may delete this
>> directory at any time also, likely either on conflict
>> (if they are attempting to initiate a transaction),
>> or after a grace period just to cleanup the FS. Any
>> actor may also safely cleanup the tmp directories,
>> preferably also after a grace period.
>
> Hmm. So what happens to the "delete" file when the
> ref.lock directory is being deleted? Presumably
> deleting the ref.lock directory means doing it
> recursively (which is non-atomic). But then why are 
> we keeping the delete file at all, if we're just about
> to remove it?

We are not trying to keep it, but we need to ensure that 
our transaction has not yet been aborted: the rename 
does this.  If we just deleted the file, we may sleep 
and another transaction may abort our transaction and 
complete before we wake up and actually delete the file. 
But by using a rename we tie the delete atomically to 
the transaction, it cannot succeed if our transaction 
was aborted during a sleep since the directory we are 
renaming the file into would be gone! This is sort of 
the magic piece that makes the whole scheme special, 
safe deletes tend to be the hardest part.


>What happens if another process wants to cancel a
> transaction that is partially done? That is, the
> ref.lock directory is created, but it contains the
> uuid subdir? It sounds like it's OK to just delete
> from it, and the original process will then fail at
> its rename?

Yes, exactly.  But again, it might be better to cause a 
rename of the uuid dir before deleting it.


>> One neat part about this scheme is that I believe it
>> would be backwards compatible with the current
>> locking mechanism since the transaction directory
>> will simply appear to be a lock to older clients. 
>> And the old lock file should continue to lock
>> out these newer transactions.
>
>Yeah, I don't see anything that would prevent that. 
> The current code only cares about open("$ref.lock",
> O_EXCL). But that should be correct and atomic with
> respect to the renamed directories. You are depending
> on atomic directory renames. Those aren't used
> anywhere yet in git, as far as I know. So that may run
> into problems (but there's no reason this system
> couldn't be optional for filesystems that are more
> abled, and other systems could fall back to the
> straight-file locking).

Right.


> So in response to your question, no, I don't see any
> real showstoppers here. 

Alright, cool, thanks for the review and analysis, I 
appreciate it.


> And unlike the "all refs are files in a directory"
> scheme, it's confined to writing, which solves the
> readdir() atomicity questions I had there.


Right, I couldn't see a way around those, I think it was 
inherently flawed.


> I still can't say I'm super excited about it, just
> because it seems like a solution in search of a
> problem that I have not experienced myself.

It might be.  But at least a solution is out there now. 
If time indicates that it is needed, I feel better 
knowing there is a solution. 

Murphy has a way of making unlikely problems real 
annoyances in automated distributed environments. The 
problem is not usually one real annoying problem, those 
get fixed. The problem we deal mostly with is 1000 
infrequent problems, all different, but they add up to a 
system which needs constant maintenance. And the 
maintenance is hard since each problem is different, the 
analysis is difficult and requires expertise.


> Fixing the minor remaining races on ref packing would
> be nice, but I do not think those are a problem of the
> on-disk lock representation. The reason they are not
> fixed now is from an attempt to keep lock contention
> low between unrelated updates (something which we
> should probably give up on in favor of correctness;
> but that is orthogonal to whether we do it with the
> existing file locks or with a new system).

Agreed.


>A much stronger fix for that would be to record deletes
> in the loose ref storage so that we don't have to
> update the packed-refs file as often (because it is
> inherently a lock bottleneck, and because it is
> wasteful when you have a very large number of refs).
> But that means dealing with directory/file conflicts
> between deleted and existing branches, which is a
> pain.

Makes sense.

Thanks again for your thoughts,

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation

      parent reply	other threads:[~2013-01-07 18:14 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
2013-01-04 18:01           ` Martin Fick
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` Martin Fick [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201301071114.22526.mfick@codeaurora.org \
    --to=mfick@codeaurora.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).