From: Martin Fick <mfick@codeaurora.org>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: Shawn Pearce <sop@google.com>, Jeff King <peff@peff.net>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Thu, 3 Jan 2013 16:52:44 -0700 [thread overview]
Message-ID: <201301031652.44982.mfick@codeaurora.org> (raw)
In-Reply-To: <201212310330.53835.mfick@codeaurora.org>
Any thoughts on this idea? Is it flawed? I am trying to
write it up in a more formal generalized manner and was
hoping to get at least one "it seems sane" before I do.
Thanks,
-Martin
On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> On Thursday, December 27, 2012 04:11:51 pm Martin Fick
wrote:
> > It concerns me that git uses any locking at all, even
> > for refs since it has the potential to leave around
> > stale locks.
> > ...
> > [a previous not so great attempt to fix this]
> > ...
>
> I may have finally figured out a working loose ref update
> mechanism which I think can avoid stale locks.
> Unfortunately it requires atomic directory renames and
> universally unique identifiers (uuids). These may be
> no-go criteria? But I figure it is worth at least
> exploring this idea because of the potential benefits?
>
> The general approach is to setup a transaction and either
> commit or abort it. A transaction can be setup by
> renaming an appropriately setup directory to the
> "ref.lock" name. If the rename succeeds, the transaction
> is begun. Any actor can abort the transaction (up until
> it is committed) by simply deleting the "ref.lock"
> directory, so it is not at risk of going stale. However,
> once the actor who sets up the transaction commits it,
> deleting the "ref.lock" directory simply aids in cleaning
> it up for the next transaction (instead of aborting it).
>
> One important piece of the transaction is the use of
> uuids. The uuids provide a mechanism to tie the atomic
> commit pieces to the transactions and thus to prevent
> long sleeping process from inadvertently performing
> actions which could be out of date when they wake finally
> up. In each case, the atomic commit piece is the
> renaming of a file. For the create and update pieces, a
> file is renamed from the "ref.lock" dir to the "ref" file
> resulting in an update to the sha for the ref. However,
> in the delete case, the "ref" file is instead renamed to
> end up in the "ref.lock" directory resulting in a delete
> of the ref. This scheme does not affect the way refs are
> read today,
>
> To prepare for a transaction, an actor first generates a
> uuid (an exercise I will delay for now). Next, a tmp
> directory named after the uuid is generated in the parent
> directory for the ref to be updated, perhaps something
> like: ".lock_uuid". In this directory is places either a
> file or a directory named after the uuid, something like:
> ".lock_uuid/,uuid". In the case of a create or an
> update, the new sha is written to this file. In the case
> of a delete, it is a directory.
>
> Once the tmp directory is setup, the initiating actor
> attempts to start the transaction by renaming the tmp
> directory to "ref.lock". If the rename fails, the update
> fails. If the rename succeeds, the actor can then attempt
> to commit the transaction (before another actor aborts
> it).
>
> In the case of a create, the actor verifies that "ref"
> does not currently exist, and then renames the now named
> "ref.lock/uuid" file to "ref". On success, the ref was
> created.
>
> In the case of an update, the actor verifies that "ref"
> currently contains the old sha, and then also renames the
> now named "ref.lock/uuid" file to "ref". On success, the
> ref was updated.
>
> In the case of a delete, the actor may verify that "ref"
> currently contains the sha to "prune" if it needs to, and
> then renames the "ref" file to "ref.lock/uuid/delete". On
> success, the ref was deleted.
>
> Whether successful or not, the actor may now simply delete
> the "ref.lock" directory, clearing the way for a new
> transaction. Any other actor may delete this directory at
> any time also, likely either on conflict (if they are
> attempting to initiate a transaction), or after a grace
> period just to cleanup the FS. Any actor may also safely
> cleanup the tmp directories, preferably also after a grace
> period.
>
> One neat part about this scheme is that I believe it would
> be backwards compatible with the current locking
> mechanism since the transaction directory will simply
> appear to be a lock to older clients. And the old lock
> file should continue to lock out these newer
> transactions.
>
> Due to this backwards compatibility, I believe that this
> could be incrementally employed today without affecting
> very much. It could be deployed in place of any updates
> which only hold ref.locks to update the loose ref. So
> for example I think it could replace step 4a below from
> Michael Haggerty's description of today's loose ref
> pruning during
>
> ref packing:
> > * Pack references:
> ...
>
> > 4. prune_refs(): for each ref in the ref_to_prune list,
> >
> > call prune_ref():
> > a. Lock the reference using lock_ref_sha1(),
> > verifying that the recorded SHA1 is still valid. If
> > it is, unlink the loose reference file then free
> > the lock; otherwise leave the loose reference file
> > untouched.
>
> I think it would also therefore be able to replace the
> loose ref locking in Michael's new ref-packing scheme as
> well as the locking in Michael's new ref deletion scheme
> (again steps
>
> 4):
> > * Delete reference foo:
> ...
>
> > 4. Delete loose ref for "foo":
> > a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> >
> > b. Unlink $GIT_DIR/refs/heads/foo if it is
> > unchanged.
> >
> > If it is changed, leave it untouched. If it is
> > deleted,
> >
> > that is OK too.
> >
> > c. Release lock $GIT_DIR/refs/heads/foo.lock
>
> ...
>
> > * Pack references:
> ...
>
> > 4. prune_refs(): for each ref in the ref_to_prune
> > list,
> >
> > call prune_ref():
> > a. Lock the loose reference using lock_ref_sha1(),
> >
> > verifying that the recorded SHA1 is still valid
> >
> > b. If it is, unlink the loose reference file
> >
> > (otherwise, leave it untouched)
> >
> > c. Release the lock on the loose reference
>
> To be honest, I suspect I missed something obvious because
> this seems almost too simple to work. I am ashamed that
> it took me so long to come up with (of course, I will be
> even more ashamed :( when it is shown to be flawed!)
> This scheme also feels extensible. if there are no
> obvious flaws in it, I will try to post solutions for ref
> packing and for multiple repository/ref transactions also
> soon.
>
> I welcome any comments/criticisms,
>
> -Martin
> --
> To unsubscribe from this list: send the line "unsubscribe
> git" in the body of a message to
> majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-01-03 23:53 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-21 8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26 8:24 ` Michael Haggerty
2012-12-27 23:11 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50 ` Martin Fick
2012-12-28 17:15 ` Lockless Refs? Junio C Hamano
2012-12-29 8:16 ` Jeff King
2012-12-29 21:15 ` Martin Fick
2012-12-29 8:12 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29 ` Martin Fick
2012-12-28 16:58 ` Lockless Refs? Junio C Hamano
2012-12-29 1:07 ` Martin Fick
2012-12-29 8:10 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18 ` Martin Fick
2012-12-30 17:03 ` Martin Fick
2012-12-31 10:30 ` Martin Fick
2013-01-03 23:52 ` Martin Fick [this message]
2013-01-04 17:52 ` Pyeron, Jason J CTR (US)
2013-01-04 18:01 ` Martin Fick
2013-01-04 21:28 ` Lockless Refs? Junio C Hamano
2013-01-05 16:12 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22 4:31 ` Drew Northup
2012-12-29 7:16 ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
[not found] ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14 ` Martin Fick
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201301031652.44982.mfick@codeaurora.org \
--to=mfick@codeaurora.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mhagger@alum.mit.edu \
--cc=peff@peff.net \
--cc=sop@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).