git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Pyeron, Jason J CTR (US)" <jason.j.pyeron.ctr@mail.mil>
To: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Fri, 4 Jan 2013 17:52:43 +0000	[thread overview]
Message-ID: <871B6C10EBEFE342A772D1159D1320853A011469@umechphj.easf.csd.disa.mil> (raw)
In-Reply-To: <201301031652.44982.mfick@codeaurora.org>

[-- Attachment #1: Type: text/plain, Size: 7382 bytes --]

> From: Martin Fick
> Sent: Thursday, January 03, 2013 6:53 PM
> 
> Any thoughts on this idea?  Is it flawed?  I am trying to
> write it up in a more formal generalized manner and was
> hoping to get at least one "it seems sane" before I do.

If you are assuming that atomic renames, etc. are available, then you should identify a test case and a degrade operation path when it is not available.

> 
> Thanks,
> 
> -Martin
> 
> On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> > On Thursday, December 27, 2012 04:11:51 pm Martin Fick
> wrote:
> > > It concerns me that git uses any locking at all, even
> > > for refs since it has the potential to leave around
> > > stale locks.
> > > ...
> > > [a previous not so great attempt to fix this]
> > > ...
> >
> > I may have finally figured out a working loose ref update
> > mechanism which I think can avoid stale locks.
> > Unfortunately it requires atomic directory renames and
> > universally unique identifiers (uuids).  These may be
> > no-go criteria?  But I figure it is worth at least
> > exploring this idea because of the potential benefits?
> >
> > The general approach is to setup a transaction and either
> > commit or abort it.  A transaction can be setup by
> > renaming an appropriately setup directory to the
> > "ref.lock" name.  If the rename succeeds, the transaction
> > is begun.  Any actor can abort the transaction (up until
> > it is committed) by simply deleting the "ref.lock"
> > directory, so it is not at risk of going stale.  However,
> > once the actor who sets up the transaction commits it,
> > deleting the "ref.lock" directory simply aids in cleaning
> > it up for the next transaction (instead of aborting it).
> >
> > One important piece of the transaction is the use of
> > uuids. The uuids provide a mechanism to tie the atomic
> > commit pieces to the transactions and thus to prevent
> > long sleeping process from inadvertently performing
> > actions which could be out of date when they wake finally
> > up.  In each case, the atomic commit piece is the
> > renaming of a file.   For the create and update pieces, a
> > file is renamed from the "ref.lock" dir to the "ref" file
> > resulting in an update to the sha for the ref. However,
> > in the delete case, the "ref" file is instead renamed to
> > end up in the "ref.lock" directory resulting in a delete
> > of the ref.  This scheme does not affect the way refs are
> > read today,
> >
> > To prepare for a transaction, an actor first generates a
> > uuid (an exercise I will delay for now).  Next, a tmp
> > directory named after the uuid is generated in the parent
> > directory for the ref to be updated, perhaps something
> > like:  ".lock_uuid". In this directory is places either a
> > file or a directory named after the uuid, something like:
> > ".lock_uuid/,uuid".  In the case of a create or an
> > update, the new sha is written to this file.  In the case
> > of a delete, it is a directory.
> >
> > Once the tmp directory is setup, the initiating actor
> > attempts to start the transaction by renaming the tmp
> > directory to "ref.lock".  If the rename fails, the update
> > fails. If the rename succeeds, the actor can then attempt
> > to commit the transaction (before another actor aborts
> > it).
> >
> > In the case of a create, the actor verifies that "ref"
> > does not currently exist, and then renames the now named
> > "ref.lock/uuid" file to "ref". On success, the ref was
> > created.
> >
> > In the case of an update, the actor verifies that "ref"
> > currently contains the old sha, and then also renames the
> > now named "ref.lock/uuid" file to "ref". On success, the
> > ref was updated.
> >
> > In the case of a delete, the actor may verify that "ref"
> > currently contains the sha to "prune" if it needs to, and
> > then renames the "ref" file to "ref.lock/uuid/delete". On
> > success, the ref was deleted.
> >
> > Whether successful or not, the actor may now simply delete
> > the "ref.lock" directory, clearing the way for a new
> > transaction.  Any other actor may delete this directory at
> > any time also, likely either on conflict (if they are
> > attempting to initiate a transaction), or after a grace
> > period just to cleanup the FS.  Any actor may also safely
> > cleanup the tmp directories, preferably also after a grace
> > period.
> >
> > One neat part about this scheme is that I believe it would
> > be backwards compatible with the current locking
> > mechanism since the transaction directory will simply
> > appear to be a lock to older clients.  And the old lock
> > file should continue to lock out these newer
> > transactions.
> >
> > Due to this backwards compatibility, I believe that this
> > could be incrementally employed today without affecting
> > very much.  It could be deployed in place of any updates
> > which only hold ref.locks to update the loose ref.  So
> > for example I think it could replace step 4a below from
> > Michael Haggerty's description of today's loose ref
> > pruning during
> >
> > ref packing:
> > > * Pack references:
> > ...
> >
> > > 4. prune_refs(): for each ref in the ref_to_prune list,
> > >
> > > call  prune_ref():
> > >     a. Lock the reference using lock_ref_sha1(),
> > >     verifying that the recorded SHA1 is still valid.  If
> > >     it is, unlink the loose reference file then free
> > >     the lock; otherwise leave the loose reference file
> > >     untouched.
> >
> > I think it would also therefore be able to replace the
> > loose ref locking in Michael's new ref-packing scheme as
> > well as the locking in Michael's new ref deletion scheme
> > (again steps
> >
> > 4):
> > > * Delete reference foo:
> > ...
> >
> > >   4. Delete loose ref for "foo":
> > >      a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> > >
> > >      b. Unlink $GIT_DIR/refs/heads/foo if it is
> > >      unchanged.
> > >
> > >  If it is changed, leave it untouched.  If it is
> > >  deleted,
> > >
> > > that is OK too.
> > >
> > >      c. Release lock $GIT_DIR/refs/heads/foo.lock
> >
> > ...
> >
> > > * Pack references:
> > ...
> >
> > >   4. prune_refs(): for each ref in the ref_to_prune
> > >   list,
> > >
> > > call prune_ref():
> > >      a. Lock the loose reference using lock_ref_sha1(),
> > >
> > > verifying that the recorded SHA1 is still valid
> > >
> > >      b. If it is, unlink the loose reference file
> > >
> > > (otherwise, leave it untouched)
> > >
> > >      c. Release the lock on the loose reference
> >
> > To be honest, I suspect I missed something obvious because
> > this seems almost too simple to work.  I am ashamed that
> > it took me so long to come up with (of course, I will be
> > even more ashamed :( when it is shown to be flawed!)
> > This scheme also feels extensible. if there are no
> > obvious flaws in it, I will try to post solutions for ref
> > packing and for multiple repository/ref transactions also
> > soon.
> >
> > I welcome any comments/criticisms,
> >
> > -Martin
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > git" in the body of a message to
> > majordomo@vger.kernel.org More majordomo info at
> > http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5615 bytes --]

  reply	other threads:[~2013-01-04 17:52 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US) [this message]
2013-01-04 18:01           ` Martin Fick
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` Martin Fick

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871B6C10EBEFE342A772D1159D1320853A011469@umechphj.easf.csd.disa.mil \
    --to=jason.j.pyeron.ctr@mail.mil \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).