git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Martin Fick <mfick@codeaurora.org>
To: Jeff King <peff@peff.net>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Shawn Pearce <sop@google.com>
Subject: Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Sat, 29 Dec 2012 14:29:55 -0700	[thread overview]
Message-ID: <befe9a89-8f71-437e-ad33-c4dc4ee90507@email.android.com> (raw)
In-Reply-To: <20121229081200.GD15408@sigill.intra.peff.net>

Jeff King <peff@peff.net> wrote:

>On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:
>
>> Hmm, actually I believe that with a small modification to the 
>> semantics described here it would be possible to make multi 
>> repo/branch commits work.   Simply allow the ref filename to 
>> be locked by a transaction by appending the transaction ID to 
>> the filename.  So if transaction 123 wants to lock master 
>> which points currently to abcde, then it will move 
>> master/abcde to master/abcde_123.  If transaction 123 is 
>> designed so that any process can commit/complete/abort it 
>> without requiring any locks which can go stale, then this ref 
>> lock will never go stale either (easy as long as it writes 
>> all its proposed updates somewhere upfront and has atomic 
>> semantics for starting, committing and aborting).  On commit, 
>> the ref lock gets updated to its new value: master/newsha and 
>> on abort it gets unlocked: master/abcde.
>
>Hmm. I thought our goal was to avoid locks? Isn't this just locking by
>another name?

It is a lock, but it is a lock with an owner: the transaction.  If the transaction has reliable recovery semantics, then the lock will be recoverable also.  This is possible if we have lock ownership (the transaction) which does not exist today for the ref locks.  With good lock ownership we gain the ability to reliably delete locks for a specific owner without the risk of deleting the lock when held by another owner (putting the owner in the filename is "good", while putting the owner in the filecontents is not).   Lastly, for reliable recovery of stale locks we need the ability to determine when an owner has abandoned a lock.  I believe that the transaction semantics laid out below give this.


>I guess your point is to have no locks in the "normal" case, and have
>locked transactions as an optional add-on?

Basically.  If we design the transaction into the git semantics we could ensure that it is recoverable and we should not need to expose these reflocks outside of the transaction APIs.

To illustrate a simple transaction approach (borrowing some of Shawn's ideas), we could designate a directory to hold transaction files *1.  To prepare a transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new (in a stable sorted order based on repo:ref to prevent deadlocks).  This is not a state change and thus this file could be deleted by any process at anytime (preferably after a long grace period).

If file renames are atomic on the filesystem holding the transaction files then 1, 2, 3 below will be atomic state changes.  It does not matter who performs state transitions 2 or 3.  It does not matter who implements the work following any of the 3 transitions, many processes could attempt the work in parallel (so could a human).
 
1) To start the transaction, rename the id.new file to id.  If the rename fails, start over if desired/still possible.  On success, ref locks for each entry should be acquired in listed order (to prevent deadlocks), using transaction id and oldvalue.  It is never legal to unlock a ref in this state (because a block could cause the unlock to be delayed until the commit phase).  However, it is legal for any process to transition to abort at any time from this state, perhaps because of a failure to acquire a lock (held by another transaction), and definitely if a ref has changed (is no longer oldvalue).

2) To abort the transaction, rename the id file to id.abort.  This should only ever fail if commit was achieved first.  Once in this state, any process may/should unlock any ref locks belonging to this transaction id.  Once all refs are unlocked, id.abort may be deleted (it could be deleted earlier, but then cleanup will take longer).

3) To commit the transaction, rename the file to id.commit.  This should only ever fail if abort was achieved first. This transition should never be done until every listed ref is locked by the current transaction id.  Once in this phase, all refs may/should be moved to their new values and unlocked by any process. Once all refs are unlocked, id.commit may be deleted. 

Since any process attempting any of the work in these transactions could block at any time for an indefinite amount of time, these processes may wake after the transaction is aborted or comitted and the transaction files are cleaned up.  I believe that in these cases the only actions which could succeed by these waking processes is the ref locking action.  All such abandoned ref locks may/should be unlocked by any process.  This last rule means that no transaction ids should ever be reused,

-Martin


*1 We may want to adapt the simple model illustrated above to use git mechanisms such as refs to hold transaction info instead of files in a directory, and git submodule files to hold the list of refs to update.  

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum

  reply	other threads:[~2012-12-29 21:35 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick [this message]
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
2013-01-04 18:01           ` Martin Fick
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` Martin Fick

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=befe9a89-8f71-437e-ad33-c4dc4ee90507@email.android.com \
    --to=mfick@codeaurora.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    --cc=sop@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).