From: Martin Fick <mfick@codeaurora.org>
To: "Pyeron, Jason J CTR (US)" <jason.j.pyeron.ctr@mail.mil>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Fri, 4 Jan 2013 11:01:02 -0700 [thread overview]
Message-ID: <201301041101.02756.mfick@codeaurora.org> (raw)
In-Reply-To: <871B6C10EBEFE342A772D1159D1320853A011469@umechphj.easf.csd.disa.mil>
On Friday, January 04, 2013 10:52:43 am Pyeron, Jason J
CTR (US) wrote:
> > From: Martin Fick
> > Sent: Thursday, January 03, 2013 6:53 PM
> >
> > Any thoughts on this idea? Is it flawed? I am
trying
> > to write it up in a more formal generalized manner
and
> > was hoping to get at least one "it seems sane"
before
> > I do.
>
> If you are assuming that atomic renames, etc. are
> available, then you should identify a test case and a
> degrade operation path when it is not available.
Thanks, sound reasonable. Where you thinking a runtime
test case that would be run before every transaction? I
was anticipating a per repo config option called
something like "core.locks = recoverable" that would be
needed to turn them on? I was thinking that this was
something that server sites could test in advance on
their repos and then enable it for them. Maybe a git-
lock tool with a --test-recoverable option?
-Martin
> >
> > On Monday, December 31, 2012 03:30:53 am Martin Fick
wrote:
> > > On Thursday, December 27, 2012 04:11:51 pm Martin
> > > Fick
> >
> > wrote:
> > > > It concerns me that git uses any locking at all,
> > > > even for refs since it has the potential to
leave
> > > > around stale locks.
> > > > ...
> > > > [a previous not so great attempt to fix this]
> > > > ...
> > >
> > > I may have finally figured out a working loose ref
> > > update mechanism which I think can avoid stale
> > > locks. Unfortunately it requires atomic directory
> > > renames and universally unique identifiers
(uuids).
> > > These may be no-go criteria? But I figure it is
> > > worth at least exploring this idea because of the
> > > potential benefits?
> > >
> > > The general approach is to setup a transaction and
> > > either commit or abort it. A transaction can be
> > > setup by renaming an appropriately setup directory
> > > to the "ref.lock" name. If the rename succeeds,
the
> > > transaction is begun. Any actor can abort the
> > > transaction (up until it is committed) by simply
> > > deleting the "ref.lock" directory, so it is not at
> > > risk of going stale. However, once the actor who
> > > sets up the transaction commits it, deleting the
> > > "ref.lock" directory simply aids in cleaning it up
> > > for the next transaction (instead of aborting it).
> > >
> > > One important piece of the transaction is the use
of
> > > uuids. The uuids provide a mechanism to tie the
> > > atomic commit pieces to the transactions and thus
to
> > > prevent long sleeping process from inadvertently
> > > performing actions which could be out of date when
> > > they wake finally up. In each case, the atomic
> > > commit piece is the renaming of a file. For the
> > > create and update pieces, a file is renamed from
the
> > > "ref.lock" dir to the "ref" file resulting in an
> > > update to the sha for the ref. However, in the
> > > delete case, the "ref" file is instead renamed to
> > > end up in the "ref.lock" directory resulting in a
> > > delete of the ref. This scheme does not affect
the
> > > way refs are read today,
> > >
> > > To prepare for a transaction, an actor first
> > > generates a uuid (an exercise I will delay for
now).
> > > Next, a tmp directory named after the uuid is
> > > generated in the parent directory for the ref to
be
> > > updated, perhaps something like: ".lock_uuid". In
> > > this directory is places either a file or a
> > > directory named after the uuid, something like:
> > > ".lock_uuid/,uuid". In the case of a create or an
> > > update, the new sha is written to this file. In
the
> > > case of a delete, it is a directory.
> > >
> > > Once the tmp directory is setup, the initiating
actor
> > > attempts to start the transaction by renaming the
tmp
> > > directory to "ref.lock". If the rename fails, the
> > > update fails. If the rename succeeds, the actor
can
> > > then attempt to commit the transaction (before
> > > another actor aborts it).
> > >
> > > In the case of a create, the actor verifies that
> > > "ref" does not currently exist, and then renames
the
> > > now named "ref.lock/uuid" file to "ref". On
success,
> > > the ref was created.
> > >
> > > In the case of an update, the actor verifies that
> > > "ref" currently contains the old sha, and then
also
> > > renames the now named "ref.lock/uuid" file to
"ref".
> > > On success, the ref was updated.
> > >
> > > In the case of a delete, the actor may verify that
> > > "ref" currently contains the sha to "prune" if it
> > > needs to, and then renames the "ref" file to
> > > "ref.lock/uuid/delete". On success, the ref was
> > > deleted.
> > >
> > > Whether successful or not, the actor may now
simply
> > > delete the "ref.lock" directory, clearing the way
> > > for a new transaction. Any other actor may delete
> > > this directory at any time also, likely either on
> > > conflict (if they are attempting to initiate a
> > > transaction), or after a grace period just to
> > > cleanup the FS. Any actor may also safely cleanup
> > > the tmp directories, preferably also after a grace
> > > period.
> > >
> > > One neat part about this scheme is that I believe
it
> > > would be backwards compatible with the current
> > > locking mechanism since the transaction directory
> > > will simply appear to be a lock to older clients.
> > > And the old lock file should continue to lock out
> > > these newer transactions.
> > >
> > > Due to this backwards compatibility, I believe
that
> > > this could be incrementally employed today without
> > > affecting very much. It could be deployed in
place
> > > of any updates which only hold ref.locks to update
> > > the loose ref. So for example I think it could
> > > replace step 4a below from Michael Haggerty's
> > > description of today's loose ref pruning during
> > >
> > > ref packing:
> > > > * Pack references:
> > > ...
> > >
> > > > 4. prune_refs(): for each ref in the
ref_to_prune
> > > > list,
> > > >
> > > > call prune_ref():
> > > > a. Lock the reference using lock_ref_sha1(),
> > > > verifying that the recorded SHA1 is still
> > > > valid. If it is, unlink the loose reference
> > > > file then free the lock; otherwise leave the
> > > > loose reference file untouched.
> > >
> > > I think it would also therefore be able to replace
> > > the loose ref locking in Michael's new ref-packing
> > > scheme as well as the locking in Michael's new ref
> > > deletion scheme (again steps
> > >
> > > 4):
> > > > * Delete reference foo:
> > > ...
> > >
> > > > 4. Delete loose ref for "foo":
> > > > a. Acquire the lock
> > > > $GIT_DIR/refs/heads/foo.lock
> > > >
> > > > b. Unlink $GIT_DIR/refs/heads/foo if it is
> > > > unchanged.
> > > >
> > > > If it is changed, leave it untouched. If it is
> > > > deleted,
> > > >
> > > > that is OK too.
> > > >
> > > > c. Release lock
$GIT_DIR/refs/heads/foo.lock
> > >
> > > ...
> > >
> > > > * Pack references:
> > > ...
> > >
> > > > 4. prune_refs(): for each ref in the
ref_to_prune
> > > > list,
> > > >
> > > > call prune_ref():
> > > > a. Lock the loose reference using
> > > > lock_ref_sha1(),
> > > >
> > > > verifying that the recorded SHA1 is still valid
> > > >
> > > > b. If it is, unlink the loose reference
file
> > > >
> > > > (otherwise, leave it untouched)
> > > >
> > > > c. Release the lock on the loose reference
> > >
> > > To be honest, I suspect I missed something obvious
> > > because this seems almost too simple to work. I
am
> > > ashamed that it took me so long to come up with
(of
> > > course, I will be even more ashamed :( when it is
> > > shown to be flawed!) This scheme also feels
> > > extensible. if there are no obvious flaws in it, I
> > > will try to post solutions for ref packing and for
> > > multiple repository/ref transactions also soon.
> > >
> > > I welcome any comments/criticisms,
> > >
> > > -Martin
> > > --
> > > To unsubscribe from this list: send the line
> > > "unsubscribe git" in the body of a message to
> > > majordomo@vger.kernel.org More majordomo info at
> > > http://vger.kernel.org/majordomo-info.html
> >
> > --
> > To unsubscribe from this list: send the line
> > "unsubscribe git" in the body of a message to
> > majordomo@vger.kernel.org More majordomo info at
> > http://vger.kernel.org/majordomo-info.html
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation
next prev parent reply other threads:[~2013-01-04 18:01 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-21 8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26 8:24 ` Michael Haggerty
2012-12-27 23:11 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50 ` Martin Fick
2012-12-28 17:15 ` Lockless Refs? Junio C Hamano
2012-12-29 8:16 ` Jeff King
2012-12-29 21:15 ` Martin Fick
2012-12-29 8:12 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29 ` Martin Fick
2012-12-28 16:58 ` Lockless Refs? Junio C Hamano
2012-12-29 1:07 ` Martin Fick
2012-12-29 8:10 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18 ` Martin Fick
2012-12-30 17:03 ` Martin Fick
2012-12-31 10:30 ` Martin Fick
2013-01-03 23:52 ` Martin Fick
2013-01-04 17:52 ` Pyeron, Jason J CTR (US)
2013-01-04 18:01 ` Martin Fick [this message]
2013-01-04 21:28 ` Lockless Refs? Junio C Hamano
2013-01-05 16:12 ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22 4:31 ` Drew Northup
2012-12-29 7:16 ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
[not found] ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14 ` Martin Fick
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201301041101.02756.mfick@codeaurora.org \
--to=mfick@codeaurora.org \
--cc=git@vger.kernel.org \
--cc=jason.j.pyeron.ctr@mail.mil \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).