* [PATCH] refs: do not use cached refs in repack_without_ref
@ 2012-12-21  8:04 Jeff King
  2012-12-26  8:24 ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-12-21  8:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

When we delete a ref that is packed, we rewrite the whole
packed-refs file and simply omit the ref that no longer
exists. However, we base the rewrite on whatever happens to
be in our refs cache, not what is necessarily on disk. That
opens us up to a race condition if another process is
simultaneously packing the refs, as we will overwrite their
newly-made pack-refs file with our potentially stale data,
losing commits.

You can demonstrate the race like this:

  # setup some repositories
  git init --bare parent &&
  (cd parent && git config core.logallrefupdates true) &&
  git clone parent child &&
  (cd child && git commit --allow-empty -m base)

  # in one terminal, repack the refs repeatedly
  cd parent &&
  while true; do
	git pack-refs --all
  done

  # in another terminal, simultaneously push updates to
  # master, and create and delete an unrelated ref
  cd child &&
  while true; do
	git push origin HEAD:newbranch &&
	git commit --allow-empty -m foo
	us=`git rev-parse master` &&
	git push origin master &&
	git push origin :newbranch &&
	them=`git --git-dir=../parent rev-parse master` &&
	if test "$them" != "$us"; then
		echo >&2 "$them" != "$us"
		exit 1
	fi
  done

In many cases the two processes will conflict over locking
the packed-refs file, and the deletion of newbranch will
simply fail.  But eventually you will hit the race, which
happens like this:

  1. We push a new commit to master. It is already packed
     (from the looping pack-refs call). We write the new
     value (let us call it B) to $GIT_DIR/refs/heads/master,
     but the old value (call it A) remains in the
     packed-refs file.

  2. We push the deletion of newbranch, spawning a
     receive-pack process. Receive-pack advertises all refs
     to the client, causing it to iterate over each ref; it
     caches the packed refs in memory, which points at the
     stale value A.

  3. Meanwhile, a separate pack-refs process is running. It
     runs to completion, updating the packed-refs file to
     point master at B, and deleting $GIT_DIR/refs/heads/master
     which also pointed at B.

  4. Back in the receive-pack process, we get the
     instruction to delete :newbranch. We take a lock on
     packed-refs (which works, as the other pack-refs
     process has already finished). We then rewrite the
     contents using the cached refs, which contain the stale
     value A.

The resulting packed-refs file points master once again at
A. The loose ref which would override it to point at B was
deleted (rightfully) in step 3. As a result, master now
points at A. The only trace that B ever existed in the
parent is in the reflog: the final entry will show master
moving from A to B, even though the ref still points at A
(so you can detect this race after the fact, because the
next reflog entry will move from A to C).

We can fix this by invalidating the packed-refs cache after
we have taken the lock. This means that we will re-read the
packed-refs file, and since we have the lock, we will be
sure that what we read will be atomically up-to-date when we
write (it may be out of date with respect to loose refs, but
that is OK, as loose refs take precedence).

Signed-off-by: Jeff King <peff@peff.net>
---
We actually see this in practice on GitHub, though it is relatively
rare (I've been chasing reports for a while, and in a very busy repo, it
can happen every couple of weeks; this is probably due to the fact that
we run "git gc" very frequently).

There are a few other interesting races in this code that this does not
fix:

  1. We check to see whether the ref is packed based on the cached data.
     That means that in the following sequence:

       a. receive-pack starts, caches packed refs; master is not packed

       b. meanwhile, pack-refs runs and packs master

       c. receive-pack deletes the loose ref for master (which might be
          a no-op if the pack-refs from (b) got there first). It checks
          its cached packed-refs and sees that there is nothing to
          delete.

     We end up leaving the entry in packed-refs. In other words, the
     deletion does not actually delete anything, but it still returns
     success.

     We could fix this by scanning the list of packed refs only after
     we've acquired the lock. The downside is that this would increase
     lock contention on packed-refs.lock. Right now, two deletions may
     conflict if they are deletions of packed refs. With this change,
     any two deletions might conflict, packed or not.

     If we work under the assumption that deletions are relatively rare,
     this is probably OK. And if you tend to keep your refs packed, it
     would not make any difference. It would have an impact on repos
     which do not pack refs, and which have frequent simultaneous
     deletions.

  2. The delete_ref function first deletes the loose ref, then rewrites
     the packed-refs file. This means that for a moment, the ref may
     appear to have rewound to whatever was in the packed-refs file, and
     the reader has no way of knowing.

     This is not a huge deal, but I think it could be fixed by swapping
     the ordering. However, I think that would open us up to the reverse
     race from above: we delete from packed-refs, then before we delete
     the loose ref, a pack-refs process repacks it. Our deletion looks
     successful, but the ref remains afterwards.

I fixed just the race I did because it does not (as far as I can tell)
have any downsides. And it is way more severe (the other ones are "a
deleted ref might come back", whereas the fixed one will actually lose
commits).
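
For reference, here is a rough sketch of the alternative described in
point (1): checking the packed refs only after packed-refs.lock is held.
The lock acquisition is paraphrased in a comment (it would be unchanged
from the current code); rollback_lock_file() is the existing lockfile
API, and everything else appears in the patch below. This is an
untested illustration, not part of the patch:

  static int repack_without_ref(const char *refname)
  {
          struct repack_without_ref_sb data;
          struct ref_cache *refs = get_ref_cache(NULL);
          struct ref_dir *packed;

          data.refname = refname;

          /*
           * ... take packlock on git_path("packed-refs") exactly as the
           * existing code does, erroring out if that fails ...
           */

          /*
           * Only now, with the lock held, decide whether there is
           * anything to delete, reading fresh data from disk.
           */
          clear_packed_ref_cache(refs);
          packed = get_packed_refs(refs);
          if (find_ref(packed, refname) == NULL) {
                  rollback_lock_file(&packlock);
                  return 0;
          }

          do_for_each_ref_in_dir(packed, 0, "", repack_without_ref_fn, 0, 0, &data);
          return commit_lock_file(&packlock);
  }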

 refs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 6cec1c8..541fec2 100644
--- a/refs.c
+++ b/refs.c
@@ -1744,7 +1744,8 @@ static int repack_without_ref(const char *refname)
 static int repack_without_ref(const char *refname)
 {
 	struct repack_without_ref_sb data;
-	struct ref_dir *packed = get_packed_refs(get_ref_cache(NULL));
+	struct ref_cache *refs = get_ref_cache(NULL);
+	struct ref_dir *packed = get_packed_refs(refs);
 	if (find_ref(packed, refname) == NULL)
 		return 0;
 	data.refname = refname;
@@ -1753,6 +1754,8 @@ static int repack_without_ref(const char *refname)
 		unable_to_lock_error(git_path("packed-refs"), errno);
 		return error("cannot delete '%s' from packed refs", refname);
 	}
+	clear_packed_ref_cache(refs);
+	packed = get_packed_refs(refs);
 	do_for_each_ref_in_dir(packed, 0, "", repack_without_ref_fn, 0, 0, &data);
 	return commit_lock_file(&packlock);
 }
-- 
1.8.1.rc2.6.g05591da


* Re: [PATCH] refs: do not use cached refs in repack_without_ref
  2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
@ 2012-12-26  8:24 ` Michael Haggerty
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
  2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Michael Haggerty @ 2012-12-26  8:24 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

On 12/21/2012 09:04 AM, Jeff King wrote:
> When we delete a ref that is packed, we rewrite the whole
> packed-refs file and simply omit the ref that no longer
> exists. However, we base the rewrite on whatever happens to
> be in our refs cache, not what is necessarily on disk. That
> opens us up to a race condition if another process is
> simultaneously packing the refs, as we will overwrite their
> newly-made pack-refs file with our potentially stale data,
> losing commits.
> [...]
> 
> There are a few other interesting races in this code that this does not
> fix:
> 
>   1. We check to see whether the ref is packed based on the cached data.
>      That means that in the following sequence:
> 
>        a. receive-pack starts, caches packed refs; master is not packed
> 
>        b. meanwhile, pack-refs runs and packs master
> 
>        c. receive-pack deletes the loose ref for master (which might be
>           a no-op if the pack-refs from (b) got there first). It checks
>           its cached packed-refs and sees that there is nothing to
>           delete.
> 
>      We end up leaving the entry in packed-refs. In other words, the
>      deletion does not actually delete anything, but it still returns
>      success.
> 
>      We could fix this by scanning the list of packed refs only after
>      we've acquired the lock. The downside is that this would increase
>      lock contention on packed-refs.lock. Right now, two deletions may
>      conflict if they are deletions of packed refs. With this change,
>      any two deletions might conflict, packed or not.
> 
>      If we work under the assumption that deletions are relatively rare,
>      this is probably OK. And if you tend to keep your refs packed, it
>      would not make any difference. It would have an impact on repos
>      which do not pack refs, and which have frequent simultaneous
>      deletions.
> 
>   2. The delete_ref function first deletes the loose ref, then rewrites
>      the packed-refs file. This means that for a moment, the ref may
>      appear to have rewound to whatever was in the packed-refs file, and
>      the reader has no way of knowing.
> 
>      This is not a huge deal, but I think it could be fixed by swapping
>      the ordering. However, I think that would open us up to the reverse
>      race from above: we delete from packed-refs, then before we delete
>      the loose ref, a pack-refs process repacks it. Our deletion looks
>      successful, but the ref remains afterwards.

I'm sorry to take so long to respond to this patch.  Thank you for
tracking down this bug and for your careful analysis.

I think your patch is correct and should fix the first race condition
that you described.  But I think the continued existence of the other
race conditions is an indication of a fundamental problem with the
reference locking policy--independent of the in-RAM reference cache.

The tacit assumption of the current locking policy is that changes to
the packed-refs file are not critical for correctness, because loose
references take precedence over it anyway.  This is true for adding and
modifying references.  But it is not true for deleting references,
because there is no way for a deletion to be represented as a loose
reference in a way that takes precedence over the packed-refs file
(i.e., there is no way for a loose reference to say "I am deleted,
regardless of what packed-refs says").  Thus the race conditions for
deleting references, whether via delete_ref() or via pack_refs() with
--prune.

The current algorithms for deleting references are:

* Delete reference foo:

  1. Acquire the lock $GIT_DIR/refs/heads/foo.lock

  2. Unlink $GIT_DIR/refs/heads/foo

  3. repack_without_ref():

     a. Acquire the lock $GIT_DIR/packed-refs.lock

     b. Write the packed-refs without the "foo" reference to
        $GIT_DIR/packed-refs.lock

     c. Rename $GIT_DIR/packed-refs.lock to $GIT_DIR/packed-refs

  4. Release lock $GIT_DIR/refs/heads/foo.lock

* Pack references:

  1. Acquire lock $GIT_DIR/packed-refs.lock

  2. for_each_ref() call handle_one_ref():

     a. Write ref and its SHA1 to $GIT_DIR/packed-refs.lock

     b. Possibly record ref and its SHA1 in the refs_to_prune list.

  3. Commit $GIT_DIR/packed-refs

  4. prune_refs(): for each ref in the ref_to_prune list, call
     prune_ref():

     a. Lock the reference using lock_ref_sha1(), verifying that the
        recorded SHA1 is still valid.  If it is, unlink the loose
        reference file then free the lock; otherwise leave the loose
        reference file untouched.

There is a problem if two processes try to delete a reference at the
same time, or if one process tries to delete a reference at the same
time as another process is trying to pack the references.  The reason is
that there is no "transaction" that spans both the rewriting of the
packed-refs file and also the deletion of the loose-ref files, and
therefore it is possible for conflicting changes to be made in the two
locations.

I think that all of the problems would be fixed if a lock would be held
on the packed-refs file during the whole process of deleting any
reference; i.e., change the algorithms to:

* Delete reference foo:

  1. Acquire the lock $GIT_DIR/packed-refs.lock (regardless of whether
     "foo" is a packed ref)

  2. Write to $GIT_DIR/packed-refs.new a version of the packed-refs
     file that omits "foo"

  3. Atomically replace $GIT_DIR/packed-refs with
     $GIT_DIR/packed-refs.new (but without relinquishing the lock
     $GIT_DIR/packed-refs.lock)

  4. Delete loose ref for "foo":

     a. Acquire the lock $GIT_DIR/refs/heads/foo.lock

     b. Unlink $GIT_DIR/refs/heads/foo if it is unchanged.  If it is
        changed, leave it untouched.  If it is deleted, that is OK too.

     c. Release lock $GIT_DIR/refs/heads/foo.lock

  5. Release lock $GIT_DIR/packed-refs.lock (without changing
     $GIT_DIR/packed-refs again)

* Pack references:

  1. Acquire lock $GIT_DIR/packed-refs.lock

  2. for_each_ref() call handle_one_ref():

     a. Write ref and its SHA1 to $GIT_DIR/packed-refs.new

     b. Possibly record ref and its SHA1 in the refs_to_prune list.

  3. Atomically replace $GIT_DIR/packed-refs with
     $GIT_DIR/packed-refs.new (but without relinquishing the lock
     $GIT_DIR/packed-refs.lock)

  4. prune_refs(): for each ref in the ref_to_prune list, call
     prune_ref():

     a. Lock the loose reference using lock_ref_sha1(), verifying that
        the recorded SHA1 is still valid

     b. If it is, unlink the loose reference file (otherwise, leave
        it untouched)

     c. Release the lock on the loose reference

  5. Release lock $GIT_DIR/packed-refs.lock (without changing
     $GIT_DIR/packed-refs again)
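
For concreteness, here is a minimal POSIX-level sketch of the proposed
"delete reference foo" sequence. The two helpers are hypothetical
placeholders (a real implementation would reuse the packed-refs writer
and the lock_ref_sha1()-style verification), and error handling is
reduced to the bare minimum:

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>      /* rename() */
  #include <unistd.h>     /* close(), unlink() */

  /* hypothetical helpers, not real git functions */
  int write_packed_refs_omitting(const char *newpath, const char *refname);
  int loose_ref_still_at(const char *loosepath, const char *expected_sha1);

  static int delete_foo_sketch(const char *expected_sha1)
  {
          int err = 0;

          /* 1. take packed-refs.lock, whether or not "foo" is packed */
          int lockfd = open("packed-refs.lock", O_CREAT | O_EXCL | O_WRONLY, 0666);
          if (lockfd < 0)
                  return -1;      /* someone else holds the lock */

          /* 2 + 3. write packed-refs.new without "foo" and switch it
           * live, while still holding packed-refs.lock */
          if (write_packed_refs_omitting("packed-refs.new", "refs/heads/foo") < 0 ||
              rename("packed-refs.new", "packed-refs") < 0)
                  err = -1;

          /* 4. drop the loose ref only if it still has the expected value
           * (the per-ref foo.lock dance is omitted from this sketch) */
          if (!err && loose_ref_still_at("refs/heads/foo", expected_sha1) &&
              unlink("refs/heads/foo") < 0 && errno != ENOENT)
                  err = -1;

          /* 5. release packed-refs.lock without touching packed-refs again */
          close(lockfd);
          unlink("packed-refs.lock");
          return err;
  }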

It is important that after step (3) in either of the above algorithms,
the new packed-refs file has been switched "live" even though there is
no way to guarantee that it holds the correct values for all references.
This is OK, because (a) references that have been added or changed will
be represented by loose references that take precedence over the stale
references in the packed-refs file; (b) no references can have been
deleted while the packed-refs file was being rewritten, because
reference deletion is serialized via the lock on the packed-refs file.
If one of the later steps fails, it is OK to leave this version of the
packed-refs file active.

The proposed algorithms will have to hold the lock on packed-refs for
much longer; in the case of pack_refs(), the lock has to be held for the
whole time that all of the loose references are being deleted.
Effectively it is being used to prevent other processes from deleting
references while it is working because that would make the just-written
packed-refs file invalid.

I would appreciate a critique of my analysis.  Even if you agree, I
think it would be OK to apply Peff's patch to fix up the most pressing
problem, then implement the more complete solution later.

By the way, this is something that I would be happy to add to my to-do
list, but it could take a while for me to get to it because of a lack of
time and because I'm still busy with two other biggish git-related
projects (git-multimail [1] and a git merging helper [2]).

Michael

[1] https://github.com/mhagger/git-multimail

[2] A fun project that I haven't yet mentioned on the list

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-26  8:24 ` Michael Haggerty
@ 2012-12-27 23:11   ` Martin Fick
  2012-12-28 14:50     ` Martin Fick
                       ` (3 more replies)
  2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
  1 sibling, 4 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-27 23:11 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, git, Junio C Hamano

On Wednesday, December 26, 2012 01:24:39 am Michael Haggerty 
wrote:
> ... lots of discussion about ref locking...

It concerns me that git uses any locking at all, even for 
refs since it has the potential to leave around stale locks. 

For a single user repo this is not a big deal, the lock can 
always be cleaned up manually (and it is a rare occurrence).  
However, in a multi user server environment, possibly even 
from multiple hosts over a shared filesystem such as NFS, 
stale locks could lead to serious downtime and risky recovery 
(since it is currently hard to figure out if a lock really is 
stale).  Even though stale locks are probably rare even today 
in the larger shared repo case, as git scales to even larger 
shared repositories, this will eventually become more of a 
problem *1.  Naturally, this has me thinking that git should 
possibly consider moving towards a lockless design for refs 
in the long term.

I realize this is hard and that git needs to support many 
different filesystems with different semantics.  I had an idea I 
think may be close to a functional lockless design for loose 
refs (one piece at a time) that I thought I should propose, 
just to get the ball rolling, even if it is just going to be 
found to be flawed (I realize that history suggests that such 
schemes usually are).  I hope that it does not make use of 
any semantics which are not currently expected from git of 
filesystems.  I think it relies only on the ability to rename 
a file atomically, and the ability to scan the contents of a 
directory reliably to detect the "ordered" existence of files.

My idea is based on using filenames to store sha1s instead of 
file contents.  To do this, the sha1 of a ref would be 
stored in a file in a directory named after the loose ref.  I 
believe this would then make it possible to have lockless 
atomic ref updates by renaming the file.

To more fully illustrate the idea, imagine that any file 
(except for the null file) in the directory will represent the 
value of the ref with its name, then the following 
transitions can represent atomic state changes to a ref's 
value and existence:

1) To update the value from a known value to a new value 
atomically, simply rename the file to the new value.  This 
operation should only succeed if the file exists and is still 
named old value before the rename.  This should even be 
faster than today's approach, especially on remote filesystems 
since it would require only 1 round trip in the success case 
instead of 3!

2) To delete the ref, simply delete the filename representing 
the current value of the ref.  This ensures that you are 
deleting the ref from a specific value.  I am not sure if git 
needs to be able to delete refs without knowing their values?  
If so, this would require reading the value and looping until 
the delete succeeds, this may be a bit slow for a constantly 
updated ref, but likely a rare situation (and not likely 
worse than trying to acquire the ref-lock today).  Overall, 
this again would likely be faster than today's approach.

3) To create a ref, it must be renamed from the null file (sha 
0000...) to the new value just as if it were being updated 
from any other value, but there is one extra condition: 
before renaming the null file, a full directory scan must be 
done to ensure that the null file is the only file in the 
directory (this condition exists because creating the 
directory and null file cannot be atomic unless the filesystem 
supports atomic directory renames, an expectation git does 
not currently make).  I am not sure how this compares to 
today's approach, but including the setup costs (described 
below), I suspect it is slower.
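
A minimal sketch of transitions (1) and (2) in C, assuming the layout
just described (a directory per ref whose single non-null entry is
named after the current value); only POSIX rename() and unlink() are
needed, and the function names here are purely illustrative:

  #include <stdio.h>      /* rename(), snprintf() */
  #include <unistd.h>     /* unlink() */

  /* (1) compare-and-swap update: succeeds only if the ref directory
   * still contains a file named after the old value */
  static int cas_update_ref(const char *refdir,       /* "refs/heads/master" */
                            const char *old_sha1_hex,
                            const char *new_sha1_hex)
  {
          char oldpath[4096], newpath[4096];
          snprintf(oldpath, sizeof(oldpath), "%s/%s", refdir, old_sha1_hex);
          snprintf(newpath, sizeof(newpath), "%s/%s", refdir, new_sha1_hex);
          /* one round trip; fails with ENOENT if someone renamed it first */
          return rename(oldpath, newpath);
  }

  /* (2) delete the ref, but only from a known value */
  static int cas_delete_ref(const char *refdir, const char *old_sha1_hex)
  {
          char path[4096];
          snprintf(path, sizeof(path), "%s/%s", refdir, old_sha1_hex);
          return unlink(path);
  }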

While this outlines the state changes, some additional 
operations may be needed to setup some starting conditions 
and to clean things up. But these operations could/should be 
performed by any process/thread and would not cause any state 
changes to the ref existence or value.  For example, when 
creating a ref, the ref directory would need to be created 
and the null file needs to be created.  Whenever a null file is 
detected in the directory at the same time as another file, it 
should be deleted.   Whenever the directory is empty, it may 
be deleted (perhaps after a grace period to reduce retries 
during ref creation unless the process just deleted the ref).

I don't know how this new scheme could be made to work with 
the current scheme, it seems like perhaps new git releases 
could be made to understand both the old and the new, and a 
config option could be used to tell it which method to write 
new refs with.  Since in this new scheme ref directory names 
would conflict with old ref filenames, this would likely 
prevent both schemes from erroneously being used 
simultaneously (so they shouldn't corrupt each other), except 
for the fact that refs can be nested in directories which 
confuses things a bit.  I am not sure what a good solution to 
this is?

What did I miss, where are my flaws?  Does anyone else share 
my concern for stale locks?  How could we similarly eliminate 
locks for the packed-refs file?

-Martin


*1 We have been concerned with stale locks in the Gerrit 
community when trying to design atomic cross repository 
updates.  Of course, while a lockless solution eliminates 
stale locks, it might make it impossible to do atomic cross 
repository updates since all of our solutions so far need 
locks. :(


* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
@ 2012-12-28 14:50     ` Martin Fick
  2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
  2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
  2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-28 14:50 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, git, Junio C Hamano, Shawn Pearce

On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> On Wednesday, December 26, 2012 01:24:39 am Michael
> Haggerty
> 
> wrote:
> > ... lots of discussion about ref locking...
> 
> It concerns me that git uses any locking at all, even for
> refs since it has the potential to leave around stale
> locks.
> 
> For a single user repo this is not a big deal, the lock
> can always be cleaned up manually (and it is a rare
> occurrence). However, in a multi user server environment,
> possibly even from multiple hosts over a shared
> filesystem such as NFS, stale locks could lead to serious
> downtime and risky recovery (since it is currently hard
> to figure out if a lock really is stale).  Even though
> stale locks are probably rare even today in the larger
> shared repo case, as git scales to even larger shared
> repositories, this will eventually become more of a
> problem *1.  Naturally, this has me thinking that git
> should possibly consider moving towards a lockless design
> for refs in the long term.
> 
> I realize this is hard and that git needs to support many
> different filesystems with different semantics.  I had an
> idea I think may be close to a functional lockless design
> for loose refs (one piece at a time) that I thought I
> should propose, just to get the ball rolling, even if it
> is just going to be found to be flawed (I realize that
> history suggests that such schemes usually are).  I hope
> that it does not make use of any semantics which are not
> currently expected from git of filesystems.  I think it
> relies only on the ability to rename a file atomically,
> and the ability to scan the contents of a directory
> reliably to detect the "ordered" existence of files.
> 
> My idea is based on using filenames to store sha1s instead
> of file contents.  To do this, the sha1 of a ref 
> would be stored in a file in a directory named after the
> loose ref.  I believe this would then make it possible to
> have lockless atomic ref updates by renaming the file.
> 
> To more fully illustrate the idea, imagine that any file
> (except for the null file) in the directory will represent
> the value of the ref with its name, then the following
> transitions can represent atomic state changes to a refs
> value and existence:
> 
> 1) To update the value from a known value to a new value
> atomically, simply rename the file to the new value.  This
> operation should only succeed if the file exists and is
> still named old value before the rename.  This should
> even be faster than today's approach, especially on
> remote filesystems since it would require only 1 round
> trip in the success case instead of 3!
> 
> 2) To delete the ref, simply delete the filename
> representing the current value of the ref.  This ensures
> that you are deleting the ref from a specific value.  I
> am not sure if git needs to be able to delete refs
> without knowing their values? If so, this would require
> reading the value and looping until the delete succeeds,
> this may be a bit slow for a constantly updated ref, but
> likely a rare situation (and not likely worse than trying
> to acquire the ref-lock today).  Overall, this again
> would likely be faster than today's approach.
> 
> 3) To create a ref, it must be renamed from the null file
> (sha 0000...) to the new value just as if it were being
> updated from any other value, but there is one extra
> condition: before renaming the null file, a full
> directory scan must be done to ensure that the null file
> is the only file in the directory (this condition exists
> because creating the directory and null file cannot be
> atomic unless the filesystem supports atomic directory
> renames, an expectation git does not currently make).  I
> am not sure how this compares to today's approach, but
> including the setup costs (described below), I suspect it
> is slower.
> 
> While this outlines the state changes, some additional
> operations may be needed to setup some starting conditions
> and to clean things up. But these operations could/should
> be performed by any process/thread and would not cause
> any state changes to the ref existence or value.  For
> example, when creating a ref, the ref directory would
> need to be created and the null file needs to be created.
>  Whenever a null file is detected in the directory at the
> same time as another file, it should be deleted.  
> Whenever the directory is empty, it may be deleted
> (perhaps after a grace period to reduce retries during
> ref creation unless the process just deleted the ref).
> 
> I don't know how this new scheme could be made to work
> with the current scheme, it seems like perhaps new git
> releases could be made to understand both the old and the
> new, and a config option could be used to tell it which
> method to write new refs with.  Since in this new scheme
> ref directory names would conflict with old ref
> filenames, this would likely prevent both schemes from
> erroneously being used
> simultaneously (so they shouldn't corrupt each other),
> except for the fact that refs can be nested in
> directories which confuses things a bit.  I am not sure
> what a good solution to this is?
> 
> What did I miss, where are my flaws?  Does anyone else
> share my concern for stale locks?  How could we similarly
> eliminate locks for the packed-refs file?
> 
> -Martin
> 
> 
> *1 We have been concerned with stale locks in the Gerrit
> community when trying to design atomic cross repository
> updates.  Of course, while a lockless solution eliminates
> stale locks, it might make it impossible to do atomic
> cross repository updates since all of our solutions so
> far need locks. :(

Hmm, actually I believe that with a small modification to the 
semantics described here it would be possible to make multi 
repo/branch commits work.   Simply allow the ref filename to 
be locked by a transaction by appending the transaction ID to 
the filename.  So if transaction 123 wants to lock master 
which points currently to abcde, then it will move 
master/abcde to master/abcde_123.  If transaction 123 is 
designed so that any process can commit/complete/abort it 
without requiring any locks which can go stale, then this ref 
lock will never go stale either (easy as long as it writes 
all its proposed updates somewhere upfront and has atomic 
semantics for starting, committing and aborting).  On commit, 
the ref lock gets updated to its new value: master/newsha and 
on abort it gets unlocked: master/abcde.
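
As a purely illustrative fragment (shortened sha1s, made-up transaction
id), the lock, commit and abort steps for one ref are each a single
rename():

  /* transaction 123 locks master, which currently points at abcde: */
  rename("refs/heads/master/abcde", "refs/heads/master/abcde_123");

  /* on commit, move the ref to its new value (this also unlocks it): */
  rename("refs/heads/master/abcde_123", "refs/heads/master/newsha");

  /* on abort, simply restore the plain value-named file: */
  rename("refs/heads/master/abcde_123", "refs/heads/master/abcde");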

Shawn talked about adding multi repo/branch transaction 
semantics to jgit, this might be something that git wants to 
support also at some point?

-Martin


* Re: Lockless Refs?
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
  2012-12-28 14:50     ` Martin Fick
@ 2012-12-28 16:58     ` Junio C Hamano
  2012-12-29  1:07       ` Martin Fick
  2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
  2012-12-31 10:30     ` Martin Fick
  3 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-12-28 16:58 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, Jeff King, git

Martin Fick <mfick@codeaurora.org> writes:

> 3) To create a ref, it must be renamed from the null file (sha 
> 0000...) to the new value just as if it were being updated 
> from any other value, but there is one extra condition: 
> before renaming the null file, a full directory scan must be 
> done to ensure that the null file is the only file in the 
> directory...

While you are scanning this directory to make sure it is empty, I am
contemplating to create the same ref with a different value.  You
finished checking but haven't created the null. I have also scanned,
created the null and renamed it to my value.  Now you try to create
the null, succeed, and then rename.  We won't know which of the two
non-null values are valid, but worse yet, I think one of them should
have failed in the first place.

Sounds like we would need some form of locking around here.  Is your
goal "no locks", or "less locks"?

> I don't know how this new scheme could be made to work with 
> the current scheme,...

It is much more important to know if/why yours is better than the
current scheme in the first place.  Without an analysis on how the
new scheme interacts with the packed refs and gives better
behaviour, that is kinda difficult.

I think transition plans can wait until that is done.  If it is not
even marginally better, we do not have to worry about transitioning
at all.  If it is only marginally better, the transition has to be
designed to be no impact to the existing repositories.  If it is
vastly better, we might be able to afford a flag day.


* Re: Lockless Refs?
  2012-12-28 14:50     ` Martin Fick
@ 2012-12-28 17:15       ` Junio C Hamano
  2012-12-29  8:16         ` Jeff King
  2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-12-28 17:15 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, Jeff King, git, Shawn Pearce

Martin Fick <mfick@codeaurora.org> writes:

> Hmm, actually I believe that with a small modification to the 
> semantics described here it would be possible to make multi 
> repo/branch commits work....
>
> Shawn talked about adding multi repo/branch transaction 
> semantics to jgit, this might be something that git wants to 
> support also at some point?

Shawn may have talked about it and you may have listened to it, but
others wouldn't have any idea what kind of "multi repo/branch
transaction" you are talking about.  Is it about "I want to push
this ref to that repo and push this other ref to that other repo",
in what situation will it be used/useful, what are the failure
modes, what are failure tolerances by the expected use cases, ...?

Care to explain?


* Re: Lockless Refs?
  2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
@ 2012-12-29  1:07       ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-29  1:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Jeff King, git

On Friday, December 28, 2012 09:58:36 am Junio C Hamano 
wrote:
> Martin Fick <mfick@codeaurora.org> writes:
> > 3) To create a ref, it must be renamed from the null
> > file (sha 0000...) to the new value just as if it were
> > being updated from any other value, but there is one
> > extra condition: before renaming the null file, a full
> > directory scan must be done to ensure that the null
> > file is the only file in the directory...
> 
> While you are scanning this directory to make sure it is
> empty, 

The objective is not to scan for an empty dir, but to scan 
for the existence of only the null file.

> I am contemplating to create the same ref with a
> different value.  You finished checking but haven't
> created the null.

The scan needs to happen after creating the null, not before, 
so I don't believe the rest of the scenario below is possible 
then?

> I have also scanned, created the null
> and renamed it to my value.  Now you try to create the
> null, succeed, and then rename.  We won't know which of
> the two non-null values are valid, but worse yet, I think
> one of them should have failed in the first place.



> Sounds like we would need some form of locking around
> here.  Is your goal "no locks", or "less locks"?
(answered below)

> > I don't know how this new scheme could be made to work
> > with the current scheme,...
> 
> It is much more important to know if/why yours is better
> than the current scheme in the first place.  

The goal is: "no locks which do not have a clearly defined 
reliable recovery procedure".

Stale locks without a reliable recovery procedure will lead 
to stolen locks.  At this point it is only a matter of luck 
whether this leads to data loss or not.  So I do hope to 
convince people first that the current scheme is bad, not that 
my scheme is better!  My scheme was proposed to get people 
thinking that we may not have to use locks to get reliable 
updates.


> Without an
> analysis on how the new scheme interacts with the packed
> refs and gives better behaviour, that is kinda difficult.

Fair enough. I will attempt this if the basic idea seems at 
least sane?  I do hope that eventually the packed-refs piece 
and its locking will be reconsidered also; as Michael pointed 
out it has issues already.  So, I am hoping to get people 
thinking more about lockless approaches to all the pieces. I 
think I have some solutions to some of the other pieces also, 
but I don't want to overwhelm the discussion all at once 
(especially if my first piece is shown to be flawed, or if no 
one has any interest in eliminating the current locks?)

 
> I think transition plans can wait until that is done.  If
> it is not even marginally better, we do not have to worry
> about transitioning at all.  If it is only marginally
> better, the transition has to be designed to be no impact
> to the existing repositories.  If it is vastly better, we
> might be able to afford a flag day.

OK, makes sense, I jumped the gun a bit,

-Martin


* Re: [PATCH] refs: do not use cached refs in repack_without_ref
  2012-12-26  8:24 ` Michael Haggerty
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
@ 2012-12-29  7:16   ` Jeff King
       [not found]     ` <201301071109.12086.mfick@codeaurora.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-12-29  7:16 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Junio C Hamano

On Wed, Dec 26, 2012 at 09:24:39AM +0100, Michael Haggerty wrote:

> I'm sorry to take so long to respond to this patch.  Thank you for
> tracking down this bug and for your careful analysis.
> 
> I think your patch is correct and should fix the first race condition
> that you described.

Thanks for checking it over. I almost cc'd you, as I know you have been
working on ref caching. But I think that the problem is much older, as
we always cached the packed-refs list in memory.

> But I think the continued existence of the other race conditions is an
> indication of a fundamental problem with the reference locking
> policy--independent of the in-RAM reference cache.
> 
> The tacit assumption of the current locking policy is that changes to
> the packed-refs file are not critical for correctness, because loose
> references take precedence over it anyway.  This is true for adding and
> modifying references.  But it is not true for deleting references,
> because there is no way for a deletion to be represented as a loose
> reference in a way that takes precedence over the packed-refs file
> (i.e., there is no way for a loose reference to say "I am deleted,
> regardless of what packed-refs says").  Thus the race conditions for
> deleting references, whether via delete_ref() or via pack_refs() with
> --prune.

Yeah. It would be much nicer if we could just write the null sha1 or a
similar sentinel into the loose ref for "I am deleted". The problem
(besides backwards compatibility) is the usual D/F conflict. I cannot
delete "refs/heads/foo" and then create "refs/heads/foo/bar" if the old
ref file is in the way.

> There is a problem if two processes try to delete a reference at the
> same time, or if one process tries to delete a reference at the same
> time as another process is trying to pack the references.  The reason is
> that there is no "transaction" that spans both the rewriting of the
> packed-refs file and also the deletion of the loose-ref files, and
> therefore it is possible for conflicting changes to be made in the two
> locations.

Just to be clear, are you talking about the races I wrote about in my
other email? Or are there other races? I didn't (and still don't) see
any actual on-disk data loss races. Just ones where a reader may get an
old, packed value (which is still bad, but is less bad than one where a
write is lost).

> I think that all of the problems would be fixed if a lock would be held
> on the packed-refs file during the whole process of deleting any
> reference; i.e., change the algorithms to:

Yeah, I looked at that, too. In fact, before I had correctly analyzed
the problem, I thought that doing so would solve the problem I was
seeing (which I noticed was wrong when it did not pass my tests :) ).

From your description, I think you realize this, but I want to point out
for other readers: just updating the pack_refs side to call prune_refs
under the lock would create a deadlock with a simultaneous delete (which
takes the individual ref lock first, then the packed-refs lock). Of
course, I do not think git is capable of deadlock, as we typically just
die() instead of blocking on a lock. So maybe it does not matter so
much. :)

> * Delete reference foo:
> 
>   1. Acquire the lock $GIT_DIR/packed-refs.lock (regardless of whether
>      "foo" is a packed ref)

This suffers from the same problem I mentioned in my earlier email: we
create lock contention on packed-refs.lock when there are two
simultaneous deletes, even when the refs aren't packed. That might be an
acceptable tradeoff, though. It's only for deletion, not for update,
which is presumably rare. And it has to happen anyway when both refs are
packed.

>   2. Write to $GIT_DIR/packed-refs.new a version of the packed-refs
>      file that omits "foo"
> [...]

All of the further steps make sense. By deleting from packed-refs first,
we eliminate the read race-condition I mentioned in my earlier email.
The only downside is the possible increased lock contention on
packed-refs.lock.  I'm very tempted to go this route. It's better to be
correct and sometimes die on lock contention than to sometimes give the
wrong answer.

> I would appreciate a critique of my analysis.  Even if you agree, I
> think it would be OK to apply Peff's patch to fix up the most pressing
> problem, then implement the more complete solution later.

I think your analysis is correct, modulo the issue I mentioned. And I
agree that my patch is a good interim, as it fixes the worst case
(losing writes).

-Peff


* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
  2012-12-28 14:50     ` Martin Fick
  2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
@ 2012-12-29  8:10     ` Jeff King
  2012-12-29 22:18       ` Martin Fick
  2012-12-31 10:30     ` Martin Fick
  3 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-12-29  8:10 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, git, Junio C Hamano

On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:

> For a single user repo this is not a big deal, the lock can 
> always be cleaned up manually (and it is a rare occurrence).  
> However, in a multi user server environment, possibly even 
> from multiple hosts over a shared filesystem such as NFS, 
> stale locks could lead to serious downtime and risky recovery 
> (since it is currently hard to figure out if a lock really is 
> stale).  Even though stale locks are probably rare even today 
> in the larger shared repo case, as git scales to even larger 
> shared repositories, this will eventually become more of a 
> problem *1.  Naturally, this has me thinking that git should 
> possibly consider moving towards a lockless design for refs 
> in the long term.

FWIW, I am involved in cleaning up stale locks for a very large git
hosting site. It actually happens surprisingly little. I think it is
mostly because git holds actual locks for a very short period of time
(just enough to check that the value is unchanged from when we started a
lengthy operation, and then atomically write the new value).

So I agree it would be cool (and maybe open up new realms of
scalability) for git to be lockless, but in my experience, this isn't
that pressing a problem (and any solutions are not going to be backwards
compatible, so there is going to be a high deployment cost).

> My idea is based on using filenames to store sha1s instead of 
> file contents.  To do this, the sha1 of a ref would be 
> stored in a file in a directory named after the loose ref.  I 
> believe this would then make it possible to have lockless 
> atomic ref updates by renaming the file.
> 
> To more fully illustrate the idea, imagine that any file 
> (except for the null file) in the directory will represent the 
> value of the ref with its name, then the following 
> transitions can represent atomic state changes to a ref's 
> value and existence:

Hmm. So basically you are relying on atomic rename() to move the value
around within a directory, rather than using write to move it around
within a file. Atomic rename is usually something we have on local
filesystems (and I think we rely on it elsewhere). Though I would not be
surprised if it is not atomic on all networked filesystems (though it is
on NFS, at least).

> 1) To update the value from a known value to a new value 
> atomically, simply rename the file to the new value.  This 
> operation should only succeed if the file exists and is still 
> named old value before the rename.  This should even be 
> faster than today's approach, especially on remote filesystems 
> since it would require only 1 round trip in the success case 
> instead of 3!

OK. Makes sense.

> 2) To delete the ref, simply delete the filename representing 
> the current value of the ref.  This ensures that you are 
> deleting the ref from a specific value.  I am not sure if git 
> needs to be able to delete refs without knowing their values?  
> If so, this would require reading the value and looping until 
> the delete succeeds, this may be a bit slow for a constantly 
> updated ref, but likely a rare situation (and not likely 
> worse than trying to acquire the ref-lock today).  Overall, 
> this again would likely be faster than today's approach.

We do sometimes delete without knowing the value. In most cases we would
not want to do this, but for some "force"-type commands, we do. You
would actually have the same problem with updating above, as we
sometimes update with the intent to overwrite whatever is there.

> 3) To create a ref, it must be renamed from the null file (sha 
> 0000...) to the new value just as if it were being updated 
> from any other value, but there is one extra condition: 
> before renaming the null file, a full directory scan must be 
> done to ensure that the null file is the only file in the 
> directory (this condition exists because creating the 
> directory and null file cannot be atomic unless the filesystem 
> supports atomic directory renames, an expectation git does 
> not currently make).  I am not sure how this compares to 
> today's approach, but including the setup costs (described 
> below), I suspect it is slower.

Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
create the correct sha1 file?  A simultaneous creator would fail on the
mkdir and abort. A simultaneous reader might see the directory, but it
would either see it as empty, or with the correct file. In the former
case, it would treat that the same as if the directory did not exist.
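
A minimal sketch of that creation path (illustrative names, error
handling trimmed):

  #include <fcntl.h>
  #include <stdio.h>      /* snprintf() */
  #include <sys/stat.h>   /* mkdir() */
  #include <unistd.h>     /* close() */

  static int create_ref_sketch(const char *refdir,     /* "refs/heads/foo" */
                               const char *sha1_hex)
  {
          char path[4096];
          int fd;

          if (mkdir(refdir, 0777) < 0)
                  return -1;      /* lost the race; someone else is creating it */

          /* the file's name carries the value; its contents stay empty */
          snprintf(path, sizeof(path), "%s/%s", refdir, sha1_hex);
          fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0666);
          if (fd < 0)
                  return -1;
          return close(fd);
  }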

Speaking of which, you did not cover reading at all, but it would have
to be:

  dh = opendir(ref);
  if (!dh) {
          if (errno == ENOENT)
                  return 0; /* no such ref */
          else
                  return error("couldn't read ref");
  }

  while ((ent = readdir(dh))) {
          if (ent->d_name[0] == '.')
                  /*
                   * skip "." and "..", and leave room for annotating 
                   * refs via dot-files
                   */
                   continue;
          /* otherwise, we found it */
          if (get_sha1_hex(ent->d_name, sha1) < 0)
                  return error("weird junk in ref dir?");
          return 1; /* found it */
  }
  return 0; /* did not contain an entry; ref being created? Retry? */


Is readdir actually atomic with respect to directory updates? That is,
if I am calling readdir() and somebody else is renaming, what do I get?
POSIX says:

   If a file is removed from or added to the directory after the most
   recent call to opendir() or rewinddir(), whether a subsequent call to
   readdir() returns an entry for that file is unspecified.

If I get one or the other file (that is, the old name or the new one),
it is OK. It does not matter which, as it is a race whether I see the
old value or the new one during an update. But according to POSIX, it is
possible that I may see neither.

I suppose we could rewinddir() and retry. We might hit the race again
(if somebody else is updating quickly), but realistically, this will
happen very infrequently, and we can just keep trying until we win the
race and get a valid read.
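
Continuing the scan above (same dh/ent/sha1 variables), the retry might
look something like this; the retry count is an arbitrary choice that
keeps a genuinely empty directory from looping forever:

  int tries;
  for (tries = 0; tries < 3; tries++) {
          rewinddir(dh);
          while ((ent = readdir(dh))) {
                  if (ent->d_name[0] == '.')
                          continue;
                  if (get_sha1_hex(ent->d_name, sha1) < 0)
                          return error("weird junk in ref dir?");
                  return 1; /* found it */
          }
          /* saw nothing; a concurrent rename may have hidden the entry */
  }
  return 0; /* still nothing; treat the ref as absent */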

> I don't know how this new scheme could be made to work with 
> the current scheme, it seems like perhaps new git releases 
> could be made to understand both the old and the new, and a 
> config option could be used to tell it which method to write 
> new refs with.  Since in this new scheme ref directory names 
> would conflict with old ref filenames, this would likely 
> prevent both schemes from erroneously being used 
> simultaneously (so they shouldn't corrupt each other), except 
> for the fact that refs can be nested in directories which 
> confuses things a bit.  I am not sure what a good solution to 
> this is?

I think you would need to bump core.repositoryformatversion, and just
never let old versions of git access the repository directly. Not the
end of the world, but it certainly increases deployment effort. If we
were going to do that, it would probably make sense to think about
solving the D/F conflict issues at the same time (i.e., start calling
"refs/heads/foo" in the filesystem "refs.d/heads.d/foo.ref" so that it
cannot conflict with "refs.d/heads.d/foo.d/bar.ref").

-Peff


* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-28 14:50     ` Martin Fick
  2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
@ 2012-12-29  8:12       ` Jeff King
  2012-12-29 21:29         ` Martin Fick
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-12-29  8:12 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, git, Junio C Hamano, Shawn Pearce

On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:

> Hmm, actually I believe that with a small modification to the 
> semantics described here it would be possible to make multi 
> repo/branch commits work.   Simply allow the ref filename to 
> be locked by a transaction by appending the transaction ID to 
> the filename.  So if transaction 123 wants to lock master 
> which points currently to abcde, then it will move 
> master/abcde to master/abcde_123.  If transaction 123 is 
> designed so that any process can commit/complete/abort it 
> without requiring any locks which can go stale, then this ref 
> lock will never go stale either (easy as long as it writes 
> all its proposed updates somewhere upfront and has atomic 
> semantics for starting, committing and aborting).  On commit, 
> the ref lock gets updated to its new value: master/newsha and 
> on abort it gets unlocked: master/abcde.

Hmm. I thought our goal was to avoid locks? Isn't this just locking by
another name?

I guess your point is to have no locks in the "normal" case, and have
locked transactions as an optional add-on?

-Peff


* Re: Lockless Refs?
  2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
@ 2012-12-29  8:16         ` Jeff King
  2012-12-29 21:15           ` Martin Fick
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-12-29  8:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Fick, Michael Haggerty, git, Shawn Pearce

On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:

> Martin Fick <mfick@codeaurora.org> writes:
> 
> > Hmm, actually I believe that with a small modification to the 
> > semantics described here it would be possible to make multi 
> > repo/branch commits work....
> >
> > Shawn talked about adding multi repo/branch transaction 
> > semantics to jgit, this might be something that git wants to 
> > support also at some point?
> 
> Shawn may have talked about it and you may have listened to it, but
> others wouldn't have any idea what kind of "multi repo/branch
> transaction" you are talking about.  Is it about "I want to push
> this ref to that repo and push this other ref to that other repo",
> in what situation will it be used/useful, what are the failure
> modes, what are failure tolerances by the expected use cases, ...?
> 
> Care to explain?

I cannot speak for Martin, but I am assuming the point is to atomically
update 2 (or more) refs on the same repo. That is, if I have a branch
"refs/heads/foo" and a ref pointing to meta-information (say, notes
about commits in foo, in "refs/notes/meta/foo"), I would want to "git
push" them, and only update them if _both_ will succeed, and otherwise
fail and update nothing.

I think Shawn mentioned this at the last GitTogether as a stumbling
block for pushing more of Gerrit's meta-information as refs over the git
protocol. But I might be mis-remembering.

-Peff


* Re: Lockless Refs?
  2012-12-29  8:16         ` Jeff King
@ 2012-12-29 21:15           ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-29 21:15 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Michael Haggerty, git, Shawn Pearce



Jeff King <peff@peff.net> wrote:

>On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:
>
>> Martin Fick <mfick@codeaurora.org> writes:
>> 
>> > Hmm, actually I believe that with a small modification to the 
>> > semantics described here it would be possible to make multi 
>> > repo/branch commits work....
>> >
>> > Shawn talked about adding multi repo/branch transaction 
>> > semantics to jgit, this might be something that git wants to 
>> > support also at some point?
>> 
>> Shawn may have talked about it and you may have listened to it, but
>> others wouldn't have any idea what kind of "multi repo/branch
>> transaction" you are talking about.  Is it about "I want to push
>> this ref to that repo and push this other ref to that other repo",
>> in what situation will it be used/useful, what are the failure
>> modes, what are failure tolerances by the expected use cases, ...?
>> 
>> Care to explain?
>
>I cannot speak for Martin, but I am assuming the point is to atomically
>update 2 (or more) refs on the same repo. That is, if I have a branch
>"refs/heads/foo" and a ref pointing to meta-information (say, notes
>about commits in foo, in "refs/notes/meta/foo"), I would want to "git
>push" them, and only update them if _both_ will succeed, and otherwise
>fail and update nothing.

My use case was cross repo/branch dependencies in Gerrit (which do not
yet exist). Users want to be able to define several changes (destined
for different projects/branches) which can only be merged together.  If
one change cannot be merged, the others should fail too.  The solutions
we can think of generally need to hold ref locks while acquiring more
ref locks, which drastically increases the opportunities for stale locks
over the simple "lock, check, update, unlock" mode that git locks are
currently used for.

I was perhaps making too big of a leap to assume that there would be
other non-Gerrit use cases for this?  I assumed that other git projects
which are spread across several git repos would need this?  But maybe
this simply wouldn't be practical with other git server solutions?

-Martin

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum


* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
@ 2012-12-29 21:29         ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-29 21:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano, Shawn Pearce

Jeff King <peff@peff.net> wrote:

>On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:
>
>> Hmm, actually I believe that with a small modification to the 
>> semantics described here it would be possible to make multi 
>> repo/branch commits work.   Simply allow the ref filename to 
>> be locked by a transaction by appending the transaction ID to 
>> the filename.  So if transaction 123 wants to lock master 
>> which points currently to abcde, then it will move 
>> master/abcde to master/abcde_123.  If transaction 123 is 
>> designed so that any process can commit/complete/abort it 
>> without requiring any locks which can go stale, then this ref 
>> lock will never go stale either (easy as long as it writes 
>> all its proposed updates somewhere upfront and has atomic 
>> semantics for starting, committing and aborting).  On commit, 
>> the ref lock gets updated to its new value: master/newsha and 
>> on abort it gets unlocked: master/abcde.
>
>Hmm. I thought our goal was to avoid locks? Isn't this just locking by
>another name?

It is a lock, but it is a lock with an owner: the transaction.  If the
transaction has reliable recovery semantics, then the lock will be
recoverable also.  This is possible if we have lock ownership (the
transaction), which does not exist today for the ref locks.  With good
lock ownership we gain the ability to reliably delete locks for a
specific owner without the risk of deleting the lock when held by
another owner (putting the owner in the filename is "good", while
putting the owner in the file contents is not).  Lastly, for reliable
recovery of stale locks we need the ability to determine when an owner
has abandoned a lock.  I believe that the transaction semantics laid
out below give this.


>I guess your point is to have no locks in the "normal" case, and have
>locked transactions as an optional add-on?

Basically.  If we design the transaction into the git semantics we could ensure that it is recoverable and we should not need to expose these reflocks outside of the transaction APIs.

To illustrate a simple transaction approach (borrowing some of Shawn's ideas), we could designate a directory to hold transaction files *1.  To prepare a transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new (in a stable sorted order based on repo:ref to prevent deadlocks).  This is not a state change and thus this file could be deleted by any process at anytime (preferably after a long grace period).

If file renames are atomic on the filesystem holding the transaction files, then 1, 2, and 3 below will be atomic state changes.  It does not matter who performs state transitions 2 or 3.  It does not matter who implements the work following any of the 3 transitions; many processes could attempt the work in parallel (so could a human).
 
1) To start the transaction, rename the id.new file to id.  If the rename fails, start over if desired/still possible.  On success, ref locks for each entry should be acquired in the listed order (to prevent deadlocks), using the transaction id and oldvalue.  It is never legal to unlock a ref in this state (because a blocked process could cause its unlock to be delayed until the commit phase).  However, it is legal for any process to transition to abort at any time from this state, perhaps because of a failure to acquire a lock (held by another transaction), and definitely if a ref has changed (is no longer oldvalue).

2) To abort the transaction, rename the id file to id.abort.  This should only ever fail if commit was achieved first.  Once in this state, any process may/should unlock any ref locks belonging to this transaction id.  Once all refs are unlocked, id.abort may be deleted (it could be deleted earlier, but then cleanup will take longer).

3) To commit the transaction, rename the file to id.commit.  This should only ever fail if abort was achieved first. This transition should never be done until every listed ref is locked by the current transaction id.  Once in this phase, all refs may/should be moved to their new values and unlocked by any process. Once all refs are unlocked, id.commit may be deleted. 

Since any process attempting any of the work in these transactions could block at any time for an indefinite amount of time, these processes may wake after the transaction is aborted or committed and the transaction files are cleaned up.  I believe that in these cases the only action which could succeed for these waking processes is the ref locking action.  All such abandoned ref locks may/should be unlocked by any process.  This last rule means that no transaction ids should ever be reused.
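
To make the state transitions above concrete, here is a rough sketch in Python of how a process might drive them (the directory layout, file naming, and helper names are illustrative assumptions, not an existing git interface; it assumes rename(2) is atomic on the filesystem holding the transaction files):

  import errno
  import os

  def _transition(txn_dir, txn_id, src_suffix, dst_suffix):
      # A state change is a single atomic rename; it fails with ENOENT if
      # some other process already moved the transaction to another state.
      src = os.path.join(txn_dir, txn_id + src_suffix)
      dst = os.path.join(txn_dir, txn_id + dst_suffix)
      try:
          os.rename(src, dst)
          return True
      except OSError as e:
          if e.errno == errno.ENOENT:
              return False   # lost the race; re-read the current state
          raise

  def prepare(txn_dir, txn_id, updates):
      # updates: iterable of (repo, ref, oldvalue, newvalue); sorted so
      # every process acquires ref locks in the same order (no deadlocks).
      # Writing id.new is not a state change; anyone may delete the file
      # after a grace period.
      with open(os.path.join(txn_dir, txn_id + ".new"), "w") as f:
          for repo, ref, old, new in sorted(updates):
              f.write("%s:%s:%s:%s\n" % (repo, ref, old, new))

  def start(txn_dir, txn_id):    # 1) id.new -> id
      return _transition(txn_dir, txn_id, ".new", "")

  def abort(txn_dir, txn_id):    # 2) id -> id.abort
      return _transition(txn_dir, txn_id, "", ".abort")

  def commit(txn_dir, txn_id):   # 3) id -> id.commit; only legal once
      return _transition(txn_dir, txn_id, "", ".commit")  # all refs are locked

Any number of processes could call abort() or commit() on the same id; exactly one of the two renames can ever win.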

-Martin


*1 We may want to adapt the simple model illustrated above to use git mechanisms such as refs to hold transaction info instead of files in a directory, and git submodule files to hold the list of refs to update.  

Employee of Qualcomm Innovation Center, Inc., which is a member of Code Aurora Forum

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
@ 2012-12-29 22:18       ` Martin Fick
  2012-12-30 17:03         ` Martin Fick
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Fick @ 2012-12-29 22:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano

Jeff King <peff@peff.net> wrote:

>On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:
>> My idea is based on using filenames to store sha1s instead of 
>> file contents.  To do this, the sha1 one of a ref would be 
>> stored in a file in a directory named after the loose ref.  I 
>> believe this would then make it possible to have lockless 
>> atomic ref updates by renaming the file.
>> 
>> To more fully illustrate the idea, imagine that any file 
>> (except for the null file) in the directory will represent the 
>> value of the ref with its name, then the following 
>> transitions can represent atomic state changes to a refs 
>> value and existence:
>
>Hmm. So basically you are relying on atomic rename() to move the value
>around within a directory, rather than using write to move it around
>within a file. Atomic rename is usually something we have on local
>filesystems (and I think we rely on it elsewhere). Though I would not
>be
>surprised if it is not atomic on all networked filesystems (though it
>is
>on NFS, at least).

Yes.  I assume this is OK because git already relies on atomic renames, doesn't it?  For example, to rename the new packed-refs file into place to unlock it?

...

>> 3) To create a ref, it must be renamed from the null file (sha 
>> 0000...) to the new value just as if it were being updated 
>> from any other value, but there is one extra condition: 
>> before renaming the null file, a full directory scan must be 
>> done to ensure that the null file is the only file in the 
>> directory (this condition exists because creating the 
>> directory and null file cannot be atomic unless the filesystem 
>> supports atomic directory renames, an expectation git does 
>> not currently make).  I am not sure how this compares to 
>> today's approach, but including the setup costs (described 
>> below), I suspect it is slower.
>
>Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
>create the correct sha1 file?

But then a process could mkdir and die leaving a stale empty dir with no reliable recovery mechanism.


Unfortunately, I think I see another flaw, though! :( I should have known that I cannot separate an important check from its state-transitioning action.  The following could happen:

 A does mkdir
 A creates null file
 A checks dir -> no other files 
 B checks dir -> no other files
 A renames null file to abcd
 C creates second null file 
 B renames second null file to defg

One way to fix this is to rely on directory renames, but I believe this is something git does not want to require of every FS?  If we did, we could change #3 to be:

3) To create a ref, it must be renamed from the null file (sha 0000...) to the new value just as if it were being updated from any other value. (No more scan)

Then, with reliable directory renames, a process could do what you suggested in a temporary directory (mkdir + create null file), then rename the temporary dir to refname.  This would prevent duplicate null files.  With a grace period, the temporary dirs could be cleaned up in case a process dies before the rename.  This is your approach with reliable recovery.


>> I don't know how this new scheme could be made to work with 
>> the current scheme, it seems like perhaps new git releases 
>> could be made to understand both the old and the new, and a 
>> config option could be used to tell it which method to write 
>> new refs with.  Since in this new scheme ref directory names 
>> would conflict with old ref filenames, this would likely 
>> prevent both schemes from erroneously being used 
>> simultaneously (so they shouldn't corrupt each other), except 
>> for the fact that refs can be nested in directories which 
>> confuses things a bit.  I am not sure what a good solution to 
>> this is?
>
>I think you would need to bump core.repositoryformatversion, and just
>never let old versions of git access the repository directly. Not the
>end of the world, but it certainly increases deployment effort. If we
>were going to do that, it would probably make sense to think about
>solving the D/F conflict issues at the same time (i.e., start calling
>"refs/heads/foo" in the filesystem "refs.d/heads.d/foo.ref" so that it
>cannot conflict with "refs.d/heads.d/foo.d/bar.ref").

Wouldn't you want to use a character that is not legal in ref names instead of a dot?  And without locks, we free up more of the ref namespace too, I think? (Refs could then end in ".lock".)

-Martin

Employee of Qualcomm Innovation Center, Inc., which is a member of Code Aurora Forum

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-29 22:18       ` Martin Fick
@ 2012-12-30 17:03         ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-30 17:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano

On Saturday, December 29, 2012 03:18:49 pm Martin Fick wrote:
> Jeff King <peff@peff.net> wrote:
> >On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick 
wrote:
> >> My idea is based on using filenames to store sha1s
> >> instead of file contents.  To do this, the sha1 one of
> >> a ref would be stored in a file in a directory named
> >> after the loose ref.  I believe this would then make
> >> it possible to have lockless atomic ref updates by
> >> renaming the file.
> >> 
> >> To more fully illustrate the idea, imagine that any
> >> file (except for the null file) in the directory will
> >> represent the value of the ref with its name, then the
> >> following transitions can represent atomic state
> >> changes to a refs
> >
> >> value and existence:
> >Hmm. So basically you are relying on atomic rename() to
> >move the value around within a directory, rather than
> >using write to move it around within a file. Atomic
> >rename is usually something we have on local filesystems
> >(and I think we rely on it elsewhere). Though I would
> >not be
> >surprised if it is not atomic on all networked
> >filesystems (though it is
> >on NFS, at least).
> 
> Yes.  I assume this is OK because doesn't git already rely
> on atomic renames?  For example to rename the new
> packed-refs file to unlock it?
> 
> ...
> 
> >> 3) To create a ref, it must be renamed from the null
> >> file (sha 0000...) to the new value just as if it were
> >> being updated from any other value, but there is one
> >> extra condition: before renaming the null file, a full
> >> directory scan must be done to ensure that the null
> >> file is the only file in the directory (this condition
> >> exists because creating the directory and null file
> >> cannot be atomic unless the filesystem supports atomic
> >> directory renames, an expectation git does not
> >> currently make).  I am not sure how this compares to
> >> today's approach, but including the setup costs
> >> (described below), I suspect it is slower.
> >
> >Hmm. mkdir is atomic. So wouldn't it be sufficient to
> >just mkdir and create the correct sha1 file?
> 
> But then a process could mkdir and die leaving a stale
> empty dir with no reliable recovery mechanism.
> 
> 
> Unfortunately, I think I see another flaw though! :( I
> should have known that I cannot separate an important
> check from its state transitioning action.  The following
> could happen:
> 
>  A does mkdir
>  A creates null file
>  A checks dir -> no other files
>  B checks dir -> no other files
>  A renames null file to abcd
>  C creates second null file
>  B renames second null file to defg
> 
> One way to fix this is to rely on directory renames, but I
> believe this is something git does not want to require of
> every FS? If we did, we could Change #3 to be:
> 
> 3) To create a ref, it must be renamed from the null file
> (sha 0000...) to the new value just as if it were being
> updated from any other value. (No more scan)
> 
> Then, with reliable directory renames, a process could do
> what you suggested to a temporary directory, mkdir +
> create null file, then rename the temporary dir to
> refname.  This would prevent duplicate null files.  With
> a grace period, the temporary dirs could be cleaned up in
> case a process dies before the rename.  This is your
> approach with reliable recovery.

The whole null file can go away if we use directory renames.  
Make #3:

3) To create a ref, create a temporary directory containing a 
file named after the sha1 of the ref to be created and rename 
the directory to the name of the ref to create.  If the 
rename fails, the create fails.  If the rename succeeds, the 
create succeeds.

With a grace period, the temporary dirs could be cleaned up 
in case a process dies before the rename.
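
As a rough illustration (assuming rename(2) is atomic on 
the filesystem, and with made-up temp-dir naming), the 
create could look something like this in Python:

  import os
  import uuid

  def create_ref(refdir, refname, sha1):
      # Build the new ref as a directory containing a single file
      # named after its sha1, then atomically rename the whole
      # directory into place.
      tmp = os.path.join(refdir, ".tmp_create_" + uuid.uuid4().hex)
      os.mkdir(tmp)
      open(os.path.join(tmp, sha1), "w").close()
      try:
          # Fails if refname already exists as a non-empty ref
          # directory (an existing ref always holds one sha1 file).
          os.rename(tmp, os.path.join(refdir, refname))
          return True        # create succeeded
      except OSError:
          # Create failed; a cleanup pass may remove stale
          # .tmp_create_* directories after a grace period.
          return False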

-Martin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
                       ` (2 preceding siblings ...)
  2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
@ 2012-12-31 10:30     ` Martin Fick
  2013-01-03 23:52       ` Martin Fick
  2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
  3 siblings, 2 replies; 23+ messages in thread
From: Martin Fick @ 2012-12-31 10:30 UTC (permalink / raw)
  To: Michael Haggerty, Shawn Pearce; +Cc: Jeff King, git, Junio C Hamano

On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> It concerns me that git uses any locking at all, even for
> refs since it has the potential to leave around stale
> locks.
> ...
> [a previous not so great attempt to fix this]
> ...

I may have finally figured out a working loose ref update 
mechanism which I think can avoid stale locks.  Unfortunately 
it requires atomic directory renames and universally unique 
identifiers (uuids).  These may be no-go criteria?  But I 
figure it is worth at least exploring this idea because of the 
potential benefits?

The general approach is to setup a transaction and either 
commit or abort it.  A transaction can be setup by renaming 
an appropriately setup directory to the "ref.lock" name.  If 
the rename succeeds, the transaction is begun.  Any actor can 
abort the transaction (up until it is committed) by simply 
deleting the "ref.lock" directory, so it is not at risk of 
going stale.  However, once the actor who sets up the 
transaction commits it, deleting the "ref.lock" directory 
simply aids in cleaning it up for the next transaction 
(instead of aborting it).

One important piece of the transaction is the use of uuids.  
The uuids provide a mechanism to tie the atomic commit pieces 
to the transactions and thus to prevent long-sleeping 
processes from inadvertently performing actions which could 
be out of date when they finally wake up.  In each case, the 
atomic commit piece is the renaming of a file.  For the 
create and update pieces, a file is renamed from the 
"ref.lock" dir to the "ref" file, resulting in an update to 
the sha for the ref.  However, in the delete case, the "ref" 
file is instead renamed to end up in the "ref.lock" 
directory, resulting in a delete of the ref.  This scheme 
does not affect the way refs are read today.

To prepare for a transaction, an actor first generates a uuid 
(an exercise I will delay for now).  Next, a tmp directory 
named after the uuid is generated in the parent directory for 
the ref to be updated, perhaps something like:  ".lock_uuid".  
In this directory is placed either a file or a directory named 
after the uuid, something like: ".lock_uuid/,uuid".  In the 
case of a create or an update, the new sha is written to this 
file.  In the case of a delete, it is a directory.  

Once the tmp directory is set up, the initiating actor 
attempts to start the transaction by renaming the tmp 
directory to "ref.lock".  If the rename fails, the update 
fails.  If the rename succeeds, the actor can then attempt to 
commit the transaction (before another actor aborts it). 

In the case of a create, the actor verifies that "ref" does 
not currently exist, and then renames the now named 
"ref.lock/uuid" file to "ref". On success, the ref was 
created.

In the case of an update, the actor verifies that "ref" 
currently contains the old sha, and then also renames the now 
named "ref.lock/uuid" file to "ref". On success, the ref was 
updated.

In the case of a delete, the actor may verify that "ref" 
currently contains the sha to "prune" if it needs to, and 
then renames the "ref" file to "ref.lock/uuid/delete". On 
success, the ref was deleted.

Whether successful or not, the actor may now simply delete 
the "ref.lock" directory, clearing the way for a new 
transaction.  Any other actor may delete this directory at 
any time also, likely either on conflict (if they are 
attempting to initiate a transaction), or after a grace 
period just to cleanup the FS.  Any actor may also safely 
cleanup the tmp directories, preferably also after a grace 
period.
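
A minimal sketch of the create/update and delete paths in 
Python (assuming atomic rename(2), including for directories; 
the helper names and error handling are illustrative only):

  import os
  import shutil
  import uuid

  def _start_txn(ref, delete, new_sha=None):
      # Prepare ".lock_<uuid>" next to the ref, containing "<uuid>"
      # (a file holding the new sha, or a directory for a delete),
      # then try to rename it to "<ref>.lock" to begin the
      # transaction.
      uid = uuid.uuid4().hex
      tmp = os.path.join(os.path.dirname(ref) or ".", ".lock_" + uid)
      os.mkdir(tmp)
      if delete:
          os.mkdir(os.path.join(tmp, uid))
      else:
          with open(os.path.join(tmp, uid), "w") as f:
              f.write(new_sha + "\n")
      lockdir = ref + ".lock"
      try:
          os.rename(tmp, lockdir)        # start: fails if locked
          return uid, lockdir
      except OSError:
          shutil.rmtree(tmp, ignore_errors=True)
          return None, None

  def _current(ref):
      try:
          with open(ref) as f:
              return f.read().strip()
      except (IOError, OSError):
          return None                    # ref does not exist

  def update_ref(ref, old_sha, new_sha): # old_sha=None means create
      uid, lockdir = _start_txn(ref, False, new_sha)
      if uid is None:
          return False
      ok = False
      if _current(ref) == old_sha:
          try:
              # commit: this rename can only succeed if our
              # transaction was not aborted (lockdir still exists)
              os.rename(os.path.join(lockdir, uid), ref)
              ok = True
          except OSError:
              pass
      shutil.rmtree(lockdir, ignore_errors=True)  # anyone may clean up
      return ok

  def delete_ref(ref, old_sha):
      uid, lockdir = _start_txn(ref, True)
      if uid is None:
          return False
      ok = False
      if _current(ref) == old_sha:
          try:
              os.rename(ref, os.path.join(lockdir, uid, "delete"))
              ok = True
          except OSError:
              pass
      shutil.rmtree(lockdir, ignore_errors=True)
      return ok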

One neat part about this scheme is that I believe it would be 
backwards compatible with the current locking mechanism since 
the transaction directory will simply appear to be a lock to 
older clients.  And the old lock file should continue to lock 
out these newer transactions.

Due to this backwards compatibility, I believe that this 
could be incrementally employed today without affecting very 
much.  It could be deployed in place of any updates which 
only hold ref.locks to update the loose ref.  So for example 
I think it could replace step 4a below from Michael 
Haggerty's description of today's loose ref pruning during 
ref packing:

> * Pack references:
...
> 4. prune_refs(): for each ref in the ref_to_prune list,
> call  prune_ref():
>
>     a. Lock the reference using lock_ref_sha1(), 
>     verifying that the recorded SHA1 is still valid.  If it
>     is, unlink the loose reference file then free the lock;
>     otherwise leave the loose reference file untouched.

I think it would also therefore be able to replace the loose 
ref locking in Michael's new ref-packing scheme as well as 
the locking in Michael's new ref deletion scheme (again steps 
4):

> * Delete reference foo:
...
>   4. Delete loose ref for "foo":
> 
>      a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> 
>      b. Unlink $GIT_DIR/refs/heads/foo if it is unchanged.
>  If it is changed, leave it untouched.  If it is deleted,
> that is OK too.
> 
>      c. Release lock $GIT_DIR/refs/heads/foo.lock

...
> * Pack references:
...
>   4. prune_refs(): for each ref in the ref_to_prune list,
> call prune_ref():
> 
>      a. Lock the loose reference using lock_ref_sha1(),
> verifying that the recorded SHA1 is still valid
> 
>      b. If it is, unlink the loose reference file
> (otherwise, leave it untouched)
> 
>      c. Release the lock on the loose reference

To be honest, I suspect I missed something obvious because 
this seems almost too simple to work.  I am ashamed that it 
took me so long to come up with it (of course, I will be even 
more ashamed :( when it is shown to be flawed!)  This scheme 
also feels extensible.  If there are no obvious flaws in it, 
I will try to post solutions for ref packing and for multiple 
repository/ref transactions soon.

I welcome any comments/criticisms,

-Martin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-31 10:30     ` Martin Fick
@ 2013-01-03 23:52       ` Martin Fick
  2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
  2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
  2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Martin Fick @ 2013-01-03 23:52 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Shawn Pearce, Jeff King, git, Junio C Hamano

Any thoughts on this idea?  Is it flawed?  I am trying to 
write it up in a more formal generalized manner and was 
hoping to get at least one "it seems sane" before I do.

Thanks,

-Martin

On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> On Thursday, December 27, 2012 04:11:51 pm Martin Fick 
wrote:
> > It concerns me that git uses any locking at all, even
> > for refs since it has the potential to leave around
> > stale locks.
> > ...
> > [a previous not so great attempt to fix this]
> > ...
> 
> I may have finally figured out a working loose ref update
> mechanism which I think can avoid stale locks. 
> Unfortunately it requires atomic directory renames and
> universally unique identifiers (uuids).  These may be
> no-go criteria?  But I figure it is worth at least
> exploring this idea because of the potential benefits?
> 
> The general approach is to setup a transaction and either
> commit or abort it.  A transaction can be setup by
> renaming an appropriately setup directory to the
> "ref.lock" name.  If the rename succeeds, the transaction
> is begun.  Any actor can abort the transaction (up until
> it is committed) by simply deleting the "ref.lock"
> directory, so it is not at risk of going stale.  However,
> once the actor who sets up the transaction commits it,
> deleting the "ref.lock" directory simply aids in cleaning
> it up for the next transaction (instead of aborting it).
> 
> One important piece of the transaction is the use of
> uuids. The uuids provide a mechanism to tie the atomic
> commit pieces to the transactions and thus to prevent
> long sleeping process from inadvertently performing
> actions which could be out of date when they wake finally
> up.  In each case, the atomic commit piece is the
> renaming of a file.   For the create and update pieces, a
> file is renamed from the "ref.lock" dir to the "ref" file
> resulting in an update to the sha for the ref. However,
> in the delete case, the "ref" file is instead renamed to
> end up in the "ref.lock" directory resulting in a delete
> of the ref.  This scheme does not affect the way refs are
> read today,
> 
> To prepare for a transaction, an actor first generates a
> uuid (an exercise I will delay for now).  Next, a tmp
> directory named after the uuid is generated in the parent
> directory for the ref to be updated, perhaps something
> like:  ".lock_uuid". In this directory is places either a
> file or a directory named after the uuid, something like:
> ".lock_uuid/,uuid".  In the case of a create or an
> update, the new sha is written to this file.  In the case
> of a delete, it is a directory.
> 
> Once the tmp directory is setup, the initiating actor
> attempts to start the transaction by renaming the tmp
> directory to "ref.lock".  If the rename fails, the update
> fails. If the rename succeeds, the actor can then attempt
> to commit the transaction (before another actor aborts
> it).
> 
> In the case of a create, the actor verifies that "ref"
> does not currently exist, and then renames the now named
> "ref.lock/uuid" file to "ref". On success, the ref was
> created.
> 
> In the case of an update, the actor verifies that "ref"
> currently contains the old sha, and then also renames the
> now named "ref.lock/uuid" file to "ref". On success, the
> ref was updated.
> 
> In the case of a delete, the actor may verify that "ref"
> currently contains the sha to "prune" if it needs to, and
> then renames the "ref" file to "ref.lock/uuid/delete". On
> success, the ref was deleted.
> 
> Whether successful or not, the actor may now simply delete
> the "ref.lock" directory, clearing the way for a new
> transaction.  Any other actor may delete this directory at
> any time also, likely either on conflict (if they are
> attempting to initiate a transaction), or after a grace
> period just to cleanup the FS.  Any actor may also safely
> cleanup the tmp directories, preferably also after a grace
> period.
> 
> One neat part about this scheme is that I believe it would
> be backwards compatible with the current locking
> mechanism since the transaction directory will simply
> appear to be a lock to older clients.  And the old lock
> file should continue to lock out these newer
> transactions.
> 
> Due to this backwards compatibility, I believe that this
> could be incrementally employed today without affecting
> very much.  It could be deployed in place of any updates
> which only hold ref.locks to update the loose ref.  So
> for example I think it could replace step 4a below from
> Michael Haggerty's description of today's loose ref
> pruning during
> 
> ref packing:
> > * Pack references:
> ...
> 
> > 4. prune_refs(): for each ref in the ref_to_prune list,
> > 
> > call  prune_ref():
> >     a. Lock the reference using lock_ref_sha1(),
> >     verifying that the recorded SHA1 is still valid.  If
> >     it is, unlink the loose reference file then free
> >     the lock; otherwise leave the loose reference file
> >     untouched.
> 
> I think it would also therefore be able to replace the
> loose ref locking in Michael's new ref-packing scheme as
> well as the locking in Michael's new ref deletion scheme
> (again steps
> 
> 4):
> > * Delete reference foo:
> ...
> 
> >   4. Delete loose ref for "foo":
> >      a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> >      
> >      b. Unlink $GIT_DIR/refs/heads/foo if it is
> >      unchanged.
> >  
> >  If it is changed, leave it untouched.  If it is
> >  deleted,
> > 
> > that is OK too.
> > 
> >      c. Release lock $GIT_DIR/refs/heads/foo.lock
> 
> ...
> 
> > * Pack references:
> ...
> 
> >   4. prune_refs(): for each ref in the ref_to_prune
> >   list,
> > 
> > call prune_ref():
> >      a. Lock the loose reference using lock_ref_sha1(),
> > 
> > verifying that the recorded SHA1 is still valid
> > 
> >      b. If it is, unlink the loose reference file
> > 
> > (otherwise, leave it untouched)
> > 
> >      c. Release the lock on the loose reference
> 
> To be honest, I suspect I missed something obvious because
> this seems almost too simple to work.  I am ashamed that
> it took me so long to come up with (of course, I will be
> even more ashamed :( when it is shown to be flawed!) 
> This scheme also feels extensible. if there are no
> obvious flaws in it, I will try to post solutions for ref
> packing and for multiple repository/ref transactions also
> soon.
> 
> I welcome any comments/criticisms,
> 
> -Martin
> --
> To unsubscribe from this list: send the line "unsubscribe
> git" in the body of a message to
> majordomo@vger.kernel.org More majordomo info at 
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2013-01-03 23:52       ` Martin Fick
@ 2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
  2013-01-04 18:01           ` Martin Fick
  2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Pyeron, Jason J CTR (US) @ 2013-01-04 17:52 UTC (permalink / raw)
  To: git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 7382 bytes --]

> From: Martin Fick
> Sent: Thursday, January 03, 2013 6:53 PM
> 
> Any thoughts on this idea?  Is it flawed?  I am trying to
> write it up in a more formal generalized manner and was
> hoping to get at least one "it seems sane" before I do.

If you are assuming that atomic renames, etc. are available, then you should identify a test case and a degraded operation path for when they are not available.

> 
> Thanks,
> 
> -Martin
> 
> On Monday, December 31, 2012 03:30:53 am Martin Fick wrote:
> > On Thursday, December 27, 2012 04:11:51 pm Martin Fick
> wrote:
> > > It concerns me that git uses any locking at all, even
> > > for refs since it has the potential to leave around
> > > stale locks.
> > > ...
> > > [a previous not so great attempt to fix this]
> > > ...
> >
> > I may have finally figured out a working loose ref update
> > mechanism which I think can avoid stale locks.
> > Unfortunately it requires atomic directory renames and
> > universally unique identifiers (uuids).  These may be
> > no-go criteria?  But I figure it is worth at least
> > exploring this idea because of the potential benefits?
> >
> > The general approach is to setup a transaction and either
> > commit or abort it.  A transaction can be setup by
> > renaming an appropriately setup directory to the
> > "ref.lock" name.  If the rename succeeds, the transaction
> > is begun.  Any actor can abort the transaction (up until
> > it is committed) by simply deleting the "ref.lock"
> > directory, so it is not at risk of going stale.  However,
> > once the actor who sets up the transaction commits it,
> > deleting the "ref.lock" directory simply aids in cleaning
> > it up for the next transaction (instead of aborting it).
> >
> > One important piece of the transaction is the use of
> > uuids. The uuids provide a mechanism to tie the atomic
> > commit pieces to the transactions and thus to prevent
> > long sleeping process from inadvertently performing
> > actions which could be out of date when they wake finally
> > up.  In each case, the atomic commit piece is the
> > renaming of a file.   For the create and update pieces, a
> > file is renamed from the "ref.lock" dir to the "ref" file
> > resulting in an update to the sha for the ref. However,
> > in the delete case, the "ref" file is instead renamed to
> > end up in the "ref.lock" directory resulting in a delete
> > of the ref.  This scheme does not affect the way refs are
> > read today,
> >
> > To prepare for a transaction, an actor first generates a
> > uuid (an exercise I will delay for now).  Next, a tmp
> > directory named after the uuid is generated in the parent
> > directory for the ref to be updated, perhaps something
> > like:  ".lock_uuid". In this directory is places either a
> > file or a directory named after the uuid, something like:
> > ".lock_uuid/,uuid".  In the case of a create or an
> > update, the new sha is written to this file.  In the case
> > of a delete, it is a directory.
> >
> > Once the tmp directory is setup, the initiating actor
> > attempts to start the transaction by renaming the tmp
> > directory to "ref.lock".  If the rename fails, the update
> > fails. If the rename succeeds, the actor can then attempt
> > to commit the transaction (before another actor aborts
> > it).
> >
> > In the case of a create, the actor verifies that "ref"
> > does not currently exist, and then renames the now named
> > "ref.lock/uuid" file to "ref". On success, the ref was
> > created.
> >
> > In the case of an update, the actor verifies that "ref"
> > currently contains the old sha, and then also renames the
> > now named "ref.lock/uuid" file to "ref". On success, the
> > ref was updated.
> >
> > In the case of a delete, the actor may verify that "ref"
> > currently contains the sha to "prune" if it needs to, and
> > then renames the "ref" file to "ref.lock/uuid/delete". On
> > success, the ref was deleted.
> >
> > Whether successful or not, the actor may now simply delete
> > the "ref.lock" directory, clearing the way for a new
> > transaction.  Any other actor may delete this directory at
> > any time also, likely either on conflict (if they are
> > attempting to initiate a transaction), or after a grace
> > period just to cleanup the FS.  Any actor may also safely
> > cleanup the tmp directories, preferably also after a grace
> > period.
> >
> > One neat part about this scheme is that I believe it would
> > be backwards compatible with the current locking
> > mechanism since the transaction directory will simply
> > appear to be a lock to older clients.  And the old lock
> > file should continue to lock out these newer
> > transactions.
> >
> > Due to this backwards compatibility, I believe that this
> > could be incrementally employed today without affecting
> > very much.  It could be deployed in place of any updates
> > which only hold ref.locks to update the loose ref.  So
> > for example I think it could replace step 4a below from
> > Michael Haggerty's description of today's loose ref
> > pruning during
> >
> > ref packing:
> > > * Pack references:
> > ...
> >
> > > 4. prune_refs(): for each ref in the ref_to_prune list,
> > >
> > > call  prune_ref():
> > >     a. Lock the reference using lock_ref_sha1(),
> > >     verifying that the recorded SHA1 is still valid.  If
> > >     it is, unlink the loose reference file then free
> > >     the lock; otherwise leave the loose reference file
> > >     untouched.
> >
> > I think it would also therefore be able to replace the
> > loose ref locking in Michael's new ref-packing scheme as
> > well as the locking in Michael's new ref deletion scheme
> > (again steps
> >
> > 4):
> > > * Delete reference foo:
> > ...
> >
> > >   4. Delete loose ref for "foo":
> > >      a. Acquire the lock $GIT_DIR/refs/heads/foo.lock
> > >
> > >      b. Unlink $GIT_DIR/refs/heads/foo if it is
> > >      unchanged.
> > >
> > >  If it is changed, leave it untouched.  If it is
> > >  deleted,
> > >
> > > that is OK too.
> > >
> > >      c. Release lock $GIT_DIR/refs/heads/foo.lock
> >
> > ...
> >
> > > * Pack references:
> > ...
> >
> > >   4. prune_refs(): for each ref in the ref_to_prune
> > >   list,
> > >
> > > call prune_ref():
> > >      a. Lock the loose reference using lock_ref_sha1(),
> > >
> > > verifying that the recorded SHA1 is still valid
> > >
> > >      b. If it is, unlink the loose reference file
> > >
> > > (otherwise, leave it untouched)
> > >
> > >      c. Release the lock on the loose reference
> >
> > To be honest, I suspect I missed something obvious because
> > this seems almost too simple to work.  I am ashamed that
> > it took me so long to come up with (of course, I will be
> > even more ashamed :( when it is shown to be flawed!)
> > This scheme also feels extensible. if there are no
> > obvious flaws in it, I will try to post solutions for ref
> > packing and for multiple repository/ref transactions also
> > soon.
> >
> > I welcome any comments/criticisms,
> >
> > -Martin
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > git" in the body of a message to
> > majordomo@vger.kernel.org More majordomo info at
> > http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5615 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
@ 2013-01-04 18:01           ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2013-01-04 18:01 UTC (permalink / raw)
  To: Pyeron, Jason J CTR (US); +Cc: git@vger.kernel.org

On Friday, January 04, 2013 10:52:43 am Pyeron, Jason J 
CTR (US) wrote:
> > From: Martin Fick
> > Sent: Thursday, January 03, 2013 6:53 PM
> > 
> > Any thoughts on this idea?  Is it flawed?  I am 
trying
> > to write it up in a more formal generalized manner 
and
> > was hoping to get at least one "it seems sane" 
before
> > I do.
> 
> If you are assuming that atomic renames, etc. are
> available, then you should identify a test case and a
> degrade operation path when it is not available.

Thanks, sounds reasonable.  Were you thinking of a 
runtime test case that would be run before every 
transaction?  I was anticipating a per-repo config 
option called something like "core.locks = recoverable" 
that would be needed to turn them on?  I was thinking 
that this was something that server sites could test in 
advance on their repos and then enable for them.  Maybe 
a git-lock tool with a --test-recoverable option?
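
As a very rough sketch of what such a per-repo test 
might probe (purely illustrative; it cannot prove that 
rename() is atomic, only reject filesystems where 
directory renames clearly do not behave as required):

  import errno
  import os
  import shutil
  import tempfile

  def test_recoverable_locks(gitdir):
      # Probe inside the repository so we test the right
      # filesystem: renaming a directory onto an existing,
      # non-empty directory must fail rather than merge or
      # copy its contents.
      base = tempfile.mkdtemp(prefix=".lock_probe_", dir=gitdir)
      try:
          src = os.path.join(base, "src")
          dst = os.path.join(base, "dst")
          os.mkdir(src)
          os.mkdir(dst)
          open(os.path.join(dst, "x"), "w").close()
          try:
              os.rename(src, dst)
              return False       # merged/replaced: not usable
          except OSError as e:
              return e.errno in (errno.ENOTEMPTY, errno.EEXIST)
      finally:
          shutil.rmtree(base, ignore_errors=True)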

-Martin


> > 
> > On Monday, December 31, 2012 03:30:53 am Martin Fick 
wrote:
> > > On Thursday, December 27, 2012 04:11:51 pm Martin
> > > Fick
> > 
> > wrote:
> > > > It concerns me that git uses any locking at all,
> > > > even for refs since it has the potential to 
leave
> > > > around stale locks.
> > > > ...
> > > > [a previous not so great attempt to fix this]
> > > > ...
> > > 
> > > I may have finally figured out a working loose ref
> > > update mechanism which I think can avoid stale
> > > locks. Unfortunately it requires atomic directory
> > > renames and universally unique identifiers 
(uuids). 
> > > These may be no-go criteria?  But I figure it is
> > > worth at least exploring this idea because of the
> > > potential benefits?
> > > 
> > > The general approach is to setup a transaction and
> > > either commit or abort it.  A transaction can be
> > > setup by renaming an appropriately setup directory
> > > to the "ref.lock" name.  If the rename succeeds, 
the
> > > transaction is begun.  Any actor can abort the
> > > transaction (up until it is committed) by simply
> > > deleting the "ref.lock" directory, so it is not at
> > > risk of going stale.  However, once the actor who
> > > sets up the transaction commits it, deleting the
> > > "ref.lock" directory simply aids in cleaning it up
> > > for the next transaction (instead of aborting it).
> > > 
> > > One important piece of the transaction is the use 
of
> > > uuids. The uuids provide a mechanism to tie the
> > > atomic commit pieces to the transactions and thus 
to
> > > prevent long sleeping process from inadvertently
> > > performing actions which could be out of date when
> > > they wake finally up.  In each case, the atomic
> > > commit piece is the renaming of a file.   For the
> > > create and update pieces, a file is renamed from 
the
> > > "ref.lock" dir to the "ref" file resulting in an
> > > update to the sha for the ref. However, in the
> > > delete case, the "ref" file is instead renamed to
> > > end up in the "ref.lock" directory resulting in a
> > > delete of the ref.  This scheme does not affect 
the
> > > way refs are read today,
> > > 
> > > To prepare for a transaction, an actor first
> > > generates a uuid (an exercise I will delay for 
now).
> > >  Next, a tmp directory named after the uuid is
> > > generated in the parent directory for the ref to 
be
> > > updated, perhaps something like:  ".lock_uuid". In
> > > this directory is places either a file or a
> > > directory named after the uuid, something like:
> > > ".lock_uuid/,uuid".  In the case of a create or an
> > > update, the new sha is written to this file.  In 
the
> > > case of a delete, it is a directory.
> > > 
> > > Once the tmp directory is setup, the initiating 
actor
> > > attempts to start the transaction by renaming the 
tmp
> > > directory to "ref.lock".  If the rename fails, the
> > > update fails. If the rename succeeds, the actor 
can
> > > then attempt to commit the transaction (before
> > > another actor aborts it).
> > > 
> > > In the case of a create, the actor verifies that
> > > "ref" does not currently exist, and then renames 
the
> > > now named "ref.lock/uuid" file to "ref". On 
success,
> > > the ref was created.
> > > 
> > > In the case of an update, the actor verifies that
> > > "ref" currently contains the old sha, and then 
also
> > > renames the now named "ref.lock/uuid" file to 
"ref".
> > > On success, the ref was updated.
> > > 
> > > In the case of a delete, the actor may verify that
> > > "ref" currently contains the sha to "prune" if it
> > > needs to, and then renames the "ref" file to
> > > "ref.lock/uuid/delete". On success, the ref was
> > > deleted.
> > > 
> > > Whether successful or not, the actor may now 
simply
> > > delete the "ref.lock" directory, clearing the way
> > > for a new transaction.  Any other actor may delete
> > > this directory at any time also, likely either on
> > > conflict (if they are attempting to initiate a
> > > transaction), or after a grace period just to
> > > cleanup the FS.  Any actor may also safely cleanup
> > > the tmp directories, preferably also after a grace
> > > period.
> > > 
> > > One neat part about this scheme is that I believe 
it
> > > would be backwards compatible with the current
> > > locking mechanism since the transaction directory
> > > will simply appear to be a lock to older clients. 
> > > And the old lock file should continue to lock out
> > > these newer transactions.
> > > 
> > > Due to this backwards compatibility, I believe 
that
> > > this could be incrementally employed today without
> > > affecting very much.  It could be deployed in 
place
> > > of any updates which only hold ref.locks to update
> > > the loose ref.  So for example I think it could
> > > replace step 4a below from Michael Haggerty's
> > > description of today's loose ref pruning during
> > > 
> > > ref packing:
> > > > * Pack references:
> > > ...
> > > 
> > > > 4. prune_refs(): for each ref in the 
ref_to_prune
> > > > list,
> > > > 
> > > > call  prune_ref():
> > > >     a. Lock the reference using lock_ref_sha1(),
> > > >     verifying that the recorded SHA1 is still
> > > >     valid.  If it is, unlink the loose reference
> > > >     file then free the lock; otherwise leave the
> > > >     loose reference file untouched.
> > > 
> > > I think it would also therefore be able to replace
> > > the loose ref locking in Michael's new ref-packing
> > > scheme as well as the locking in Michael's new ref
> > > deletion scheme (again steps
> > > 
> > > 4):
> > > > * Delete reference foo:
> > > ...
> > > 
> > > >   4. Delete loose ref for "foo":
> > > >      a. Acquire the lock
> > > >      $GIT_DIR/refs/heads/foo.lock
> > > >      
> > > >      b. Unlink $GIT_DIR/refs/heads/foo if it is
> > > >      unchanged.
> > > >  
> > > >  If it is changed, leave it untouched.  If it is
> > > >  deleted,
> > > > 
> > > > that is OK too.
> > > > 
> > > >      c. Release lock 
$GIT_DIR/refs/heads/foo.lock
> > > 
> > > ...
> > > 
> > > > * Pack references:
> > > ...
> > > 
> > > >   4. prune_refs(): for each ref in the 
ref_to_prune
> > > >   list,
> > > > 
> > > > call prune_ref():
> > > >      a. Lock the loose reference using
> > > >      lock_ref_sha1(),
> > > > 
> > > > verifying that the recorded SHA1 is still valid
> > > > 
> > > >      b. If it is, unlink the loose reference 
file
> > > > 
> > > > (otherwise, leave it untouched)
> > > > 
> > > >      c. Release the lock on the loose reference
> > > 
> > > To be honest, I suspect I missed something obvious
> > > because this seems almost too simple to work.  I 
am
> > > ashamed that it took me so long to come up with 
(of
> > > course, I will be even more ashamed :( when it is
> > > shown to be flawed!) This scheme also feels
> > > extensible. if there are no obvious flaws in it, I
> > > will try to post solutions for ref packing and for
> > > multiple repository/ref transactions also soon.
> > > 
> > > I welcome any comments/criticisms,
> > > 
> > > -Martin
> > > --
> > > To unsubscribe from this list: send the line
> > > "unsubscribe git" in the body of a message to
> > > majordomo@vger.kernel.org More majordomo info at
> > > http://vger.kernel.org/majordomo-info.html
> > 
> > --
> > To unsubscribe from this list: send the line
> > "unsubscribe git" in the body of a message to
> > majordomo@vger.kernel.org More majordomo info at 
> > http://vger.kernel.org/majordomo-info.html

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?
  2013-01-03 23:52       ` Martin Fick
  2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
@ 2013-01-04 21:28         ` Junio C Hamano
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-01-04 21:28 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, Shawn Pearce, Jeff King, git

Martin Fick <mfick@codeaurora.org> writes:

> Any thoughts on this idea?  Is it flawed?  I am trying to 
> write it up in a more formal generalized manner and was 
> hoping to get at least one "it seems sane" before I do.

The general impression I have been getting was that this isn't even
worth the effort and the resulting complexity of the code, given
Peff's observations earlier in the thread that ref update conflicts
and leftover locks are reasonably rare in practice.  But perhaps I
have been mis-reading the discussion.

I also have this suspicion that if you really want to shoot for
multi-repository transactions in a massively scaled repository
hosting environment, you would rather not rely on hacks based on
filesystem semantics, but instead RPC with a dedicated "ref
management service" that knows the transaction semantics you want;
but that could become a much larger change.

I dunno.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2012-12-31 10:30     ` Martin Fick
  2013-01-03 23:52       ` Martin Fick
@ 2013-01-05 16:12       ` Jeff King
  2013-01-22  4:31         ` Drew Northup
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2013-01-05 16:12 UTC (permalink / raw)
  To: Martin Fick; +Cc: Michael Haggerty, Shawn Pearce, git, Junio C Hamano

On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:

> The general approach is to setup a transaction and either 
> commit or abort it.  A transaction can be setup by renaming 
> an appropriately setup directory to the "ref.lock" name.  If 
> the rename succeeds, the transaction is begun.  Any actor can 
> abort the transaction (up until it is committed) by simply 
> deleting the "ref.lock" directory, so it is not at risk of 
> going stale.

Deleting a directory is not atomic, as you first have to remove the
contents, putting it into a potentially inconsistent state. I'll assume
you deal with that later...

> One important piece of the transaction is the use of uuids.  
> The uuids provide a mechanism to tie the atomic commit pieces 
> to the transactions and thus to prevent long sleeping process 
> from inadvertently performing actions which could be out of 
> date when they wake finally up.

Has this been a problem for you in practice? Avoiding this is one of the
reasons that git does not take out long locks; instead, it takes the
lock only at the moment it is ready to write, and aborts if it has been
updated since the longer-term operation began. This has its own problems
(you might do a lot of work only to have your operation aborted), but I
am not sure that your proposal improves on that.

Your proposal does sound like it could potentially improve robustness
when killing stale transactions (i.e., you know that the transaction
originator will never wake up and think it still has the lock). But
again, is that a problem in practice? Git typically holds ref locks for
a few syscalls. If you are conservative about leaving potentially stale
locks in place (e.g., give them a few minutes to complete before
assuming they are now bogus), you will not run into that problem.

The more conservative you are about treating a lock as stale, of course,
the less performant you will be in the face of stale locks. But since
they're the exception, that isn't a big problem in practice (at least it
has not been for me).

> In each case, the atomic 
> commit piece is the renaming of a file.   For the create and 
> update pieces, a file is renamed from the "ref.lock" dir to 
> the "ref" file resulting in an update to the sha for the ref.

I think we've had problems with cross-directory renames on some
filesystems, but I don't recall the details. I know that Coda does not
like cross-directory links, but cross-directory renames are OK (and in
fact we fall back to the latter when the former does not work).

Ah, here we go: 5723fe7 (Avoid cross-directory renames and linking on
object creation, 2008-06-14). Looks like NFS is the culprit.

> In the case of a delete, the actor may verify that "ref" 
> currently contains the sha to "prune" if it needs to, and 
> then renames the "ref" file to "ref.lock/uuid/delete". On 
> success, the ref was deleted.
> 
> Whether successful or not, the actor may now simply delete 
> the "ref.lock" directory, clearing the way for a new 
> transaction.  Any other actor may delete this directory at 
> any time also, likely either on conflict (if they are 
> attempting to initiate a transaction), or after a grace 
> period just to cleanup the FS.  Any actor may also safely 
> cleanup the tmp directories, preferably also after a grace 
> period.

Hmm. So what happens to the "delete" file when the ref.lock directory is
being deleted? Presumably deleting the ref.lock directory means doing it
recursively (which is non-atomic). But then why are we keeping the
delete file at all, if we're just about to remove it?

What happens if another process wants to cancel a transaction that is
partially done? That is, the ref.lock directory is created, but it
contains the uuid subdir? It sounds like it's OK to just delete from it,
and the original process will then fail at its rename?

> One neat part about this scheme is that I believe it would be 
> backwards compatible with the current locking mechanism since 
> the transaction directory will simply appear to be a lock to 
> older clients.  And the old lock file should continue to lock 
> out these newer transactions.

Yeah, I don't see anything that would prevent that. The current code
only cares about open("$ref.lock", O_EXCL). But that should be correct
and atomic with respect to the renamed directories. You are depending on
atomic directory renames. Those aren't used anywhere yet in git, as far
as I know. So that may run into problems (but there's no reason this
system couldn't be optional for filesystems that are more capable, and
other systems could fall back to the straight-file locking).
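
(For illustration only, a tiny sketch of that interaction: the old-style
lock is an exclusive create of "$ref.lock", which fails once a transaction
directory of the same name exists, while the new-style directory rename
fails as long as an old-style lock file exists.)

  import errno
  import os

  def old_style_lock(ref):
      # Roughly what taking the existing lock amounts to: exclusive
      # creation of "$ref.lock".  O_CREAT|O_EXCL fails with EEXIST if
      # anything -- a lock file or a transaction directory -- already
      # exists under that name.
      try:
          fd = os.open(ref + ".lock",
                       os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
          os.close(fd)
          return True
      except OSError as e:
          if e.errno == errno.EEXIST:
              return False
          raise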


So in response to your question, no, I don't see any real showstoppers
here. And unlike the "all refs are files in a directory" scheme, it's
confined to writing, which solves the readdir() atomicity questions I
had there.

I still can't say I'm super excited about it, just because it seems like
a solution in search of a problem that I have not experienced myself.
But if it's solving a problem for you, I don't want to discourage you
from pursuing it.

> To be honest, I suspect I missed something obvious because 
> this seems almost too simple to work.  I am ashamed that it 
> took me so long to come up with (of course, I will be even 
> more ashamed :( when it is shown to be flawed!)  This scheme 
> also feels extensible. if there are no obvious flaws in it, I 
> will try to post solutions for ref packing and for multiple 
> repository/ref transactions also soon.

Fixing the minor remaining races on ref packing would be nice, but I do
not think those are a problem of the on-disk lock representation. The
reason they are not fixed now is from an attempt to keep lock contention
low between unrelated updates (something which we should probably give
up on in favor of correctness; but that is orthogonal to whether we do
it with the existing file locks or with a new system).

A much stronger fix for that would be to record deletes in the loose ref
storage so that we don't have to update the packed-refs file as often
(because it is inherently a lock bottleneck, and because it is wasteful
when you have a very large number of refs). But that means dealing with
directory/file conflicts between deleted and existing branches, which is
a pain.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] refs: do not use cached refs in repack_without_ref
       [not found]     ` <201301071109.12086.mfick@codeaurora.org>
@ 2013-01-07 18:14       ` Martin Fick
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Fick @ 2013-01-07 18:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano

...[Sorry about the previous HTML reposts]

Jeff King <peff@peff.net> wrote:
>On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick 
wrote:
>
>> The general approach is to setup a transaction and
>> either commit or abort it. A transaction can be setup
>> by renaming an appropriately setup directory to the
>> "ref.lock" name. If the rename succeeds, the
>> transaction is begun. Any actor can abort the
>> transaction (up until it is committed) by simply
>> deleting the "ref.lock" directory, so it is not at
>> risk of going stale.
>
> Deleting a directory is not atomic, as you first have
> to remove the contents, putting it into a potentially
> inconsistent state. I'll assume you deal with that
> later...

Right, these simple single-file transactions have at 
most one important file/directory in them; once that is 
deleted, the transaction is aborted (it can no longer 
complete).  However, to support multi-file transactions, 
a better approach is likely to rename the uuid directory 
to have a .delete extension before deleting stuff in it.
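
Something like this, where the ".delete" suffix and the 
helper name are made up:

  import os
  import shutil

  def abort_lock_dir(lockdir):
      # Rename first, so a sleeping owner cannot commit a rename
      # into a directory that is in the middle of being torn down.
      doomed = lockdir + ".delete"
      try:
          os.rename(lockdir, doomed)
      except OSError:
          return False        # already aborted or committed
      shutil.rmtree(doomed, ignore_errors=True)
      return True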


> > One important piece of the transaction is the use
> > of uuids. The uuids provide a mechanism to tie the
> > atomic commit pieces to the transactions and thus to
> > prevent long sleeping process from inadvertently
> > performing actions which could be out of date when
> > they wake finally up.
> >
>Has this been a problem for you in practice?

No, but as you say, we don't currently hold locks for 
very long.  I anticipate it being a problem in a 
clustered environment when transactions start spanning 
repos from Java processes with insane amounts of RAM, 
which can sometimes have unpredictably long Java GC 
cycles at inopportune times.  It would seem 
short-sighted if Gerrit at least did not assume this 
will be a problem.

But deletes today in git are not so short, and Michael's 
fixes may make things worse?  Though, as you point out, 
that should perhaps be solved a different way.


> Avoiding this is one of the reasons that git does not
> take out long locks; instead, it takes the lock only
> at the moment it is ready to write, and aborts if it
> has been updated since the longer-term operation
> began. This has its own problems (you might do a lot
> of work only to have your operation aborted), but I
> am not sure that your proposal improves on that.

It does not, it might increase this.


> Git typically holds ref locks for a few syscalls. If
> you are conservative about leaving potentially stale
> locks in place (e.g., give them a few minutes to
> complete before assuming they are now bogus), you will
> not run into that problem.

In a distributed environment even a few minutes might 
not be enough; processes could be on a remote server 
behind a temporarily split network, which could cause 
delays longer than your typical local expectations.

But there is also the other piece of this problem: how 
do you detect stale locks?  How long will a lock sit 
stale until a user figures it out and reports it?  How 
many other users will simply have failed pushes and 
wonder why, without reporting them?


> > In each case, the atomic commit piece is the
> > renaming of a file. For the create and update
> > pieces, a file is renamed from the "ref.lock" 
> > dir to the "ref" file resulting in an update to 
> > the sha for the ref.
>
> I think we've had problems with cross-directory
> renames on some filesystems, but I don't recall the
> details. I know that Coda does not like 
> cross-directory links, but cross-directory renames 
> are OK (and in fact we fall back to the latter when
> the former does not work).
>
> Ah, here we go: 5723fe7 (Avoid cross-directory renames
> and linking on object creation, 2008-06-14). Looks
> like NFS is the culprit.

If the renames fail, we can fall back to regular file 
locking; the hard part to detect and deal with would be 
if the renames don't fail but silently become 
copies/mkdirs.


>> In the case of a delete, the actor may verify that
>> "ref" currently contains the sha to "prune" if it
>> needs to, and then renames the "ref" file to
>> "ref.lock/uuid/delete". On success, the ref was
>> deleted.
>>
>> Whether successful or not, the actor may now simply
>> delete the "ref.lock" directory, clearing the way for
>> a new transaction. Any other actor may delete this
>> directory at any time also, likely either on conflict
>> (if they are attempting to initiate a transaction),
>> or after a grace period just to cleanup the FS. Any
>> actor may also safely cleanup the tmp directories,
>> preferably also after a grace period.
>
> Hmm. So what happens to the "delete" file when the
> ref.lock directory is being deleted? Presumably
> deleting the ref.lock directory means doing it
> recursively (which is non-atomic). But then why are 
> we keeping the delete file at all, if we're just about
> to remove it?

We are not trying to keep it, but we need to ensure that 
our transaction has not yet been aborted: the rename 
does this.  If we just deleted the file, we might sleep, 
and another transaction could abort ours and complete 
before we wake up and actually delete the file.  But by 
using a rename we tie the delete atomically to the 
transaction: it cannot succeed if our transaction was 
aborted while we slept, since the directory we are 
renaming the file into would be gone!  This is sort of 
the magic piece that makes the whole scheme work; safe 
deletes tend to be the hardest part.
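
To make that concrete, here is roughly what the delete 
path would look like in shell.  This is only a sketch of 
the proposal as described above; the tmp/ layout, 
$expect_sha, and the use of uuidgen and GNU "mv -T" 
(which asks for a plain rename(2)) are assumptions for 
illustration, not anything implemented:

  ref=refs/heads/topic &&
  uuid=$(uuidgen) &&

  # set up the transaction directory off to the side, then
  # try to begin the transaction; "mv -T" fails instead of
  # moving into "$ref.lock" if a lock/transaction exists
  mkdir -p "tmp/$uuid/$uuid" &&
  mv -T "tmp/$uuid" "$ref.lock" || exit 1

  # optionally verify the ref still holds the sha we
  # intend to prune
  test "$(cat "$ref")" = "$expect_sha" ||
	{ rm -rf "$ref.lock"; exit 1; }

  # commit point: this rename can only succeed while
  # "$ref.lock/$uuid" still exists, i.e. while nobody has
  # aborted our transaction
  mv "$ref" "$ref.lock/$uuid/delete" ||
	{ rm -rf "$ref.lock"; exit 1; }

  # cleanup; any actor may do this at any time
  rm -rf "$ref.lock"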


> What happens if another process wants to cancel a
> transaction that is partially done? That is, the
> ref.lock directory is created, but it contains the
> uuid subdir? It sounds like it's OK to just delete
> from it, and the original process will then fail at
> its rename?

Yes, exactly.  But again, it might be better to first 
rename the uuid dir aside and only then delete it.
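
Something along these lines, again just a sketch with 
invented names, where the rename is the abort and the 
recursive delete is mere cleanup:

  lock=refs/heads/topic.lock &&
  other=$(ls "$lock" | head -n 1) &&

  # this rename is the actual abort: once it succeeds, the
  # owner's commit rename into "$lock/$other/..." can no
  # longer succeed
  mv -T "$lock/$other" "$lock/$other.aborted" &&

  # from here on it is non-atomic cleanup anyone may redo
  rm -rf "$lock"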


>> One neat part about this scheme is that I believe it
>> would be backwards compatible with the current
>> locking mechanism since the transaction directory
>> will simply appear to be a lock to older clients. 
>> And the old lock file should continue to lock
>> out these newer transactions.
>
> Yeah, I don't see anything that would prevent that. 
> The current code only cares about open("$ref.lock",
> O_EXCL). But that should be correct and atomic with
> respect to the renamed directories. You are depending
> on atomic directory renames. Those aren't used
> anywhere yet in git, as far as I know. So that may run
> into problems (but there's no reason this system
> couldn't be optional for filesystems that are more
> abled, and other systems could fall back to the
> straight-file locking).

Right.
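
For example, here is a toy demonstration of both 
directions; the shell's noclobber is only a loose 
stand-in for the open(O_CREAT|O_EXCL) the current code 
does, and "mv -T" again stands in for rename(2):

  # a new-style transaction directory blocks an old-style locker
  mkdir refs/heads/topic.lock
  if ( set -C && : > refs/heads/topic.lock ); then
	echo "old-style lock acquired (would be a bug)"
  else
	echo "old-style lock blocked, as expected"
  fi
  rm -rf refs/heads/topic.lock

  # an old-style lock file blocks a new-style transaction
  : > refs/heads/topic.lock
  mkdir -p tmp/u/u
  if mv -T tmp/u refs/heads/topic.lock 2>/dev/null; then
	echo "transaction started over the old lock (would be a bug)"
  else
	echo "transaction blocked by the old lock, as expected"
  fi
  rm -rf refs/heads/topic.lock tmp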


> So in response to your question, no, I don't see any
> real showstoppers here. 

Alright, cool.  Thanks for the review and analysis; I 
appreciate it.


> And unlike the "all refs are files in a directory"
> scheme, it's confined to writing, which solves the
> readdir() atomicity questions I had there.


Right, I couldn't see a way around those; I think that 
scheme was inherently flawed.


> I still can't say I'm super excited about it, just
> because it seems like a solution in search of a
> problem that I have not experienced myself.

It might be.  But at least a solution is out there now; 
if time shows that it is needed, I feel better knowing 
one exists.

Murphy has a way of making unlikely problems real 
annoyances in automated distributed environments.  The 
problem is usually not one really annoying problem; 
those get fixed.  What we mostly deal with is 1000 
infrequent problems, all different, which add up to a 
system that needs constant maintenance.  And that 
maintenance is hard, since each problem is different and 
the analysis is difficult and requires expertise.


> Fixing the minor remaining races on ref packing would
> be nice, but I do not think those are a problem of the
> on-disk lock representation. The reason they are not
> fixed now is from an attempt to keep lock contention
> low between unrelated updates (something which we
> should probably give up on in favor of correctness;
> but that is orthogonal to whether we do it with the
> existing file locks or with a new system).

Agreed.


> A much stronger fix for that would be to record deletes
> in the loose ref storage so that we don't have to
> update the packed-refs file as often (because it is
> inherently a lock bottleneck, and because it is
> wasteful when you have a very large number of refs).
> But that means dealing with directory/file conflicts
> between deleted and existing branches, which is a
> pain.

Makes sense.

Thanks again for your thoughts,

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation


* Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)
  2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
@ 2013-01-22  4:31         ` Drew Northup
  0 siblings, 0 replies; 23+ messages in thread
From: Drew Northup @ 2013-01-22  4:31 UTC (permalink / raw)
  To: Jeff King
  Cc: Martin Fick, Michael Haggerty, Shawn Pearce, git, Junio C Hamano

On Sat, Jan 5, 2013 at 11:12 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:
>
>> The general approach is to setup a transaction and either
>> commit or abort it.  A transaction can be setup by renaming
>> an appropriately setup directory to the "ref.lock" name.  If
>> the rename succeeds, the transaction is begun.  Any actor can
>> abort the transaction (up until it is committed) by simply
>> deleting the "ref.lock" directory, so it is not at risk of
>> going stale.
>
> Deleting a directory is not atomic, as you first have to remove the
> contents, putting it into a potentially inconsistent state. I'll assume
> you deal with that later...

I know I'm a bit late to the dance here, but FWIW the 
apparent atomicity of directory deletion depends largely 
upon the filesystem and VFS API in use.  It is not 
unheard of for a delete operation to actually consist of 
moving the reference to the item's own allocation marker 
into a "trashcan", to be cleaned up later.  In other 
words, I'd not advise planning on directory deletes 
always being atomic, nor on them always not being atomic.

-- 
-Drew Northup
--------------------------------------------------------------
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


end of thread, other threads:[~2013-01-22  4:31 UTC | newest]

Thread overview: 23+ messages
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
2013-01-04 18:01           ` Martin Fick
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` Martin Fick
