From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff King
Subject: Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Sat, 29 Dec 2012 03:10:21 -0500
Message-ID: <20121229081021.GC15408@sigill.intra.peff.net>
References: <20121221080449.GA21741@sigill.intra.peff.net> <50DAB447.8000101@alum.mit.edu> <201212271611.52203.mfick@codeaurora.org>
In-Reply-To: <201212271611.52203.mfick@codeaurora.org>
To: Martin Fick
Cc: Michael Haggerty, git@vger.kernel.org, Junio C Hamano

On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:

> For a single user repo this is not a big deal, the lock can
> always be cleaned up manually (and it is a rare occurrence).
> However, in a multi user server environment, possibly even
> from multiple hosts over a shared filesystem such as NFS,
> stale locks could lead to serious downtime and risky recovery
> (since it is currently hard to figure out if a lock really is
> stale). Even though stale locks are probably rare even today
> in the larger shared repo case, as git scales to even larger
> shared repositories, this will eventually become more of a
> problem *1. Naturally, this has me thinking that git should
> possibly consider moving towards a lockless design for refs
> in the long term.

FWIW, I am involved in cleaning up stale locks for a very large git
hosting site. It actually happens surprisingly little. I think it is
mostly because git holds actual locks for a very short period of time
(just enough to check that the value is unchanged from when we started a
lengthy operation, and then atomically write the new value).

So I agree it would be cool (and maybe open up new realms of
scalability) for git to be lockless, but in my experience, this isn't
that pressing a problem (and any solutions are not going to be backwards
compatible, so there is going to be a high deployment cost).

> My idea is based on using filenames to store sha1s instead of
> file contents. To do this, the sha1 one of a ref would be
> stored in a file in a directory named after the loose ref. I
> believe this would then make it possible to have lockless
> atomic ref updates by renaming the file.
>
> To more fully illustrate the idea, imagine that any file
> (except for the null file) in the directory will represent the
> value of the ref with its name, then the following
> transitions can represent atomic state changes to a refs
> value and existence:

Hmm. So basically you are relying on atomic rename() to move the value
around within a directory, rather than using write to move it around
within a file. Atomic rename is usually something we have on local
filesystems (and I think we rely on it elsewhere).
Though I would not be surprised if it is not atomic on all networked
filesystems (though it is on NFS, at least).

> 1) To update the value from a known value to a new value
> atomically, simply rename the file to the new value. This
> operation should only succeed if the file exists and is still
> named old value before the rename. This should even be
> faster than today's approach, especially on remote filesystems
> since it would require only 1 round trip in the success case
> instead of 3!

OK. Makes sense.

> 2) To delete the ref, simply delete the filename representing
> the current value of the ref. This ensures that you are
> deleting the ref from a specific value. I am not sure if git
> needs to be able to delete refs without knowing their values?
> If so, this would require reading the value and looping until
> the delete succeeds, this may be a bit slow for a constantly
> updated ref, but likely a rare situation (and not likely
> worse than trying to acquire the ref-lock today). Overall,
> this again would likely be faster than today's approach.

We do sometimes delete without knowing the value. In most cases we would
not want to do this, but for some "force"-type commands, we do. You
would actually have the same problem with updating above, as we
sometimes update with the intent to overwrite whatever is there.

> 3) To create a ref, it must be renamed from the null file (sha
> 0000...) to the new value just as if it were being updated
> from any other value, but there is one extra condition:
> before renaming the null file, a full directory scan must be
> done to ensure that the null file is the only file in the
> directory (this condition exists because creating the
> directory and null file cannot be atomic unless the filesystem
> supports atomic directory renames, an expectation git does
> not currently make). I am not sure how this compares to
> today's approach, but including the setup costs (described
> below), I suspect it is slower.

Hmm.
mkdir is atomic. So wouldn't it be sufficient to just mkdir and create
the correct sha1 file? A simultaneous creator would fail on the mkdir
and abort. A simultaneous reader might see the directory, but it would
either see it as empty, or with the correct file. In the former case, it
would treat that the same as if the directory did not exist.

Speaking of which, you did not cover reading at all, but it would have
to be:

  dh = opendir(ref);
  if (!dh) {
	if (errno == ENOENT)
		return 0; /* no such ref */
	else
		return error("couldn't read ref");
  }

  while ((ent = readdir(dh))) {
	if (ent->d_name[0] == '.')
		/*
		 * skip "." and "..", and leave room for annotating
		 * refs via dot-files
		 */
		continue;
	/* otherwise, we found it */
	if (get_sha1_hex(ent->d_name, sha1) < 0) {
		closedir(dh);
		return error("weird junk in ref dir?");
	}
	closedir(dh);
	return 1; /* found it */
  }
  closedir(dh);
  return 0; /* did not contain an entry; ref being created? Retry? */

Is readdir actually atomic with respect to directory updates? That is,
if I am calling readdir() and somebody else is renaming, what do I get?
POSIX says:

  If a file is removed from or added to the directory after the most
  recent call to opendir() or rewinddir(), whether a subsequent call to
  readdir() returns an entry for that file is unspecified.

If I get one or the other file (that is, the old name or the new one),
it is OK. It does not matter which, as it is a race whether I see the
old value or the new one during an update. But according to POSIX, it is
possible that I may see neither.

I suppose we could rewinddir() and retry. We might hit the race again
(if somebody else is updating quickly), but realistically, this will
happen very infrequently, and we can just keep trying until we win the
race and get a valid read.

> I don't know how this new scheme could be made to work with
> the current scheme, it seems like perhaps new git releases
> could be made to understand both the old and the new, and a
> config option could be used to tell it which method to write
> new refs with.
> Since in this new scheme ref directory names
> would conflict with old ref filenames, this would likely
> prevent both schemes from erroneously being used
> simultaneously (so they shouldn't corrupt each other), except
> for the fact that refs can be nested in directories which
> confuses things a bit. I am not sure what a good solution to
> this is?

I think you would need to bump core.repositoryformatversion, and just
never let old versions of git access the repository directly. Not the
end of the world, but it certainly increases deployment effort. If we
were going to do that, it would probably make sense to think about
solving the D/F conflict issues at the same time (i.e., start calling
"refs/heads/foo" in the filesystem "refs.d/heads.d/foo.ref" so that it
cannot conflict with "refs.d/heads.d/foo.d/bar.ref").

-Peff
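
For what it's worth, here is a rough C sketch of the three operations
discussed above (create via mkdir, update via rename, read with a
rewinddir() retry), just to make the shape of the scheme concrete. This
is not git code: ref_create(), ref_update(), ref_read(), HEX_LEN and
MAX_TRIES are names made up for illustration, the retry bound is
arbitrary, and it assumes mkdir() and rename() are atomic on the
filesystem in question. It does not attempt deletion or the "force"
cases.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define HEX_LEN 40   /* length of a hex sha1 (illustrative constant) */
#define MAX_TRIES 10 /* arbitrary bound on rewinddir() retries */

/*
 * Create a ref: mkdir() is atomic, so a simultaneous creator fails
 * with EEXIST. If open() fails afterwards, the directory stays empty,
 * which readers treat the same as "no such ref".
 */
static int ref_create(const char *dir, const char *hex)
{
	char path[4096];
	int fd;

	if (mkdir(dir, 0777) < 0)
		return -1;
	snprintf(path, sizeof(path), "%s/%s", dir, hex);
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}

/*
 * Update from a known old value: rename() fails with ENOENT if
 * somebody else already moved the ref away from old_hex.
 */
static int ref_update(const char *dir, const char *old_hex,
		      const char *new_hex)
{
	char from[4096], to[4096];

	snprintf(from, sizeof(from), "%s/%s", dir, old_hex);
	snprintf(to, sizeof(to), "%s/%s", dir, new_hex);
	return rename(from, to);
}

/*
 * Read a ref. Returns 1 and fills hex_out (HEX_LEN + 1 bytes) on
 * success, 0 if there is no ref (or it is still being created), and
 * -1 on error. POSIX allows readdir() to miss an entry renamed during
 * the scan, so rewind and rescan a few times before giving up.
 */
static int ref_read(const char *dir, char *hex_out)
{
	struct dirent *ent;
	DIR *dh;
	int tries;

	assert(hex_out != NULL);
	dh = opendir(dir);
	if (!dh)
		return errno == ENOENT ? 0 : -1;

	for (tries = 0; tries < MAX_TRIES; tries++) {
		while ((ent = readdir(dh)) != NULL) {
			const char *name = ent->d_name;

			if (name[0] == '.')
				continue; /* ".", "..", and dot-files */
			if (strlen(name) != HEX_LEN ||
			    strspn(name, "0123456789abcdef") != HEX_LEN) {
				closedir(dh);
				return -1; /* weird junk in ref dir */
			}
			memcpy(hex_out, name, HEX_LEN + 1);
			closedir(dh);
			return 1;
		}
		rewinddir(dh); /* saw nothing; maybe we raced a rename */
	}
	closedir(dh);
	return 0;
}
```

A caller would treat a 0 from ref_read() as "no such ref", and a failed
ref_update() as a sign that it should re-read the current value and
decide whether to try again.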