git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Johan Herland <johan@herland.net>,
	git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH 10/12] get_packed_ref_cache: reload packed-refs file when it changes
Date: Tue, 11 Jun 2013 23:48:30 +0200	[thread overview]
Message-ID: <1370987312-6761-11-git-send-email-mhagger@alum.mit.edu> (raw)
In-Reply-To: <1370987312-6761-1-git-send-email-mhagger@alum.mit.edu>

From: Jeff King <peff@peff.net>

Once we read the packed-refs file into memory, we cache it
to save work on future ref lookups. However, our cache may
be out of date with respect to what is on disk if another
process is simultaneously packing the refs. Normally it
is acceptable for us to be a little out of date, since there
is no guarantee whether we read the file before or after the
simultaneous update. However, there is an important special
case: our packed-refs file must be up to date with respect
to any loose refs we read. Otherwise, we risk the following
race condition:

  0. There exists a loose ref refs/heads/master.

  1. Process A starts and looks up the ref "master". It
     first checks $GIT_DIR/master, which does not exist. It
     then loads (and caches) the packed-refs file to see if
     "master" exists in it, which it does not.

  2. Meanwhile, process B runs "pack-refs --all --prune". It
     creates a new packed-refs file which contains
     refs/heads/master, and removes the loose copy at
     $GIT_DIR/refs/heads/master.

  3. Process A continues its lookup, and eventually tries
     $GIT_DIR/refs/heads/master.  It sees that the loose ref
     is missing, and falls back to the packed-refs file. But
     it examines its cached version, which does not have
     refs/heads/master. After trying a few other prefixes,
     it reports master as a non-existent ref.

There are many variants (e.g., step 1 may involve process A
looking up another ref entirely, so even a fully qualified
refname can fail). One of the most interesting ones is if
"refs/heads/master" is already packed. In that case process
A will not see it as missing, but rather will report
whatever value happened to be in the packed-refs file before
process B repacked (which might be an arbitrarily old
value).

We can fix this by making sure we reload the packed-refs
file from disk after looking at any loose refs. That's
unacceptably slow, so we can check its stat()-validity as a
proxy, and read it only when it appears to have changed.

Reading the packed-refs file after performing any loose-ref
system calls is sufficient because we know the ordering of
the pack-refs process: it always makes sure the newly
written packed-refs file is installed into place before
pruning any loose refs. As long as those operations by B
appear in their executed order to process A, by the time A
sees the missing loose ref, the new packed-refs file must be
in place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
This is Peff's work, rebased and with some smallish changes to fit it
in with the packed_ref_cache data structure.

 refs.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 92c8e97..64f72ab 100644
--- a/refs.c
+++ b/refs.c
@@ -824,6 +824,9 @@ struct packed_ref_cache {
 
 	/* If locked, the file descriptor of the lock file. */
 	int fd;
+
+	/* The metadata from when this packed-refs cache was read */
+	struct stat_validity validity;
 };
 
 /*
@@ -858,6 +861,7 @@ static int release_packed_ref_cache(struct packed_ref_cache *packed_refs)
 {
 	if (!--packed_refs->referrers) {
 		free_ref_entry(packed_refs->root);
+		stat_validity_clear(&packed_refs->validity);
 		free(packed_refs);
 		return 1;
 	} else {
@@ -1049,20 +1053,27 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
  */
 static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
 {
+	const char *packed_refs_file;
+
+	if (*refs->name)
+		packed_refs_file = git_path_submodule(refs->name, "packed-refs");
+	else
+		packed_refs_file = git_path("packed-refs");
+
+	if (refs->packed &&
+	    !stat_validity_check(&refs->packed->validity, packed_refs_file))
+		clear_packed_ref_cache(refs);
+
 	if (!refs->packed) {
-		const char *packed_refs_file;
 		FILE *f;
 
 		refs->packed = xcalloc(1, sizeof(*refs->packed));
 		acquire_packed_ref_cache(refs->packed);
 		refs->packed->root = create_dir_entry(refs, "", 0, 0);
 		refs->packed->fd = -1;
-		if (*refs->name)
-			packed_refs_file = git_path_submodule(refs->name, "packed-refs");
-		else
-			packed_refs_file = git_path("packed-refs");
 		f = fopen(packed_refs_file, "r");
 		if (f) {
+			stat_validity_update(&refs->packed->validity, fileno(f));
 			read_packed_refs(f, get_ref_dir(refs->packed->root));
 			fclose(f);
 		}
-- 
1.8.3

  parent reply	other threads:[~2013-06-11 21:49 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-11 21:48 [PATCH 00/12] Fix some reference-related races Michael Haggerty
2013-06-11 21:48 ` [PATCH 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
2013-06-12 11:38   ` Jeff King
2013-06-12 11:56     ` Michael Haggerty
2013-06-11 21:48 ` [PATCH 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
2013-06-11 21:48 ` [PATCH 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
2013-06-11 21:48 ` [PATCH 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
2013-06-12 12:01   ` Jeff King
2013-06-11 21:48 ` [PATCH 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
2013-06-11 21:48 ` [PATCH 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
2013-06-11 21:48 ` [PATCH 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
2013-06-11 21:48 ` [PATCH 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
2013-06-11 21:48 ` [PATCH 09/12] add a stat_validity struct Michael Haggerty
2013-06-11 21:48 ` Michael Haggerty [this message]
2013-06-11 21:48 ` [PATCH 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
2013-06-11 21:48 ` [PATCH 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
2013-06-12 12:39   ` Jeff King
2013-06-12 12:52 ` [PATCH 00/12] Fix some reference-related races Jeff King
2013-06-15 20:13 ` Ramsay Jones
2013-06-16  5:50   ` Michael Haggerty
2013-06-18 18:13     ` Ramsay Jones
2013-06-19  5:51       ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1370987312-6761-11-git-send-email-mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).