* [PATCH v2 00/12] Fix some reference-related races
@ 2013-06-19  7:51 Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
                   ` (12 more replies)
  0 siblings, 13 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Re-roll of mh/ref-races.  Thanks to Peff, Junio, and Ramsay for
reviewing v1.

Changes since v1:

* mh/reflife has graduated to master, so this patch series now applies
  directly to master.

* Some trivial constness adjustments were necessary because of

    21a6b9fa read-cache: mark cache_entry pointers const

* Patch 01/12 is now explained better (in code comments and in its log
  message).

* In 04/12, access struct lock_file::fd directly instead of keeping
  a separate copy in struct packed_ref_cache.

Please note that this patch series's use of stat()/fstat() causes
breakage under Cygwin, as pointed out by Ramsay Jones.  He has
graciously offered to work on this issue.

The last patch is still optional--it avoids a little bit of work when
rewriting the packed-refs file, but relies on the stat-based freshness
check not giving a false negative.

Jeff King (2):
  get_packed_ref_cache: reload packed-refs file when it changes
  for_each_ref: load all loose refs before packed refs

Michael Haggerty (10):
  repack_without_ref(): split list curation and entry writing
  pack_refs(): split creation of packed refs and entry writing
  refs: wrap the packed refs cache in a level of indirection
  refs: implement simple transactions for the packed-refs file
  refs: manage lifetime of packed refs cache via reference counting
  do_for_each_entry(): increment the packed refs cache refcount
  packed_ref_cache: increment refcount when locked
  Extract a struct stat_data from cache_entry
  add a stat_validity struct
  refs: do not invalidate the packed-refs cache unnecessarily

 builtin/clone.c    |   7 +-
 builtin/ls-files.c |  12 +-
 cache.h            |  60 ++++++++--
 read-cache.c       | 181 ++++++++++++++++++------------
 refs.c             | 314 ++++++++++++++++++++++++++++++++++++++++++++---------
 refs.h             |  27 ++++-
 6 files changed, 469 insertions(+), 132 deletions(-)

-- 
1.8.3.1

* [PATCH v2 01/12] repack_without_ref(): split list curation and entry writing
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

The repack_without_ref() function first removes the deleted ref from
the internal packed-refs list, then writes the packed-refs list to
disk, omitting any broken or stale entries.  This patch splits that
second step into multiple passes:

* collect the list of refnames that should be deleted from packed_refs

* delete those refnames from the cache

* write the remainder to the packed-refs file

The purpose of this change is to make the "write the remainder" part
reusable.
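
For readers skimming the diff, the resulting repack_without_ref() boils
down to the following condensed sketch (the initial locking and the
removal of refname itself are omitted; the authoritative code is in the
patch below):

    struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
    struct string_list_item *ref_to_delete;

    /* Pass 1: collect the names of packed entries that should go away. */
    do_for_each_entry_in_dir(packed, 0, curate_packed_ref_fn, &refs_to_delete);

    /* Pass 2: drop those entries from the in-memory packed-refs cache. */
    for_each_string_list_item(ref_to_delete, &refs_to_delete) {
        if (remove_entry(packed, ref_to_delete->string) == -1)
            die("internal error");
    }

    /* Pass 3: write whatever remains back out to the packed-refs file. */
    write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
    do_for_each_entry_in_dir(packed, 0, write_packed_entry_fn, &fd);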

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 50 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index 42a7e17..1184b99 100644
--- a/refs.c
+++ b/refs.c
@@ -1998,6 +1998,23 @@ static void write_packed_entry(int fd, char *refname, unsigned char *sha1,
 	}
 }
 
+/*
+ * An each_ref_entry_fn that writes the entry to a packed-refs file.
+ */
+static int write_packed_entry_fn(struct ref_entry *entry, void *cb_data)
+{
+	int *fd = cb_data;
+	enum peel_status peel_status = peel_entry(entry, 0);
+
+	if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
+		error("internal error: %s is not a valid packed reference!",
+		      entry->name);
+	write_packed_entry(*fd, entry->name, entry->u.value.sha1,
+			   peel_status == PEEL_PEELED ?
+			   entry->u.value.peeled : NULL);
+	return 0;
+}
+
 struct ref_to_prune {
 	struct ref_to_prune *next;
 	unsigned char sha1[20];
@@ -2117,14 +2134,25 @@ int pack_refs(unsigned int flags)
 	return 0;
 }
 
-static int repack_ref_fn(struct ref_entry *entry, void *cb_data)
+/*
+ * If entry is no longer needed in packed-refs, add it to the string
+ * list pointed to by cb_data.  Reasons for deleting entries:
+ *
+ * - Entry is broken.
+ * - Entry is overridden by a loose ref.
+ * - Entry does not point at a valid object.
+ *
+ * In the first and third cases, also emit an error message because these
+ * are indications of repository corruption.
+ */
+static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data)
 {
-	int *fd = cb_data;
-	enum peel_status peel_status;
+	struct string_list *refs_to_delete = cb_data;
 
 	if (entry->flag & REF_ISBROKEN) {
 		/* This shouldn't happen to packed refs. */
 		error("%s is broken!", entry->name);
+		string_list_append(refs_to_delete, entry->name);
 		return 0;
 	}
 	if (!has_sha1_file(entry->u.value.sha1)) {
@@ -2134,7 +2162,7 @@ static int repack_ref_fn(struct ref_entry *entry, void *cb_data)
 		if (read_ref_full(entry->name, sha1, 0, &flags))
 			/* We should at least have found the packed ref. */
 			die("Internal error");
-		if ((flags & REF_ISSYMREF) || !(flags & REF_ISPACKED))
+		if ((flags & REF_ISSYMREF) || !(flags & REF_ISPACKED)) {
 			/*
 			 * This packed reference is overridden by a
 			 * loose reference, so it is OK that its value
@@ -2143,9 +2171,11 @@ static int repack_ref_fn(struct ref_entry *entry, void *cb_data)
 			 * collected.  For this purpose we don't even
 			 * care whether the loose reference itself is
 			 * invalid, broken, symbolic, etc.  Silently
-			 * omit the packed reference from the output.
+			 * remove the packed reference.
 			 */
+			string_list_append(refs_to_delete, entry->name);
 			return 0;
+		}
 		/*
 		 * There is no overriding loose reference, so the fact
 		 * that this reference doesn't refer to a valid object
@@ -2154,14 +2184,10 @@ static int repack_ref_fn(struct ref_entry *entry, void *cb_data)
 		 * the output.
 		 */
 		error("%s does not point to a valid object!", entry->name);
+		string_list_append(refs_to_delete, entry->name);
 		return 0;
 	}
 
-	peel_status = peel_entry(entry, 0);
-	write_packed_entry(*fd, entry->name, entry->u.value.sha1,
-			   peel_status == PEEL_PEELED ?
-			   entry->u.value.peeled : NULL);
-
 	return 0;
 }
 
@@ -2169,6 +2195,8 @@ static int repack_without_ref(const char *refname)
 {
 	int fd;
 	struct ref_dir *packed;
+	struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
+	struct string_list_item *ref_to_delete;
 
 	if (!get_packed_ref(refname))
 		return 0; /* refname does not exist in packed refs */
@@ -2180,7 +2208,8 @@ static int repack_without_ref(const char *refname)
 	}
 	clear_packed_ref_cache(&ref_cache);
 	packed = get_packed_refs(&ref_cache);
-	/* Remove refname from the cache. */
+
+	/* Remove refname from the cache: */
 	if (remove_entry(packed, refname) == -1) {
 		/*
 		 * The packed entry disappeared while we were
@@ -2189,8 +2218,17 @@ static int repack_without_ref(const char *refname)
 		rollback_lock_file(&packlock);
 		return 0;
 	}
+
+	/* Remove any other accumulated cruft: */
+	do_for_each_entry_in_dir(packed, 0, curate_packed_ref_fn, &refs_to_delete);
+	for_each_string_list_item(ref_to_delete, &refs_to_delete) {
+		if (remove_entry(packed, ref_to_delete->string) == -1)
+			die("internal error");
+	}
+
+	/* Write what remains: */
 	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
-	do_for_each_entry_in_dir(packed, 0, repack_ref_fn, &fd);
+	do_for_each_entry_in_dir(packed, 0, write_packed_entry_fn, &fd);
 	return commit_lock_file(&packlock);
 }
 
-- 
1.8.3.1

* [PATCH v2 02/12] pack_refs(): split creation of packed refs and entry writing
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Split pack_refs() into multiple passes:

* Iterate over loose refs.  For each one that can be turned into a
  packed ref, create a corresponding entry in the packed refs cache.

* Write the packed refs to the packed-refs file.

This change isolates the mutation of the packed-refs file to a single
place.
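
Condensed, the new pack_refs() therefore looks roughly like this
(locking and pruning details elided; see the patch below for the real
code):

    /* Pass 1: copy eligible loose refs into the packed-ref cache. */
    cbdata.packed_refs = get_packed_refs(&ref_cache);
    do_for_each_entry_in_dir(get_loose_refs(&ref_cache), 0,
                             pack_if_possible_fn, &cbdata);

    /* Pass 2: write the whole packed-ref cache to the locked file. */
    write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
    do_for_each_entry_in_dir(cbdata.packed_refs, 0, write_packed_entry_fn, &fd);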

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/refs.c b/refs.c
index 1184b99..9f1a007 100644
--- a/refs.c
+++ b/refs.c
@@ -2023,35 +2023,50 @@ struct ref_to_prune {
 
 struct pack_refs_cb_data {
 	unsigned int flags;
+	struct ref_dir *packed_refs;
 	struct ref_to_prune *ref_to_prune;
-	int fd;
 };
 
-static int pack_one_ref(struct ref_entry *entry, void *cb_data)
+/*
+ * An each_ref_entry_fn that is run over loose references only.  If
+ * the loose reference can be packed, add an entry in the packed ref
+ * cache.  If the reference should be pruned, also add it to
+ * ref_to_prune in the pack_refs_cb_data.
+ */
+static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
 {
 	struct pack_refs_cb_data *cb = cb_data;
 	enum peel_status peel_status;
+	struct ref_entry *packed_entry;
 	int is_tag_ref = !prefixcmp(entry->name, "refs/tags/");
 
-	/* ALWAYS pack refs that were already packed or are tags */
-	if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref &&
-	    !(entry->flag & REF_ISPACKED))
+	/* ALWAYS pack tags */
+	if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref)
 		return 0;
 
 	/* Do not pack symbolic or broken refs: */
 	if ((entry->flag & REF_ISSYMREF) || !ref_resolves_to_object(entry))
 		return 0;
 
+	/* Add a packed ref cache entry equivalent to the loose entry. */
 	peel_status = peel_entry(entry, 1);
 	if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
 		die("internal error peeling reference %s (%s)",
 		    entry->name, sha1_to_hex(entry->u.value.sha1));
-	write_packed_entry(cb->fd, entry->name, entry->u.value.sha1,
-			   peel_status == PEEL_PEELED ?
-			   entry->u.value.peeled : NULL);
+	packed_entry = find_ref(cb->packed_refs, entry->name);
+	if (packed_entry) {
+		/* Overwrite existing packed entry with info from loose entry */
+		packed_entry->flag = REF_ISPACKED | REF_KNOWS_PEELED;
+		hashcpy(packed_entry->u.value.sha1, entry->u.value.sha1);
+	} else {
+		packed_entry = create_ref_entry(entry->name, entry->u.value.sha1,
+						REF_ISPACKED | REF_KNOWS_PEELED, 0);
+		add_ref(cb->packed_refs, packed_entry);
+	}
+	hashcpy(packed_entry->u.value.peeled, entry->u.value.peeled);
 
-	/* If the ref was already packed, there is no need to prune it. */
-	if ((cb->flags & PACK_REFS_PRUNE) && !(entry->flag & REF_ISPACKED)) {
+	/* Schedule the loose reference for pruning if requested. */
+	if ((cb->flags & PACK_REFS_PRUNE)) {
 		int namelen = strlen(entry->name) + 1;
 		struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen);
 		hashcpy(n->sha1, entry->u.value.sha1);
@@ -2118,16 +2133,21 @@ static struct lock_file packlock;
 int pack_refs(unsigned int flags)
 {
 	struct pack_refs_cb_data cbdata;
+	int fd;
 
 	memset(&cbdata, 0, sizeof(cbdata));
 	cbdata.flags = flags;
 
-	cbdata.fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"),
-					      LOCK_DIE_ON_ERROR);
+	fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"),
+				       LOCK_DIE_ON_ERROR);
+	cbdata.packed_refs = get_packed_refs(&ref_cache);
 
-	write_or_die(cbdata.fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
+	do_for_each_entry_in_dir(get_loose_refs(&ref_cache), 0,
+				 pack_if_possible_fn, &cbdata);
+
+	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
+	do_for_each_entry_in_dir(cbdata.packed_refs, 0, write_packed_entry_fn, &fd);
 
-	do_for_each_entry(&ref_cache, "", pack_one_ref, &cbdata);
 	if (commit_lock_file(&packlock) < 0)
 		die_errno("unable to overwrite old ref-pack file");
 	prune_refs(cbdata.ref_to_prune);
-- 
1.8.3.1

* [PATCH v2 03/12] refs: wrap the packed refs cache in a level of indirection
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

As the old adage goes, we can solve any problem by adding another
level of indirection.  In this case, the problem is to avoid freeing a
packed refs cache while somebody is still using it.  So add a level of
indirection as a prelude to reference-counting the packed refs cache.
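
Concretely, the new wrapper starts out as nothing more than the
following (sketch only; the struct grows more fields in later patches
of this series):

    struct packed_ref_cache {
        struct ref_entry *root;
    };

    /* Callers that only care about the ref_dir go through one more hop: */
    static struct ref_dir *get_packed_refs(struct ref_cache *refs)
    {
        return get_packed_ref_dir(get_packed_ref_cache(refs));
    }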

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/refs.c b/refs.c
index 9f1a007..373d95b 100644
--- a/refs.c
+++ b/refs.c
@@ -806,6 +806,10 @@ static int is_refname_available(const char *refname, const char *oldrefname,
 	return 1;
 }
 
+struct packed_ref_cache {
+	struct ref_entry *root;
+};
+
 /*
  * Future: need to be in "struct repository"
  * when doing a full libification.
@@ -813,7 +817,7 @@ static int is_refname_available(const char *refname, const char *oldrefname,
 static struct ref_cache {
 	struct ref_cache *next;
 	struct ref_entry *loose;
-	struct ref_entry *packed;
+	struct packed_ref_cache *packed;
 	/*
 	 * The submodule name, or "" for the main repo.  We allocate
 	 * length 1 rather than FLEX_ARRAY so that the main ref_cache
@@ -825,7 +829,8 @@ static struct ref_cache {
 static void clear_packed_ref_cache(struct ref_cache *refs)
 {
 	if (refs->packed) {
-		free_ref_entry(refs->packed);
+		free_ref_entry(refs->packed->root);
+		free(refs->packed);
 		refs->packed = NULL;
 	}
 }
@@ -996,24 +1001,39 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
 	}
 }
 
-static struct ref_dir *get_packed_refs(struct ref_cache *refs)
+/*
+ * Get the packed_ref_cache for the specified ref_cache, creating it
+ * if necessary.
+ */
+static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
 {
 	if (!refs->packed) {
 		const char *packed_refs_file;
 		FILE *f;
 
-		refs->packed = create_dir_entry(refs, "", 0, 0);
+		refs->packed = xcalloc(1, sizeof(*refs->packed));
+		refs->packed->root = create_dir_entry(refs, "", 0, 0);
 		if (*refs->name)
 			packed_refs_file = git_path_submodule(refs->name, "packed-refs");
 		else
 			packed_refs_file = git_path("packed-refs");
 		f = fopen(packed_refs_file, "r");
 		if (f) {
-			read_packed_refs(f, get_ref_dir(refs->packed));
+			read_packed_refs(f, get_ref_dir(refs->packed->root));
 			fclose(f);
 		}
 	}
-	return get_ref_dir(refs->packed);
+	return refs->packed;
+}
+
+static struct ref_dir *get_packed_ref_dir(struct packed_ref_cache *packed_ref_cache)
+{
+	return get_ref_dir(packed_ref_cache->root);
+}
+
+static struct ref_dir *get_packed_refs(struct ref_cache *refs)
+{
+	return get_packed_ref_dir(get_packed_ref_cache(refs));
 }
 
 void add_packed_ref(const char *refname, const unsigned char *sha1)
-- 
1.8.3.1

* [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (2 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19 19:18   ` Junio C Hamano
  2013-06-19  7:51 ` [PATCH v2 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Handle simple transactions for the packed-refs file at the
packed_ref_cache level via new functions lock_packed_refs(),
commit_packed_refs(), and rollback_packed_refs().

Only allow the packed ref cache to be modified (via add_packed_ref())
while the packed refs file is locked.

Change clone to add the new references within a transaction.
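
From a caller's point of view the new API is used roughly like this
(a sketch only; the refname and sha1 below are made-up placeholders):

    static struct lock_file packed_refs_lock;

    /* Take the packed-refs lock; modifications are only allowed while
     * the lock is held. */
    lock_packed_refs(&packed_refs_lock, LOCK_DIE_ON_ERROR);

    add_packed_ref("refs/remotes/origin/master", sha1);  /* placeholder ref */

    /* Either write everything out atomically ... */
    if (commit_packed_refs())
        die_errno("unable to overwrite old ref-pack file");
    /* ... or call rollback_packed_refs() to discard the changes. */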

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/clone.c |  7 ++++-
 refs.c          | 83 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 refs.h          | 27 +++++++++++++++++--
 3 files changed, 98 insertions(+), 19 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 66bff57..b0c000a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -489,17 +489,22 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
 	return local_refs;
 }
 
+static struct lock_file packed_refs_lock;
+
 static void write_remote_refs(const struct ref *local_refs)
 {
 	const struct ref *r;
 
+	lock_packed_refs(&packed_refs_lock, LOCK_DIE_ON_ERROR);
+
 	for (r = local_refs; r; r = r->next) {
 		if (!r->peer_ref)
 			continue;
 		add_packed_ref(r->peer_ref->name, r->old_sha1);
 	}
 
-	pack_refs(PACK_REFS_ALL);
+	if (commit_packed_refs())
+		die_errno("unable to overwrite old ref-pack file");
 }
 
 static void write_followtags(const struct ref *refs, const char *msg)
diff --git a/refs.c b/refs.c
index 373d95b..ad73251 100644
--- a/refs.c
+++ b/refs.c
@@ -808,6 +808,13 @@ static int is_refname_available(const char *refname, const char *oldrefname,
 
 struct packed_ref_cache {
 	struct ref_entry *root;
+
+	/*
+	 * Iff the packed-refs file associated with this instance is
+	 * currently locked for writing, this points at the associated
+	 * lock (which is owned by somebody else).
+	 */
+	struct lock_file *lock;
 };
 
 /*
@@ -829,6 +836,8 @@ static struct ref_cache {
 static void clear_packed_ref_cache(struct ref_cache *refs)
 {
 	if (refs->packed) {
+		if (refs->packed->lock)
+			die("internal error: packed-ref cache cleared while locked");
 		free_ref_entry(refs->packed->root);
 		free(refs->packed);
 		refs->packed = NULL;
@@ -1038,7 +1047,12 @@ static struct ref_dir *get_packed_refs(struct ref_cache *refs)
 
 void add_packed_ref(const char *refname, const unsigned char *sha1)
 {
-	add_ref(get_packed_refs(&ref_cache),
+	struct packed_ref_cache *packed_ref_cache =
+		get_packed_ref_cache(&ref_cache);
+
+	if (!packed_ref_cache->lock)
+		die("internal error: packed refs not locked");
+	add_ref(get_packed_ref_dir(packed_ref_cache),
 		create_ref_entry(refname, sha1, REF_ISPACKED, 1));
 }
 
@@ -2035,6 +2049,52 @@ static int write_packed_entry_fn(struct ref_entry *entry, void *cb_data)
 	return 0;
 }
 
+int lock_packed_refs(struct lock_file *lock, int flags)
+{
+	struct packed_ref_cache *packed_ref_cache;
+
+	/* Discard the old cache because it might be invalid: */
+	clear_packed_ref_cache(&ref_cache);
+	if (hold_lock_file_for_update(lock, git_path("packed-refs"), flags) < 0)
+		return -1;
+	/* Read the current packed-refs while holding the lock: */
+	packed_ref_cache = get_packed_ref_cache(&ref_cache);
+	packed_ref_cache->lock = lock;
+	return 0;
+}
+
+int commit_packed_refs(void)
+{
+	struct packed_ref_cache *packed_ref_cache =
+		get_packed_ref_cache(&ref_cache);
+	int error = 0;
+
+	if (!packed_ref_cache->lock)
+		die("internal error: packed-refs not locked");
+	write_or_die(packed_ref_cache->lock->fd,
+		     PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
+
+	do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
+				 0, write_packed_entry_fn,
+				 &packed_ref_cache->lock->fd);
+	if (commit_lock_file(packed_ref_cache->lock))
+		error = -1;
+	packed_ref_cache->lock = NULL;
+	return error;
+}
+
+void rollback_packed_refs(void)
+{
+	struct packed_ref_cache *packed_ref_cache =
+		get_packed_ref_cache(&ref_cache);
+
+	if (!packed_ref_cache->lock)
+		die("internal error: packed-refs not locked");
+	rollback_lock_file(packed_ref_cache->lock);
+	packed_ref_cache->lock = NULL;
+	clear_packed_ref_cache(&ref_cache);
+}
+
 struct ref_to_prune {
 	struct ref_to_prune *next;
 	unsigned char sha1[20];
@@ -2153,23 +2213,19 @@ static struct lock_file packlock;
 int pack_refs(unsigned int flags)
 {
 	struct pack_refs_cb_data cbdata;
-	int fd;
 
 	memset(&cbdata, 0, sizeof(cbdata));
 	cbdata.flags = flags;
 
-	fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"),
-				       LOCK_DIE_ON_ERROR);
+	lock_packed_refs(&packlock, LOCK_DIE_ON_ERROR);
 	cbdata.packed_refs = get_packed_refs(&ref_cache);
 
 	do_for_each_entry_in_dir(get_loose_refs(&ref_cache), 0,
 				 pack_if_possible_fn, &cbdata);
 
-	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
-	do_for_each_entry_in_dir(cbdata.packed_refs, 0, write_packed_entry_fn, &fd);
-
-	if (commit_lock_file(&packlock) < 0)
+	if (commit_packed_refs())
 		die_errno("unable to overwrite old ref-pack file");
+
 	prune_refs(cbdata.ref_to_prune);
 	return 0;
 }
@@ -2233,7 +2289,6 @@ static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data)
 
 static int repack_without_ref(const char *refname)
 {
-	int fd;
 	struct ref_dir *packed;
 	struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
 	struct string_list_item *ref_to_delete;
@@ -2241,12 +2296,10 @@ static int repack_without_ref(const char *refname)
 	if (!get_packed_ref(refname))
 		return 0; /* refname does not exist in packed refs */
 
-	fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"), 0);
-	if (fd < 0) {
+	if (lock_packed_refs(&packlock, 0)) {
 		unable_to_lock_error(git_path("packed-refs"), errno);
 		return error("cannot delete '%s' from packed refs", refname);
 	}
-	clear_packed_ref_cache(&ref_cache);
 	packed = get_packed_refs(&ref_cache);
 
 	/* Remove refname from the cache: */
@@ -2255,7 +2308,7 @@ static int repack_without_ref(const char *refname)
 		 * The packed entry disappeared while we were
 		 * acquiring the lock.
 		 */
-		rollback_lock_file(&packlock);
+		rollback_packed_refs();
 		return 0;
 	}
 
@@ -2267,9 +2320,7 @@ static int repack_without_ref(const char *refname)
 	}
 
 	/* Write what remains: */
-	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
-	do_for_each_entry_in_dir(packed, 0, write_packed_entry_fn, &fd);
-	return commit_lock_file(&packlock);
+	return commit_packed_refs();
 }
 
 int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
diff --git a/refs.h b/refs.h
index 246bf60..b275124 100644
--- a/refs.h
+++ b/refs.h
@@ -77,12 +77,35 @@ extern int for_each_rawref(each_ref_fn, void *);
 extern void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname);
 
 /*
- * Add a reference to the in-memory packed reference cache.  To actually
- * write the reference to the packed-refs file, call pack_refs().
+ * Lock the packed-refs file for writing using the specified lock_file
+ * instance.  Flags is passed to hold_lock_file_for_update().  Return
+ * 0 on success.
+ */
+extern int lock_packed_refs(struct lock_file *lock, int flags);
+
+/*
+ * Add a reference to the in-memory packed reference cache.  This may
+ * only be called while the packed-refs file is locked (see
+ * lock_packed_refs()).  To actually write the packed-refs file, call
+ * commit_packed_refs().
  */
 extern void add_packed_ref(const char *refname, const unsigned char *sha1);
 
 /*
+ * Write the current version of the packed refs cache from memory to
+ * disk.  The packed-refs file must already be locked for writing (see
+ * lock_packed_refs()).  Return zero on success.
+ */
+extern int commit_packed_refs(void);
+
+/*
+ * Rollback the lockfile for the packed-refs file, and discard the
+ * in-memory packed reference cache.  (The packed-refs file will be
+ * read anew if it is needed again after this function is called.)
+ */
+extern void rollback_packed_refs(void);
+
+/*
  * Flags for controlling behaviour of pack_refs()
  * PACK_REFS_PRUNE: Prune loose refs after packing
  * PACK_REFS_ALL:   Pack _all_ refs, not just tags and already packed refs
-- 
1.8.3.1

* [PATCH v2 05/12] refs: manage lifetime of packed refs cache via reference counting
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (3 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

In struct packed_ref_cache, keep a count of the number of users of the
data structure.  Only free the packed ref cache when the reference
count goes to zero rather than when the packed ref cache is cleared.
This mechanism will be used to prevent the cache data structure from
being freed while it is being iterated over.

So far, only the reference in struct ref_cache::packed is counted;
other users will be adjusted in separate commits.
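
The intended usage pattern is the usual acquire/release pairing
(sketch of the internal pattern within refs.c):

    struct packed_ref_cache *packed = get_packed_ref_cache(&ref_cache);

    acquire_packed_ref_cache(packed);   /* referrers++ */
    /*
     * ... use packed->root; even if clear_packed_ref_cache() runs in
     * the meantime, our copy stays alive ...
     */
    release_packed_ref_cache(packed);   /* freed only when the count drops to 0 */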

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index ad73251..e2e441d 100644
--- a/refs.c
+++ b/refs.c
@@ -810,6 +810,14 @@ struct packed_ref_cache {
 	struct ref_entry *root;
 
 	/*
+	 * Count of references to the data structure in this instance,
+	 * including the pointer from ref_cache::packed if any.  The
+	 * data will not be freed as long as the reference count is
+	 * nonzero.
+	 */
+	unsigned int referrers;
+
+	/*
 	 * Iff the packed-refs file associated with this instance is
 	 * currently locked for writing, this points at the associated
 	 * lock (which is owned by somebody else).
@@ -833,14 +841,38 @@ static struct ref_cache {
 	char name[1];
 } ref_cache, *submodule_ref_caches;
 
+/*
+ * Increment the reference count of *packed_refs.
+ */
+static void acquire_packed_ref_cache(struct packed_ref_cache *packed_refs)
+{
+	packed_refs->referrers++;
+}
+
+/*
+ * Decrease the reference count of *packed_refs.  If it goes to zero,
+ * free *packed_refs and return true; otherwise return false.
+ */
+static int release_packed_ref_cache(struct packed_ref_cache *packed_refs)
+{
+	if (!--packed_refs->referrers) {
+		free_ref_entry(packed_refs->root);
+		free(packed_refs);
+		return 1;
+	} else {
+		return 0;
+	}
+}
+
 static void clear_packed_ref_cache(struct ref_cache *refs)
 {
 	if (refs->packed) {
-		if (refs->packed->lock)
+		struct packed_ref_cache *packed_refs = refs->packed;
+
+		if (packed_refs->lock)
 			die("internal error: packed-ref cache cleared while locked");
-		free_ref_entry(refs->packed->root);
-		free(refs->packed);
 		refs->packed = NULL;
+		release_packed_ref_cache(packed_refs);
 	}
 }
 
@@ -1021,6 +1053,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
 		FILE *f;
 
 		refs->packed = xcalloc(1, sizeof(*refs->packed));
+		acquire_packed_ref_cache(refs->packed);
 		refs->packed->root = create_dir_entry(refs, "", 0, 0);
 		if (*refs->name)
 			packed_refs_file = git_path_submodule(refs->name, "packed-refs");
-- 
1.8.3.1

* [PATCH v2 06/12] do_for_each_entry(): increment the packed refs cache refcount
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (4 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

This function calls a user-supplied callback function which could do
something that causes the packed refs cache to be invalidated.  So
acquire a reference count on the data structure to prevent our copy
from being freed while we are iterating over it.
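
To see why this matters, consider a hypothetical callback passed to
for_each_ref() that deletes every ref it is shown (illustrative only):

    /* Hypothetical each_ref_fn callback.  Deleting a packed ref may
     * rewrite the packed-refs file and invalidate the packed-ref cache
     * that do_for_each_entry() is in the middle of iterating over. */
    static int delete_each_ref(const char *refname, const unsigned char *sha1,
                               int flags, void *cb_data)
    {
        return delete_ref(refname, sha1, 0);
    }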

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index e2e441d..d90f487 100644
--- a/refs.c
+++ b/refs.c
@@ -1587,10 +1587,12 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
 static int do_for_each_entry(struct ref_cache *refs, const char *base,
 			     each_ref_entry_fn fn, void *cb_data)
 {
-	struct ref_dir *packed_dir = get_packed_refs(refs);
+	struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs);
+	struct ref_dir *packed_dir = get_packed_ref_dir(packed_ref_cache);
 	struct ref_dir *loose_dir = get_loose_refs(refs);
 	int retval = 0;
 
+	acquire_packed_ref_cache(packed_ref_cache);
 	if (base && *base) {
 		packed_dir = find_containing_dir(packed_dir, base, 0);
 		loose_dir = find_containing_dir(loose_dir, base, 0);
@@ -1611,6 +1613,7 @@ static int do_for_each_entry(struct ref_cache *refs, const char *base,
 				loose_dir, 0, fn, cb_data);
 	}
 
+	release_packed_ref_cache(packed_ref_cache);
 	return retval;
 }
 
-- 
1.8.3.1

* [PATCH v2 07/12] packed_ref_cache: increment refcount when locked
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (5 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Increment the packed_ref_cache reference count while it is locked to
prevent its being freed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index d90f487..6d48b42 100644
--- a/refs.c
+++ b/refs.c
@@ -820,7 +820,9 @@ struct packed_ref_cache {
 	/*
 	 * Iff the packed-refs file associated with this instance is
 	 * currently locked for writing, this points at the associated
-	 * lock (which is owned by somebody else).
+	 * lock (which is owned by somebody else).  The referrer count
+	 * is also incremented when the file is locked and decremented
+	 * when it is unlocked.
 	 */
 	struct lock_file *lock;
 };
@@ -2096,6 +2098,8 @@ int lock_packed_refs(struct lock_file *lock, int flags)
 	/* Read the current packed-refs while holding the lock: */
 	packed_ref_cache = get_packed_ref_cache(&ref_cache);
 	packed_ref_cache->lock = lock;
+	/* Increment the reference count to prevent it from being freed: */
+	acquire_packed_ref_cache(packed_ref_cache);
 	return 0;
 }
 
@@ -2116,6 +2120,7 @@ int commit_packed_refs(void)
 	if (commit_lock_file(packed_ref_cache->lock))
 		error = -1;
 	packed_ref_cache->lock = NULL;
+	release_packed_ref_cache(packed_ref_cache);
 	return error;
 }
 
@@ -2128,6 +2133,7 @@ void rollback_packed_refs(void)
 		die("internal error: packed-refs not locked");
 	rollback_lock_file(packed_ref_cache->lock);
 	packed_ref_cache->lock = NULL;
+	release_packed_ref_cache(packed_ref_cache);
 	clear_packed_ref_cache(&ref_cache);
 }
 
-- 
1.8.3.1

* [PATCH v2 08/12] Extract a struct stat_data from cache_entry
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (6 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 09/12] add a stat_validity struct Michael Haggerty
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Add public functions fill_stat_data() and match_stat_data() to work
with the new struct.  This infrastructure will later be used to check
the validity of other types of files.
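
A minimal usage sketch of the new pair (the "path" variable here is
just illustrative):

    struct stat st;
    struct stat_data sd;

    if (!lstat(path, &st))
        fill_stat_data(&sd, &st);   /* remember how the file looked */

    /* ... later ... */
    if (!lstat(path, &st) && !match_stat_data(&sd, &st))
        /* nothing we track (mtime, ctime, owner, inode, size) changed */ ;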

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/ls-files.c |  12 +++--
 cache.h            |  33 +++++++++---
 read-cache.c       | 151 +++++++++++++++++++++++++++++------------------------
 3 files changed, 116 insertions(+), 80 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 2202072..6a0730f 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -165,11 +165,13 @@ static void show_ce_entry(const char *tag, struct cache_entry *ce)
 	}
 	write_name(ce->name, ce_namelen(ce));
 	if (debug_mode) {
-		printf("  ctime: %d:%d\n", ce->ce_ctime.sec, ce->ce_ctime.nsec);
-		printf("  mtime: %d:%d\n", ce->ce_mtime.sec, ce->ce_mtime.nsec);
-		printf("  dev: %d\tino: %d\n", ce->ce_dev, ce->ce_ino);
-		printf("  uid: %d\tgid: %d\n", ce->ce_uid, ce->ce_gid);
-		printf("  size: %d\tflags: %x\n", ce->ce_size, ce->ce_flags);
+		struct stat_data *sd = &ce->ce_stat_data;
+
+		printf("  ctime: %d:%d\n", sd->sd_ctime.sec, sd->sd_ctime.nsec);
+		printf("  mtime: %d:%d\n", sd->sd_mtime.sec, sd->sd_mtime.nsec);
+		printf("  dev: %d\tino: %d\n", sd->sd_dev, sd->sd_ino);
+		printf("  uid: %d\tgid: %d\n", sd->sd_uid, sd->sd_gid);
+		printf("  size: %d\tflags: %x\n", sd->sd_size, ce->ce_flags);
 	}
 }
 
diff --git a/cache.h b/cache.h
index 820aa05..f158fed 100644
--- a/cache.h
+++ b/cache.h
@@ -119,15 +119,19 @@ struct cache_time {
 	unsigned int nsec;
 };
 
+struct stat_data {
+	struct cache_time sd_ctime;
+	struct cache_time sd_mtime;
+	unsigned int sd_dev;
+	unsigned int sd_ino;
+	unsigned int sd_uid;
+	unsigned int sd_gid;
+	unsigned int sd_size;
+};
+
 struct cache_entry {
-	struct cache_time ce_ctime;
-	struct cache_time ce_mtime;
-	unsigned int ce_dev;
-	unsigned int ce_ino;
+	struct stat_data ce_stat_data;
 	unsigned int ce_mode;
-	unsigned int ce_uid;
-	unsigned int ce_gid;
-	unsigned int ce_size;
 	unsigned int ce_flags;
 	unsigned int ce_namelen;
 	unsigned char sha1[20];
@@ -511,6 +515,21 @@ extern int limit_pathspec_to_literal(void);
 #define HASH_FORMAT_CHECK 2
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
 extern int index_path(unsigned char *sha1, const char *path, struct stat *st, unsigned flags);
+
+/*
+ * Record to sd the data from st that we use to check whether a file
+ * might have changed.
+ */
+extern void fill_stat_data(struct stat_data *sd, struct stat *st);
+
+/*
+ * Return 0 if st is consistent with a file not having been changed
+ * since sd was filled.  If there are differences, return a
+ * combination of MTIME_CHANGED, CTIME_CHANGED, OWNER_CHANGED,
+ * INODE_CHANGED, and DATA_CHANGED.
+ */
+extern int match_stat_data(const struct stat_data *sd, struct stat *st);
+
 extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 
 #define REFRESH_REALLY		0x0001	/* ignore_valid */
diff --git a/read-cache.c b/read-cache.c
index 5e30746..5660b37 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -67,6 +67,61 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
 
+void fill_stat_data(struct stat_data *sd, struct stat *st)
+{
+	sd->sd_ctime.sec = (unsigned int)st->st_ctime;
+	sd->sd_mtime.sec = (unsigned int)st->st_mtime;
+	sd->sd_ctime.nsec = ST_CTIME_NSEC(*st);
+	sd->sd_mtime.nsec = ST_MTIME_NSEC(*st);
+	sd->sd_dev = st->st_dev;
+	sd->sd_ino = st->st_ino;
+	sd->sd_uid = st->st_uid;
+	sd->sd_gid = st->st_gid;
+	sd->sd_size = st->st_size;
+}
+
+int match_stat_data(const struct stat_data *sd, struct stat *st)
+{
+	int changed = 0;
+
+	if (sd->sd_mtime.sec != (unsigned int)st->st_mtime)
+		changed |= MTIME_CHANGED;
+	if (trust_ctime && check_stat &&
+	    sd->sd_ctime.sec != (unsigned int)st->st_ctime)
+		changed |= CTIME_CHANGED;
+
+#ifdef USE_NSEC
+	if (check_stat && sd->sd_mtime.nsec != ST_MTIME_NSEC(*st))
+		changed |= MTIME_CHANGED;
+	if (trust_ctime && check_stat &&
+	    sd->sd_ctime.nsec != ST_CTIME_NSEC(*st))
+		changed |= CTIME_CHANGED;
+#endif
+
+	if (check_stat) {
+		if (sd->sd_uid != (unsigned int) st->st_uid ||
+			sd->sd_gid != (unsigned int) st->st_gid)
+			changed |= OWNER_CHANGED;
+		if (sd->sd_ino != (unsigned int) st->st_ino)
+			changed |= INODE_CHANGED;
+	}
+
+#ifdef USE_STDEV
+	/*
+	 * st_dev breaks on network filesystems where different
+	 * clients will have different views of what "device"
+	 * the filesystem is on
+	 */
+	if (check_stat && sd->sd_dev != (unsigned int) st->st_dev)
+			changed |= INODE_CHANGED;
+#endif
+
+	if (sd->sd_size != (unsigned int) st->st_size)
+		changed |= DATA_CHANGED;
+
+	return changed;
+}
+
 /*
  * This only updates the "non-critical" parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
@@ -74,15 +129,7 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
  */
 void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 {
-	ce->ce_ctime.sec = (unsigned int)st->st_ctime;
-	ce->ce_mtime.sec = (unsigned int)st->st_mtime;
-	ce->ce_ctime.nsec = ST_CTIME_NSEC(*st);
-	ce->ce_mtime.nsec = ST_MTIME_NSEC(*st);
-	ce->ce_dev = st->st_dev;
-	ce->ce_ino = st->st_ino;
-	ce->ce_uid = st->st_uid;
-	ce->ce_gid = st->st_gid;
-	ce->ce_size = st->st_size;
+	fill_stat_data(&ce->ce_stat_data, st);
 
 	if (assume_unchanged)
 		ce->ce_flags |= CE_VALID;
@@ -195,43 +242,11 @@ static int ce_match_stat_basic(const struct cache_entry *ce, struct stat *st)
 	default:
 		die("internal error: ce_mode is %o", ce->ce_mode);
 	}
-	if (ce->ce_mtime.sec != (unsigned int)st->st_mtime)
-		changed |= MTIME_CHANGED;
-	if (trust_ctime && check_stat &&
-	    ce->ce_ctime.sec != (unsigned int)st->st_ctime)
-		changed |= CTIME_CHANGED;
 
-#ifdef USE_NSEC
-	if (check_stat && ce->ce_mtime.nsec != ST_MTIME_NSEC(*st))
-		changed |= MTIME_CHANGED;
-	if (trust_ctime && check_stat &&
-	    ce->ce_ctime.nsec != ST_CTIME_NSEC(*st))
-		changed |= CTIME_CHANGED;
-#endif
-
-	if (check_stat) {
-		if (ce->ce_uid != (unsigned int) st->st_uid ||
-			ce->ce_gid != (unsigned int) st->st_gid)
-			changed |= OWNER_CHANGED;
-		if (ce->ce_ino != (unsigned int) st->st_ino)
-			changed |= INODE_CHANGED;
-	}
-
-#ifdef USE_STDEV
-	/*
-	 * st_dev breaks on network filesystems where different
-	 * clients will have different views of what "device"
-	 * the filesystem is on
-	 */
-	if (check_stat && ce->ce_dev != (unsigned int) st->st_dev)
-			changed |= INODE_CHANGED;
-#endif
-
-	if (ce->ce_size != (unsigned int) st->st_size)
-		changed |= DATA_CHANGED;
+	changed |= match_stat_data(&ce->ce_stat_data, st);
 
 	/* Racily smudged entry? */
-	if (!ce->ce_size) {
+	if (!ce->ce_stat_data.sd_size) {
 		if (!is_empty_blob_sha1(ce->sha1))
 			changed |= DATA_CHANGED;
 	}
@@ -246,11 +261,11 @@ static int is_racy_timestamp(const struct index_state *istate,
 		istate->timestamp.sec &&
 #ifdef USE_NSEC
 		 /* nanosecond timestamped files can also be racy! */
-		(istate->timestamp.sec < ce->ce_mtime.sec ||
-		 (istate->timestamp.sec == ce->ce_mtime.sec &&
-		  istate->timestamp.nsec <= ce->ce_mtime.nsec))
+		(istate->timestamp.sec < ce->ce_stat_data.sd_mtime.sec ||
+		 (istate->timestamp.sec == ce->ce_stat_data.sd_mtime.sec &&
+		  istate->timestamp.nsec <= ce->ce_stat_data.sd_mtime.nsec))
 #else
-		istate->timestamp.sec <= ce->ce_mtime.sec
+		istate->timestamp.sec <= ce->ce_stat_data.sd_mtime.sec
 #endif
 		 );
 }
@@ -342,7 +357,7 @@ int ie_modified(const struct index_state *istate,
 	 * then we know it is.
 	 */
 	if ((changed & DATA_CHANGED) &&
-	    (S_ISGITLINK(ce->ce_mode) || ce->ce_size != 0))
+	    (S_ISGITLINK(ce->ce_mode) || ce->ce_stat_data.sd_size != 0))
 		return changed;
 
 	changed_fs = ce_modified_check_fs(ce, st);
@@ -1324,16 +1339,16 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
 {
 	struct cache_entry *ce = xmalloc(cache_entry_size(len));
 
-	ce->ce_ctime.sec = ntoh_l(ondisk->ctime.sec);
-	ce->ce_mtime.sec = ntoh_l(ondisk->mtime.sec);
-	ce->ce_ctime.nsec = ntoh_l(ondisk->ctime.nsec);
-	ce->ce_mtime.nsec = ntoh_l(ondisk->mtime.nsec);
-	ce->ce_dev   = ntoh_l(ondisk->dev);
-	ce->ce_ino   = ntoh_l(ondisk->ino);
+	ce->ce_stat_data.sd_ctime.sec = ntoh_l(ondisk->ctime.sec);
+	ce->ce_stat_data.sd_mtime.sec = ntoh_l(ondisk->mtime.sec);
+	ce->ce_stat_data.sd_ctime.nsec = ntoh_l(ondisk->ctime.nsec);
+	ce->ce_stat_data.sd_mtime.nsec = ntoh_l(ondisk->mtime.nsec);
+	ce->ce_stat_data.sd_dev   = ntoh_l(ondisk->dev);
+	ce->ce_stat_data.sd_ino   = ntoh_l(ondisk->ino);
 	ce->ce_mode  = ntoh_l(ondisk->mode);
-	ce->ce_uid   = ntoh_l(ondisk->uid);
-	ce->ce_gid   = ntoh_l(ondisk->gid);
-	ce->ce_size  = ntoh_l(ondisk->size);
+	ce->ce_stat_data.sd_uid   = ntoh_l(ondisk->uid);
+	ce->ce_stat_data.sd_gid   = ntoh_l(ondisk->gid);
+	ce->ce_stat_data.sd_size  = ntoh_l(ondisk->size);
 	ce->ce_flags = flags & ~CE_NAMEMASK;
 	ce->ce_namelen = len;
 	hashcpy(ce->sha1, ondisk->sha1);
@@ -1610,7 +1625,7 @@ static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
 	 * The only thing we care about in this function is to smudge the
 	 * falsely clean entry due to touch-update-touch race, so we leave
 	 * everything else as they are.  We are called for entries whose
-	 * ce_mtime match the index file mtime.
+	 * ce_stat_data.sd_mtime match the index file mtime.
 	 *
 	 * Note that this actually does not do much for gitlinks, for
 	 * which ce_match_stat_basic() always goes to the actual
@@ -1649,7 +1664,7 @@ static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
 		 * file, and never calls us, so the cached size information
 		 * for "frotz" stays 6 which does not match the filesystem.
 		 */
-		ce->ce_size = 0;
+		ce->ce_stat_data.sd_size = 0;
 	}
 }
 
@@ -1659,16 +1674,16 @@ static char *copy_cache_entry_to_ondisk(struct ondisk_cache_entry *ondisk,
 {
 	short flags;
 
-	ondisk->ctime.sec = htonl(ce->ce_ctime.sec);
-	ondisk->mtime.sec = htonl(ce->ce_mtime.sec);
-	ondisk->ctime.nsec = htonl(ce->ce_ctime.nsec);
-	ondisk->mtime.nsec = htonl(ce->ce_mtime.nsec);
-	ondisk->dev  = htonl(ce->ce_dev);
-	ondisk->ino  = htonl(ce->ce_ino);
+	ondisk->ctime.sec = htonl(ce->ce_stat_data.sd_ctime.sec);
+	ondisk->mtime.sec = htonl(ce->ce_stat_data.sd_mtime.sec);
+	ondisk->ctime.nsec = htonl(ce->ce_stat_data.sd_ctime.nsec);
+	ondisk->mtime.nsec = htonl(ce->ce_stat_data.sd_mtime.nsec);
+	ondisk->dev  = htonl(ce->ce_stat_data.sd_dev);
+	ondisk->ino  = htonl(ce->ce_stat_data.sd_ino);
 	ondisk->mode = htonl(ce->ce_mode);
-	ondisk->uid  = htonl(ce->ce_uid);
-	ondisk->gid  = htonl(ce->ce_gid);
-	ondisk->size = htonl(ce->ce_size);
+	ondisk->uid  = htonl(ce->ce_stat_data.sd_uid);
+	ondisk->gid  = htonl(ce->ce_stat_data.sd_gid);
+	ondisk->size = htonl(ce->ce_stat_data.sd_size);
 	hashcpy(ondisk->sha1, ce->sha1);
 
 	flags = ce->ce_flags;
-- 
1.8.3.1

* [PATCH v2 09/12] add a stat_validity struct
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (7 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 10/12] get_packed_ref_cache: reload packed-refs file when it changes Michael Haggerty
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

It can sometimes be useful to know whether a path in the
filesystem has been updated without going to the work of
opening and re-reading its content. We trust the stat()
information on disk already to handle index updates, and we
can use the same trick here.

This patch introduces a "stat_validity" struct which
encapsulates the concept of checking the stat-freshness of a
file. It is implemented on top of "struct stat_data" to
reuse the logic about which stat entries to trust for a
particular platform, but hides the complexity behind two
simple functions: check and update.
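
In use it looks something like this (a sketch; "path" and the error
handling are illustrative only):

    struct stat_validity validity = { NULL };
    int fd = open(path, O_RDONLY);

    if (fd >= 0) {
        stat_validity_update(&validity, fd);  /* record the file's stat data */
        /* ... read and use the file ... */
        close(fd);
    }

    /* Later: is our in-memory copy still good? */
    if (!stat_validity_check(&validity, path))
        /* the file changed (or appeared/disappeared); re-read it */ ;

    stat_validity_clear(&validity);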

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 cache.h      | 27 +++++++++++++++++++++++++++
 read-cache.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/cache.h b/cache.h
index f158fed..50f33db 100644
--- a/cache.h
+++ b/cache.h
@@ -1360,4 +1360,31 @@ int checkout_fast_forward(const unsigned char *from,
 
 int sane_execvp(const char *file, char *const argv[]);
 
+/*
+ * A struct to encapsulate the concept of whether a file has changed
+ * since we last checked it. This uses criteria similar to those used
+ * for the index.
+ */
+struct stat_validity {
+	struct stat_data *sd;
+};
+
+void stat_validity_clear(struct stat_validity *sv);
+
+/*
+ * Returns 1 if the path is a regular file (or a symlink to a regular
+ * file) and matches the saved stat_validity, 0 otherwise.  A missing
+ * or inaccessible file is considered a match if the struct was just
+ * initialized, or if the previous update found an inaccessible file.
+ */
+int stat_validity_check(struct stat_validity *sv, const char *path);
+
+/*
+ * Update the stat_validity from a file opened at descriptor fd. If
+ * the file is missing, inaccessible, or not a regular file, then
+ * future calls to stat_validity_check will match iff one of those
+ * conditions continues to be true.
+ */
+void stat_validity_update(struct stat_validity *sv, int fd);
+
 #endif /* CACHE_H */
diff --git a/read-cache.c b/read-cache.c
index 5660b37..b15bc09 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1950,3 +1950,33 @@ void *read_blob_data_from_index(struct index_state *istate, const char *path, un
 		*size = sz;
 	return data;
 }
+
+void stat_validity_clear(struct stat_validity *sv)
+{
+	free(sv->sd);
+	sv->sd = NULL;
+}
+
+int stat_validity_check(struct stat_validity *sv, const char *path)
+{
+	struct stat st;
+
+	if (stat(path, &st) < 0)
+		return sv->sd == NULL;
+	if (!sv->sd)
+		return 0;
+	return S_ISREG(st.st_mode) && !match_stat_data(sv->sd, &st);
+}
+
+void stat_validity_update(struct stat_validity *sv, int fd)
+{
+	struct stat st;
+
+	if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode))
+		stat_validity_clear(sv);
+	else {
+		if (!sv->sd)
+			sv->sd = xcalloc(1, sizeof(struct stat_data));
+		fill_stat_data(sv->sd, &st);
+	}
+}
-- 
1.8.3.1

* [PATCH v2 10/12] get_packed_ref_cache: reload packed-refs file when it changes
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (8 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 09/12] add a stat_validity struct Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

From: Jeff King <peff@peff.net>

Once we read the packed-refs file into memory, we cache it
to save work on future ref lookups. However, our cache may
be out of date with respect to what is on disk if another
process is simultaneously packing the refs. Normally it
is acceptable for us to be a little out of date, since there
is no guarantee whether we read the file before or after the
simultaneous update. However, there is an important special
case: our packed-refs file must be up to date with respect
to any loose refs we read. Otherwise, we risk the following
race condition:

  0. There exists a loose ref refs/heads/master.

  1. Process A starts and looks up the ref "master". It
     first checks $GIT_DIR/master, which does not exist. It
     then loads (and caches) the packed-refs file to see if
     "master" exists in it, which it does not.

  2. Meanwhile, process B runs "pack-refs --all --prune". It
     creates a new packed-refs file which contains
     refs/heads/master, and removes the loose copy at
     $GIT_DIR/refs/heads/master.

  3. Process A continues its lookup, and eventually tries
     $GIT_DIR/refs/heads/master.  It sees that the loose ref
     is missing, and falls back to the packed-refs file. But
     it examines its cached version, which does not have
     refs/heads/master. After trying a few other prefixes,
     it reports master as a non-existent ref.

There are many variants (e.g., step 1 may involve process A
looking up another ref entirely, so even a fully qualified
refname can fail). One of the most interesting ones is if
"refs/heads/master" is already packed. In that case process
A will not see it as missing, but rather will report
whatever value happened to be in the packed-refs file before
process B repacked (which might be an arbitrarily old
value).

We can fix this by making sure we reload the packed-refs
file from disk after looking at any loose refs. That's
unacceptably slow, so we can check its stat()-validity as a
proxy, and read it only when it appears to have changed.

Reading the packed-refs file after performing any loose-ref
system calls is sufficient because we know the ordering of
the pack-refs process: it always makes sure the newly
written packed-refs file is installed into place before
pruning any loose refs. As long as those operations by B
appear in their executed order to process A, by the time A
sees the missing loose ref, the new packed-refs file must be
in place.
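
The implementation boils down to a stat-validity check at the top of
get_packed_ref_cache() (condensed; the full version, including the
submodule case, is in the patch below):

    /* Throw away the cached copy if packed-refs changed on disk ... */
    if (refs->packed &&
        !stat_validity_check(&refs->packed->validity, packed_refs_file))
        clear_packed_ref_cache(refs);

    /* ... so the cache is rebuilt from the current file, recording its
     * stat data via stat_validity_update() for the next check. */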

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index 6d48b42..d53a172 100644
--- a/refs.c
+++ b/refs.c
@@ -825,6 +825,9 @@ struct packed_ref_cache {
 	 * when it is unlocked.
 	 */
 	struct lock_file *lock;
+
+	/* The metadata from when this packed-refs cache was read */
+	struct stat_validity validity;
 };
 
 /*
@@ -859,6 +862,7 @@ static int release_packed_ref_cache(struct packed_ref_cache *packed_refs)
 {
 	if (!--packed_refs->referrers) {
 		free_ref_entry(packed_refs->root);
+		stat_validity_clear(&packed_refs->validity);
 		free(packed_refs);
 		return 1;
 	} else {
@@ -1050,19 +1054,26 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
  */
 static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
 {
+	const char *packed_refs_file;
+
+	if (*refs->name)
+		packed_refs_file = git_path_submodule(refs->name, "packed-refs");
+	else
+		packed_refs_file = git_path("packed-refs");
+
+	if (refs->packed &&
+	    !stat_validity_check(&refs->packed->validity, packed_refs_file))
+		clear_packed_ref_cache(refs);
+
 	if (!refs->packed) {
-		const char *packed_refs_file;
 		FILE *f;
 
 		refs->packed = xcalloc(1, sizeof(*refs->packed));
 		acquire_packed_ref_cache(refs->packed);
 		refs->packed->root = create_dir_entry(refs, "", 0, 0);
-		if (*refs->name)
-			packed_refs_file = git_path_submodule(refs->name, "packed-refs");
-		else
-			packed_refs_file = git_path("packed-refs");
 		f = fopen(packed_refs_file, "r");
 		if (f) {
+			stat_validity_update(&refs->packed->validity, fileno(f));
 			read_packed_refs(f, get_ref_dir(refs->packed->root));
 			fclose(f);
 		}
-- 
1.8.3.1

* [PATCH v2 11/12] for_each_ref: load all loose refs before packed refs
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (9 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 10/12] get_packed_ref_cache: reload packed-refs file when it changes Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19  7:51 ` [PATCH v2 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
  2013-06-19 18:56 ` [PATCH v2 00/12] Fix some reference-related races Jeff King
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

From: Jeff King <peff@peff.net>

If we are iterating through the refs using for_each_ref (or
any of its sister functions), we can get into a race
condition with a simultaneous "pack-refs --prune" that looks
like this:

  0. We have a large number of loose refs, and a few packed
     refs. refs/heads/z/foo is loose, with no matching entry
     in the packed-refs file.

  1. Process A starts iterating through the refs. It loads
     the packed-refs file from disk, then starts lazily
     traversing through the loose ref directories.

  2. Process B, running "pack-refs --prune", writes out the
     new packed-refs file. It then deletes the newly packed
     refs, including refs/heads/z/foo.

  3. Meanwhile, process A has finally gotten to
     refs/heads/z (it traverses alphabetically). It
     descends, but finds nothing there.  It checks its
     cached view of the packed-refs file, but it does not
     mention anything in "refs/heads/z/" at all (it predates
     the new file written by B in step 2).

The traversal completes successfully without mentioning
refs/heads/z/foo at all (the name, of course, isn't
important; but the more refs you have and the farther down
the alphabetical list a ref is, the more likely it is to hit
the race). If refs/heads/z/foo did exist in the packed refs
file at state 0, we would see an entry for it, but it would
show whatever sha1 the ref had the last time it was packed
(which could be an arbitrarily long time ago).

This can be especially dangerous when process A is "git
prune", as it means our set of reachable tips will be
incomplete, and we may erroneously prune objects reachable
from that tip (the same thing can happen if "repack -ad" is
used, as it simply drops unreachable objects that are
packed).

This patch solves it by loading all of the loose refs for
our traversal into our in-memory cache, and then refreshing
the packed-refs cache. Because a pack-refs writer will
always put the new packed-refs file into place before
starting the prune, we know that any loose refs we fail to
see will either truly be missing, or will have already been
put in the packed-refs file by the time we refresh.
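
The resulting ordering inside do_for_each_entry() is, schematically
(simplified from the patch below):

    /* 1. Read *all* relevant loose refs into the in-memory cache. */
    loose_dir = get_loose_refs(refs);
    if (base && *base)
        loose_dir = find_containing_dir(loose_dir, base, 0);
    if (loose_dir)
        prime_ref_dir(loose_dir);

    /* 2. Only now consult packed-refs; get_packed_ref_cache() re-reads
     * the file if a simultaneous pack-refs has replaced it. */
    packed_ref_cache = get_packed_ref_cache(refs);
    acquire_packed_ref_cache(packed_ref_cache);
    packed_dir = get_packed_ref_dir(packed_ref_cache);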

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index d53a172..350054c 100644
--- a/refs.c
+++ b/refs.c
@@ -750,6 +750,21 @@ static int do_for_each_entry_in_dirs(struct ref_dir *dir1,
 }
 
 /*
+ * Load all of the refs from the dir into our in-memory cache. The hard work
+ * of loading loose refs is done by get_ref_dir(), so we just need to recurse
+ * through all of the sub-directories. We do not even need to care about
+ * sorting, as traversal order does not matter to us.
+ */
+static void prime_ref_dir(struct ref_dir *dir)
+{
+	int i;
+	for (i = 0; i < dir->nr; i++) {
+		struct ref_entry *entry = dir->entries[i];
+		if (entry->flag & REF_DIR)
+			prime_ref_dir(get_ref_dir(entry));
+	}
+}
+/*
  * Return true iff refname1 and refname2 conflict with each other.
  * Two reference names conflict if one of them exactly matches the
  * leading components of the other; e.g., "foo/bar" conflicts with
@@ -1600,15 +1615,31 @@ void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
 static int do_for_each_entry(struct ref_cache *refs, const char *base,
 			     each_ref_entry_fn fn, void *cb_data)
 {
-	struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs);
-	struct ref_dir *packed_dir = get_packed_ref_dir(packed_ref_cache);
-	struct ref_dir *loose_dir = get_loose_refs(refs);
+	struct packed_ref_cache *packed_ref_cache;
+	struct ref_dir *loose_dir;
+	struct ref_dir *packed_dir;
 	int retval = 0;
 
+	/*
+	 * We must make sure that all loose refs are read before accessing the
+	 * packed-refs file; this avoids a race condition in which loose refs
+	 * are migrated to the packed-refs file by a simultaneous process, but
+	 * our in-memory view is from before the migration. get_packed_ref_cache()
+	 * takes care of making sure our view is up to date with what is on
+	 * disk.
+	 */
+	loose_dir = get_loose_refs(refs);
+	if (base && *base) {
+		loose_dir = find_containing_dir(loose_dir, base, 0);
+	}
+	if (loose_dir)
+		prime_ref_dir(loose_dir);
+
+	packed_ref_cache = get_packed_ref_cache(refs);
 	acquire_packed_ref_cache(packed_ref_cache);
+	packed_dir = get_packed_ref_dir(packed_ref_cache);
 	if (base && *base) {
 		packed_dir = find_containing_dir(packed_dir, base, 0);
-		loose_dir = find_containing_dir(loose_dir, base, 0);
 	}
 
 	if (packed_dir && loose_dir) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 12/12] refs: do not invalidate the packed-refs cache unnecessarily
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (10 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
@ 2013-06-19  7:51 ` Michael Haggerty
  2013-06-19 18:56 ` [PATCH v2 00/12] Fix some reference-related races Jeff King
  12 siblings, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-19  7:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, Ramsay Jones, git, Michael Haggerty

Now that we keep track of the packed-refs file metadata, we can detect
when the packed-refs file has been modified since we last read it, and
we do so automatically every time that get_packed_ref_cache() is
called.  So there is no need to invalidate the cache unconditionally
when lock_packed_refs() is called; usually the old copy will still be
valid.
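
For reference, the freshness check that makes this safe (added earlier
in the series) amounts to roughly the following sketch; the submodule
handling is omitted and read_packed_refs_from() is a made-up name for
the existing lazy-read logic:

    static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
    {
            const char *packed_refs_file = git_path("packed-refs");

            /* If packed-refs changed on disk, drop the stale cache... */
            if (refs->packed &&
                !stat_validity_check(&refs->packed->validity, packed_refs_file))
                    clear_packed_ref_cache(refs);

            /* ...and re-read the file lazily on the next access. */
            if (!refs->packed)
                    refs->packed = read_packed_refs_from(packed_refs_file);

            return refs->packed;
    }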

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index 350054c..4b989b0 100644
--- a/refs.c
+++ b/refs.c
@@ -2133,11 +2133,14 @@ int lock_packed_refs(struct lock_file *lock, int flags)
 {
 	struct packed_ref_cache *packed_ref_cache;
 
-	/* Discard the old cache because it might be invalid: */
-	clear_packed_ref_cache(&ref_cache);
 	if (hold_lock_file_for_update(lock, git_path("packed-refs"), flags) < 0)
 		return -1;
-	/* Read the current packed-refs while holding the lock: */
+	/*
+	 * Get the current packed-refs while holding the lock.  If the
+	 * packed-refs file has been modified since we last read it,
+	 * this will automatically invalidate the cache and re-read
+	 * the packed-refs file.
+	 */
 	packed_ref_cache = get_packed_ref_cache(&ref_cache);
 	packed_ref_cache->lock = lock;
 	/* Increment the reference count to prevent it from being freed: */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 00/12] Fix some reference-related races
  2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
                   ` (11 preceding siblings ...)
  2013-06-19  7:51 ` [PATCH v2 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
@ 2013-06-19 18:56 ` Jeff King
  2013-06-20  9:09   ` Michael Haggerty
  12 siblings, 1 reply; 24+ messages in thread
From: Jeff King @ 2013-06-19 18:56 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On Wed, Jun 19, 2013 at 09:51:21AM +0200, Michael Haggerty wrote:

> Re-roll of mh/ref-races.  Thanks to Peff, Junio, and Ramsay for
> reviewing v1.

Thanks. I just read through them again. Everything looks good to me.

Patches 10 and 11 are missing my signoff, but obviously:

  Signed-off-by: Jeff King <peff@peff.net>

> The last patch is still optional--it avoids a little bit of work when
> rewriting the packed-refs file, but relies on the stat-based freshness
> check not giving a false negative.

I don't have a real problem with it, but given the cygwin confusions
that Ramsay mentioned, maybe it is better to hold back on it for now? It
sounds like the cygwin problems go the other way (false positives
instead of false negatives).

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-19  7:51 ` [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
@ 2013-06-19 19:18   ` Junio C Hamano
  2013-06-20  7:49     ` Michael Haggerty
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2013-06-19 19:18 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Ramsay Jones, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Handle simple transactions for the packed-refs file at the
> packed_ref_cache level via new functions lock_packed_refs(),
> commit_packed_refs(), and rollback_packed_refs().
>
> Only allow the packed ref cache to be modified (via add_packed_ref())
> while the packed refs file is locked.
>
> Change clone to add the new references within a transaction.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  builtin/clone.c |  7 ++++-
>  refs.c          | 83 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>  refs.h          | 27 +++++++++++++++++--
>  3 files changed, 98 insertions(+), 19 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 66bff57..b0c000a 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -489,17 +489,22 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
>  	return local_refs;
>  }
>  
> +static struct lock_file packed_refs_lock;
> +
>  static void write_remote_refs(const struct ref *local_refs)
>  {
>  	const struct ref *r;
>  
> +	lock_packed_refs(&packed_refs_lock, LOCK_DIE_ON_ERROR);
> +
>  	for (r = local_refs; r; r = r->next) {
>  		if (!r->peer_ref)
>  			continue;
>  		add_packed_ref(r->peer_ref->name, r->old_sha1);
>  	}
>  
> -	pack_refs(PACK_REFS_ALL);
> +	if (commit_packed_refs())
> +		die_errno("unable to overwrite old ref-pack file");
>  }

The calling convention used here looks somewhat strange.  You allow
callers to specify which lock-file structure is used when locking,
but when you are done, commit_packed_refs() does not take any
parameter.

lock_packed_refs() makes the singleton in-core packed-ref-cache
aware of which lock it is under, so commit_packed_refs() does not
need to be told (the singleton already knows what lockfile is in
effect).  So I am not saying the code is broken, just that the
calling convention looks strange.

Does the caller even need to have access to this lock_file
instance?

>  static void write_followtags(const struct ref *refs, const char *msg)
> diff --git a/refs.c b/refs.c
> index 373d95b..ad73251 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -808,6 +808,13 @@ static int is_refname_available(const char *refname, const char *oldrefname,
>  
>  struct packed_ref_cache {
>  	struct ref_entry *root;
> +
> +	/*
> +	 * Iff the packed-refs file associated with this instance is
> +	 * currently locked for writing, this points at the associated
> +	 * lock (which is owned by somebody else).
> +	 */
> +	struct lock_file *lock;
>  };
>  
>  /*
> @@ -829,6 +836,8 @@ static struct ref_cache {
>  static void clear_packed_ref_cache(struct ref_cache *refs)
>  {
>  	if (refs->packed) {
> +		if (refs->packed->lock)
> +			die("internal error: packed-ref cache cleared while locked");
>  		free_ref_entry(refs->packed->root);
>  		free(refs->packed);
>  		refs->packed = NULL;
> @@ -1038,7 +1047,12 @@ static struct ref_dir *get_packed_refs(struct ref_cache *refs)
>  
>  void add_packed_ref(const char *refname, const unsigned char *sha1)
>  {
> -	add_ref(get_packed_refs(&ref_cache),
> +	struct packed_ref_cache *packed_ref_cache =
> +		get_packed_ref_cache(&ref_cache);
> +
> +	if (!packed_ref_cache->lock)
> +		die("internal error: packed refs not locked");
> +	add_ref(get_packed_ref_dir(packed_ref_cache),
>  		create_ref_entry(refname, sha1, REF_ISPACKED, 1));
>  }
>  
> @@ -2035,6 +2049,52 @@ static int write_packed_entry_fn(struct ref_entry *entry, void *cb_data)
>  	return 0;
>  }
>  
> +int lock_packed_refs(struct lock_file *lock, int flags)
> +{
> +	struct packed_ref_cache *packed_ref_cache;
> +
> +	/* Discard the old cache because it might be invalid: */
> +	clear_packed_ref_cache(&ref_cache);
> +	if (hold_lock_file_for_update(lock, git_path("packed-refs"), flags) < 0)
> +		return -1;
> +	/* Read the current packed-refs while holding the lock: */
> +	packed_ref_cache = get_packed_ref_cache(&ref_cache);
> +	packed_ref_cache->lock = lock;
> +	return 0;
> +}
> +
> +int commit_packed_refs(void)
> +{
> +	struct packed_ref_cache *packed_ref_cache =
> +		get_packed_ref_cache(&ref_cache);
> +	int error = 0;
> +
> +	if (!packed_ref_cache->lock)
> +		die("internal error: packed-refs not locked");
> +	write_or_die(packed_ref_cache->lock->fd,
> +		     PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
> +
> +	do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
> +				 0, write_packed_entry_fn,
> +				 &packed_ref_cache->lock->fd);
> +	if (commit_lock_file(packed_ref_cache->lock))
> +		error = -1;
> +	packed_ref_cache->lock = NULL;
> +	return error;
> +}
> +
> +void rollback_packed_refs(void)
> +{
> +	struct packed_ref_cache *packed_ref_cache =
> +		get_packed_ref_cache(&ref_cache);
> +
> +	if (!packed_ref_cache->lock)
> +		die("internal error: packed-refs not locked");
> +	rollback_lock_file(packed_ref_cache->lock);
> +	packed_ref_cache->lock = NULL;
> +	clear_packed_ref_cache(&ref_cache);
> +}
> +
>  struct ref_to_prune {
>  	struct ref_to_prune *next;
>  	unsigned char sha1[20];
> @@ -2153,23 +2213,19 @@ static struct lock_file packlock;
>  int pack_refs(unsigned int flags)
>  {
>  	struct pack_refs_cb_data cbdata;
> -	int fd;
>  
>  	memset(&cbdata, 0, sizeof(cbdata));
>  	cbdata.flags = flags;
>  
> -	fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"),
> -				       LOCK_DIE_ON_ERROR);
> +	lock_packed_refs(&packlock, LOCK_DIE_ON_ERROR);
>  	cbdata.packed_refs = get_packed_refs(&ref_cache);
>  
>  	do_for_each_entry_in_dir(get_loose_refs(&ref_cache), 0,
>  				 pack_if_possible_fn, &cbdata);
>  
> -	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
> -	do_for_each_entry_in_dir(cbdata.packed_refs, 0, write_packed_entry_fn, &fd);
> -
> -	if (commit_lock_file(&packlock) < 0)
> +	if (commit_packed_refs())
>  		die_errno("unable to overwrite old ref-pack file");
> +
>  	prune_refs(cbdata.ref_to_prune);
>  	return 0;
>  }
> @@ -2233,7 +2289,6 @@ static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data)
>  
>  static int repack_without_ref(const char *refname)
>  {
> -	int fd;
>  	struct ref_dir *packed;
>  	struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
>  	struct string_list_item *ref_to_delete;
> @@ -2241,12 +2296,10 @@ static int repack_without_ref(const char *refname)
>  	if (!get_packed_ref(refname))
>  		return 0; /* refname does not exist in packed refs */
>  
> -	fd = hold_lock_file_for_update(&packlock, git_path("packed-refs"), 0);
> -	if (fd < 0) {
> +	if (lock_packed_refs(&packlock, 0)) {
>  		unable_to_lock_error(git_path("packed-refs"), errno);
>  		return error("cannot delete '%s' from packed refs", refname);
>  	}
> -	clear_packed_ref_cache(&ref_cache);
>  	packed = get_packed_refs(&ref_cache);
>  
>  	/* Remove refname from the cache: */
> @@ -2255,7 +2308,7 @@ static int repack_without_ref(const char *refname)
>  		 * The packed entry disappeared while we were
>  		 * acquiring the lock.
>  		 */
> -		rollback_lock_file(&packlock);
> +		rollback_packed_refs();
>  		return 0;
>  	}
>  
> @@ -2267,9 +2320,7 @@ static int repack_without_ref(const char *refname)
>  	}
>  
>  	/* Write what remains: */
> -	write_or_die(fd, PACKED_REFS_HEADER, strlen(PACKED_REFS_HEADER));
> -	do_for_each_entry_in_dir(packed, 0, write_packed_entry_fn, &fd);
> -	return commit_lock_file(&packlock);
> +	return commit_packed_refs();
>  }
>  
>  int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
> diff --git a/refs.h b/refs.h
> index 246bf60..b275124 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -77,12 +77,35 @@ extern int for_each_rawref(each_ref_fn, void *);
>  extern void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname);
>  
>  /*
> - * Add a reference to the in-memory packed reference cache.  To actually
> - * write the reference to the packed-refs file, call pack_refs().
> + * Lock the packed-refs file for writing using the specified lock_file
> + * instance.  Flags is passed to hold_lock_file_for_update().  Return
> + * 0 on success.
> + */
> +extern int lock_packed_refs(struct lock_file *lock, int flags);
> +
> +/*
> + * Add a reference to the in-memory packed reference cache.  This may
> + * only be called while the packed-refs file is locked (see
> + * lock_packed_refs()).  To actually write the packed-refs file, call
> + * commit_packed_refs().
>   */
>  extern void add_packed_ref(const char *refname, const unsigned char *sha1);
>  
>  /*
> + * Write the current version of the packed refs cache from memory to
> + * disk.  The packed-refs file must already be locked for writing (see
> + * lock_packed_refs()).  Return zero on success.
> + */
> +extern int commit_packed_refs(void);
> +
> +/*
> + * Rollback the lockfile for the packed-refs file, and discard the
> + * in-memory packed reference cache.  (The packed-refs file will be
> + * read anew if it is needed again after this function is called.)
> + */
> +extern void rollback_packed_refs(void);
> +
> +/*
>   * Flags for controlling behaviour of pack_refs()
>   * PACK_REFS_PRUNE: Prune loose refs after packing
>   * PACK_REFS_ALL:   Pack _all_ refs, not just tags and already packed refs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-19 19:18   ` Junio C Hamano
@ 2013-06-20  7:49     ` Michael Haggerty
  2013-06-20 11:55       ` Jeff King
  2013-06-20 17:11       ` Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Michael Haggerty @ 2013-06-20  7:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Johan Herland, Ramsay Jones, git

On 06/19/2013 09:18 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Handle simple transactions for the packed-refs file at the
>> packed_ref_cache level via new functions lock_packed_refs(),
>> commit_packed_refs(), and rollback_packed_refs().
>>
>> Only allow the packed ref cache to be modified (via add_packed_ref())
>> while the packed refs file is locked.
>>
>> Change clone to add the new references within a transaction.
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>  builtin/clone.c |  7 ++++-
>>  refs.c          | 83 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>>  refs.h          | 27 +++++++++++++++++--
>>  3 files changed, 98 insertions(+), 19 deletions(-)
>>
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index 66bff57..b0c000a 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -489,17 +489,22 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
>>  	return local_refs;
>>  }
>>  
>> +static struct lock_file packed_refs_lock;
>> +
>>  static void write_remote_refs(const struct ref *local_refs)
>>  {
>>  	const struct ref *r;
>>  
>> +	lock_packed_refs(&packed_refs_lock, LOCK_DIE_ON_ERROR);
>> +
>>  	for (r = local_refs; r; r = r->next) {
>>  		if (!r->peer_ref)
>>  			continue;
>>  		add_packed_ref(r->peer_ref->name, r->old_sha1);
>>  	}
>>  
>> -	pack_refs(PACK_REFS_ALL);
>> +	if (commit_packed_refs())
>> +		die_errno("unable to overwrite old ref-pack file");
>>  }
> 
> The calling convention used here looks somewhat strange.  You allow
> callers to specify which lock-file structure is used when locking,
> but when you are done, commit_packed_refs() does not take any
> parameter.
> 
> lock_packed_refs() makes the singleton in-core packed-ref-cache
> aware of which lock it is under, so commit_packed_refs() does not
> need to be told (the singleton already knows what lockfile is in
> effect).  So I am not saying the code is broken, just that the
> calling convention looks strange.
> 
> Does the caller even need to have access to this lock_file
> instance?

No it doesn't.  My reasoning was as follows:

lock_file instances can never be freed, because they are added to a
linked list that the atexit handler walks, but never removed from it.
Therefore they have to be allocated statically (or allocated
dynamically and then leaked).

[I just noticed that lock_ref_sha1_basic() leaks a struct lock_file
every time that it is called.]

I wanted to leave the way open to allow other packed refs caches to be
locked.  But since all packed refs caches are allocated dynamically, the
lock_file instance cannot be part of struct packed_ref_cache.  So I
thought that the easiest approach was to rely on the caller, who
presumably knows that only a finite number of locks are in use at
once, to supply a usable lock_file instance.

But currently only the main packed ref cache can be locked, so it would
be possible for lock_packed_refs() to use the static packlock instance
for locking.  I will change it to do so.

If/when we add support for locking other packed ref caches, we can
change the API to use a caller-supplied lock_file.  Or even better,
implement a way to remove a lock_file instance from the atexit list;
then lock_packed_refs() could use dynamically-allocated lock_file
instances without having to leak them.
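
For context, the constraint comes from the cleanup list in lockfile.c,
which (heavily simplified; signal handling and per-process ownership
checks omitted) works roughly like this:

    static struct lock_file *lock_file_list;  /* walked by the atexit handler */

    static void remove_lock_file(void)
    {
            while (lock_file_list) {
                    if (lock_file_list->filename[0])
                            unlink_or_warn(lock_file_list->filename);
                    lock_file_list = lock_file_list->next;
            }
    }

    /*
     * hold_lock_file_for_update() pushes the struct onto lock_file_list
     * and nothing ever removes it, so the struct must stay valid until
     * exit; an automatic variable would leave a dangling pointer behind.
     */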

v3 to come.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 00/12] Fix some reference-related races
  2013-06-19 18:56 ` [PATCH v2 00/12] Fix some reference-related races Jeff King
@ 2013-06-20  9:09   ` Michael Haggerty
  2013-06-20 11:52     ` Jeff King
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2013-06-20  9:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On 06/19/2013 08:56 PM, Jeff King wrote:
> On Wed, Jun 19, 2013 at 09:51:21AM +0200, Michael Haggerty wrote:
> 
>> Re-roll of mh/ref-races.  Thanks to Peff, Junio, and Ramsay for
>> reviewing v1.
> 
> Thanks. I just read through them again. Everything looks good to me.
> 
> Patches 10 and 11 are missing my signoff, but obviously:
> 
>   Signed-off-by: Jeff King <peff@peff.net>
> 
>> The last patch is still optional--it avoids a little bit of work when
>> rewriting the packed-refs file, but relies on the stat-based freshness
>> check not giving a false negative.
> 
> I don't have a real problem with it, but given the cygwin confusions
> that Ramsay mentioned, maybe it is better to hold back on it for now? It
> sounds like the cygwin problems go the other way (false positives
> instead of false negatives).

I just realized that there is another long-standing problem in
loose/packed refs handling:

Suppose there is a loose reference for which an old version is still
present in the packed-refs file.  If the reference is deleted:

1. delete_ref() gets a lock on the loose reference file

2. delete_ref() deletes the loose reference file while retaining the lock

3. delete_ref() calls repack_without_ref()

4. repack_without_ref() tries to acquire a lock on the packed-refs file
to delete the packed version of the reference.

If the packed-refs file is already locked by another process (which is
entirely possible, since only one attempt is made to acquire the
lock), then repack_without_ref() emits an error and returns
with an error code.  delete_ref() passes the error along, but doesn't
restore the loose ref.

The net result is that an old version of the ref (namely the one in the
packed-refs file) has been revived.  That version might even point at an
object that has already been garbage collected.

This problem is similar to race conditions that have been discussed
earlier, but it has nothing to do with the freshness/staleness of the
packed-refs file WRT the loose reference; it only depends on the
packed-refs file being locked by another process at an inopportune time.

I think this problem would also be fixed by the locking scheme that I
proposed earlier [1]: to acquire and hold the packed-refs lock across
the modification of *both* files, and to rewrite the packed-refs file
*before* deleting the loose-refs file (because rewriting the packed-refs
file without the to-be-deleted reference is a logical NOP).
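
In pseudo-C, with made-up helper names, the ordering I have in mind
would look something like this:

    static int delete_ref_under_packed_lock(const char *refname)
    {
            /* Take the packed-refs lock and hold it across both steps. */
            if (lock_packed_refs(&packlock, 0))
                    return error("cannot lock packed-refs file");

            /*
             * Rewriting packed-refs without refname is a logical NOP as
             * long as the loose ref still exists, so do it first.
             */
            if (commit_packed_refs_without(refname))  /* made-up helper */
                    return -1;

            /*
             * Only now remove the loose ref.  If this step fails, nothing
             * is lost: the loose ref still holds the current value.
             */
            return delete_loose_ref(refname);         /* made-up helper */
    }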

Michael

[1] http://article.gmane.org/gmane.comp.version-control.git/212131

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 00/12] Fix some reference-related races
  2013-06-20  9:09   ` Michael Haggerty
@ 2013-06-20 11:52     ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2013-06-20 11:52 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On Thu, Jun 20, 2013 at 11:09:49AM +0200, Michael Haggerty wrote:

> If the packed-refs file is already locked by another process (which is
> entirely possible, since only one attempt is made to acquire the
> lock), then repack_without_ref() emits an error and returns
> with an error code.  delete_ref() passes the error along, but doesn't
> restore the loose ref.
>
> [...]
>
> I think this problem would also be fixed by the locking scheme that I
> proposed earlier [1]: to acquire and hold the packed-refs lock across
> the modification of *both* files, and to rewrite the packed-refs file
> *before* deleting the loose-refs file (because rewriting the packed-refs
> file without the to-be-deleted reference is a logical NOP).

Yeah, I agree. You could also "roll back" the loose deletion, but I'd
rather just try to do it atomically.

I don't think this increases lock contention, since delete_ref would
need to lock the packed-refs file anyway. However, there is the related
change that we should probably lock the packed-refs file before checking
"is this ref in the packed-refs file?" in repack_without_ref. Which does
increase lock contention, but is more correct.

We should also consider deadlock issues. If the order is always "acquire
packed-refs lock, then acquire loose locks", we are fine. If this does
loose-then-packed, we are also fine with the current code, as
git-pack-refs does not prune the loose refs while under the packed-refs
lock. But I seem to recall discussion of pruning them under the
packed-refs lock, which would deadlock if repack_without_ref does
loose-then-packed.

But I guess we do not actually block on locks, but rather just die (and
release our locks), so deadlock is not actually a risk for us.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20  7:49     ` Michael Haggerty
@ 2013-06-20 11:55       ` Jeff King
  2013-06-20 18:03         ` Michael Haggerty
  2013-06-20 17:11       ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Jeff King @ 2013-06-20 11:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On Thu, Jun 20, 2013 at 09:49:51AM +0200, Michael Haggerty wrote:

> [I just noticed that lock_ref_sha1_basic() leaks a struct lock_file
> every time that it is called.]

I noticed that recently, too. I have a patch series about 90% complete
that abstracts the tempfile handling (the ultimate goal of which is to
optionally clean up tmp_* files in the objects/ directory). It refactors
the lockfile cleanup, and it would not be too hard to have a committed
or rolled-back lockfile actually remove itself from the "to clean at
exit" list.

Which would make it perfectly safe to have a lockfile as an automatic
variable as long as you commit or rollback before leaving the function.
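
E.g., after that change something like the following would be fine,
whereas today it would leave a dangling pointer on the cleanup list:

    static int update_some_file(const char *path)
    {
            struct lock_file lk;          /* automatic variable */
            int fd;

            memset(&lk, 0, sizeof(lk));
            fd = hold_lock_file_for_update(&lk, path, 0);
            if (fd < 0)
                    return -1;
            write_or_die(fd, "new contents\n", 13);

            /* committing (or rolling back) drops &lk from the cleanup list */
            return commit_lock_file(&lk);
    }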

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20  7:49     ` Michael Haggerty
  2013-06-20 11:55       ` Jeff King
@ 2013-06-20 17:11       ` Junio C Hamano
  2013-06-20 17:58         ` Michael Haggerty
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2013-06-20 17:11 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Ramsay Jones, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> But currently only the main packed ref cache can be locked, so it would
> be possible for lock_packed_refs() to use the static packlock instance
> for locking.

Perhaps I am missing something from the previous discussions, but I
am having trouble understanding the "main packed ref cache" part of
the above.  "main" as opposed to...?  Is it envisioned that later
somebody can lock one subpart while another can lock a different and
non-overlapping subpart, to make changes independently, and somehow
their non-overlapping changes will be consolidated into a single
consistent result?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20 17:11       ` Junio C Hamano
@ 2013-06-20 17:58         ` Michael Haggerty
  2013-06-20 18:36           ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2013-06-20 17:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Johan Herland, Ramsay Jones, git

On 06/20/2013 07:11 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> But currently only the main packed ref cache can be locked, so it would
>> be possible for lock_packed_refs() to use the static packlock instance
>> for locking.
> 
> Perhaps I am missing something from the previous discussions, but I
> am having trouble understanding the "main packed ref cache" part of
> the above.  "main" as opposed to...?

"main" as opposed to "submodule".

> Is it envisioned that later
> somebody can lock one subpart while another can lock a different and
> non-overlapping subpart, to make changes independently, and somehow
> their non-overlapping changes will be consolidated into a single
> consistent result?

No, the scenario would be that a git process wants to change a reference
in a submodule directly, as opposed to starting another git process
within the submodule, as I believe is done now.  Maybe it's too
far-fetched even to consider...

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20 11:55       ` Jeff King
@ 2013-06-20 18:03         ` Michael Haggerty
  2013-06-20 19:55           ` Jeff King
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2013-06-20 18:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On 06/20/2013 01:55 PM, Jeff King wrote:
> On Thu, Jun 20, 2013 at 09:49:51AM +0200, Michael Haggerty wrote:
> 
>> [I just noticed that lock_ref_sha1_basic() leaks a struct lock_file
>> every time that it is called.]
> 
> I noticed that recently, too. I have a patch series about 90% complete
> that abstracts the tempfile handling (the ultimate goal of which is to
> optionally clean up tmp_* files in the objects/ directory). It refactors
> the lockfile cleanup, and it would not be too hard to have a committed
> or rolled-back lockfile actually remove itself from the "to clean at
> exit" list.
> 
> Which would make it perfectly safe to have a lockfile as an automatic
> variable as long as you commit or rollback before leaving the function.

Cool, then I won't work on that.  You might also have to make the
lockfile list into a doubly-linked list to avoid having to do a linear
scan to find the entry to delete, unless the total number of entries is
known to remain small.
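
I.e., something like this (sketch only, assuming a prev pointer gets
added to struct lock_file alongside the existing next pointer):

    static void remove_from_lock_file_list(struct lock_file *lk)
    {
            /* O(1) unlink; no scan of the global lock_file_list needed */
            if (lk->prev)
                    lk->prev->next = lk->next;
            else
                    lock_file_list = lk->next;
            if (lk->next)
                    lk->next->prev = lk->prev;
            lk->prev = lk->next = NULL;
    }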

Please CC me on the patch series when it is done.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20 17:58         ` Michael Haggerty
@ 2013-06-20 18:36           ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2013-06-20 18:36 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Ramsay Jones, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 06/20/2013 07:11 PM, Junio C Hamano wrote:
>
>> Perhaps I am missing something from the previous discussions, but I
>> am having trouble understanding the "main packed ref cache" part of
>> the above.  "main" as opposed to...?
>
> "main" as opposed to "submodule".

I see.

> No, the scenario would be that a git process wants to change a reference
> in a submodule directly, as opposed to starting another git process
> within the submodule, as I believe is done now.  Maybe it's too
> far-fetched even to consider...

Perhaps.  But the "singleton lock because we only handle main packed
ref cache" does not prevent us from doing so later, so I think v3 is
OK.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file
  2013-06-20 18:03         ` Michael Haggerty
@ 2013-06-20 19:55           ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2013-06-20 19:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Johan Herland, Ramsay Jones, git

On Thu, Jun 20, 2013 at 08:03:43PM +0200, Michael Haggerty wrote:

> > I noticed that recently, too. I have a patch series about 90% complete
> > that abstracts the tempfile handling (the ultimate goal of which is to
> > optionally clean up tmp_* files in the objects/ directory). It refactors
> > the lockfile cleanup, and it would not be too hard to have a committed
> > or rolled-back lockfile actually remove itself from the "to clean at
> > exit" list.
> > 
> > Which would make it perfectly safe to have a lockfile as an automatic
> > variable as long as you commit or rollback before leaving the function.
> 
> Cool, then I won't work on that.  You might also have to make the
> lockfile list into a doubly-linked-list to avoid having to do a linear
> scan to find the entry to delete, unless the total number of entries is
> known to remain small.

Yes, I noticed that potential issue, but I don't think it is worth
worrying about. We typically only take one lock at a time, or a handful
of tempfiles (e.g., one object at a time, or two files for diff).

And once it's abstracted out, it would be easy to handle later.

The part I am a little stuck on is plugging it into
pack-objects/index-pack. Their output handling is a little convoluted
because they may be writing to stdout, to a tempfile, to a named file,
or even appending to an existing file in the case of index-pack
--fix-thin. I don't think it's unmanageable, but I need to spend some
more time on the refactoring.

> Please CC me on the patch series when it is done.

Will do.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2013-06-20 19:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-19  7:51 [PATCH v2 00/12] Fix some reference-related races Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
2013-06-19 19:18   ` Junio C Hamano
2013-06-20  7:49     ` Michael Haggerty
2013-06-20 11:55       ` Jeff King
2013-06-20 18:03         ` Michael Haggerty
2013-06-20 19:55           ` Jeff King
2013-06-20 17:11       ` Junio C Hamano
2013-06-20 17:58         ` Michael Haggerty
2013-06-20 18:36           ` Junio C Hamano
2013-06-19  7:51 ` [PATCH v2 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 09/12] add a stat_validity struct Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 10/12] get_packed_ref_cache: reload packed-refs file when it changes Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
2013-06-19  7:51 ` [PATCH v2 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
2013-06-19 18:56 ` [PATCH v2 00/12] Fix some reference-related races Jeff King
2013-06-20  9:09   ` Michael Haggerty
2013-06-20 11:52     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).