git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 00/10] Implement transactions for the packed ref store
@ 2017-08-29  8:20 Michael Haggerty
  2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
                   ` (11 more replies)
  0 siblings, 12 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

This should be the second-to-last patch series in the quest for
mmapped packed-refs.

Before this patch series, there is still more coupling than necessary
between `files_ref_store` and `packed_ref_store`:

* `files_ref_store` adds packed references (e.g., during `git clone`
  and `git pack-refs`) by inserting them into the `packed_ref_store`'s
  internal `ref_cache`, then calling `commit_packed_refs()`. But
  `files_ref_store` shouldn't even *know* that `packed_ref_store` uses
  a `ref_cache`, let alone muck about in it.

* `files_ref_store` deletes packed references by calling
  `repack_without_refs()`.

Instead, implement reference transactions and `delete_refs()` for
`packed_ref_store`, and change `files_ref_store` to use these standard
methods rather than internal `packed_ref_store` methods.

This patch series finishes up the encapsulation of `packed_ref_store`.
At the end of the series, the outside world only interacts with it via
the refs API plus a couple of locking-related functions. That will
make it easy for the next patch series to change the internal workings
of `packed_ref_store` without affecting `files_ref_store`.

Along the way, we fix some longstanding problems with reference
updates. Quoting from patch [08/10]:

    First, the old code didn't obtain the `packed-refs` lock until
    `files_transaction_finish()`. This means that a failure to acquire the
    `packed-refs` lock (e.g., due to contention with another process)
    wasn't detected until it was too late (problems like this are supposed
    to be detected in the "prepare" phase). The new code acquires the
    `packed-refs` lock in `files_transaction_prepare()`, the same stage of
    the processing when the loose reference locks are being acquired,
    removing another reason why the "prepare" phase might succeed and the
    "finish" phase might nevertheless fail.

    Second, the old code deleted the loose version of a reference before
    deleting any packed version of the same reference. This left a moment
    when another process might think that the packed version of the
    reference is current, which is incorrect. (Even worse, the packed
    version of the reference can be arbitrarily old, and might even point
    at an object that has since been garbage-collected.)

    Third, if a reference deletion fails to acquire the `packed-refs` lock
    altogether, then the old code might leave the repository in the
    incorrect state (possibly corrupt) described in the previous
    paragraph.

    Now we activate the new "packed-refs" file (sans any references that
    are being deleted) *before* deleting the corresponding loose
    references. But we hold the "packed-refs" lock until after the loose
    references have been finalized, thus preventing a simultaneous
    "pack-refs" process from packing the loose version of the reference in
    the time gap, which would otherwise defeat our attempt to delete it.

This patch series is also available as branch
`packed-ref-transactions` in my fork on GitHub [1]. A draft of the
patch series to change `packed_ref_store` to use mmap and get rid of
its `ref_cache` is available as branch `mmap-packed-refs` from the
same source.

Michael

[1] https://github.com/mhagger/git

Michael Haggerty (10):
  packed-backend: don't adjust the reference count on lock/unlock
  struct ref_transaction: add a place for backends to store data
  packed_ref_store: implement reference transactions
  packed_delete_refs(): implement method
  files_pack_refs(): use a reference transaction to write packed refs
  files_initial_transaction_commit(): use a transaction for packed refs
  t1404: demonstrate two problems with reference transactions
  files_ref_store: use a transaction to update packed refs
  packed-backend: rip out some now-unused code
  files_transaction_finish(): delete reflogs before references

 refs/files-backend.c         | 200 ++++++++++++++-----
 refs/packed-backend.c        | 457 +++++++++++++++++++++++++++++--------------
 refs/packed-backend.h        |  17 +-
 refs/refs-internal.h         |   1 +
 t/t1404-update-ref-errors.sh |  71 +++++++
 5 files changed, 534 insertions(+), 212 deletions(-)

-- 
2.14.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-09-08  6:52   ` Jeff King
  2017-08-29  8:20 ` [PATCH 02/10] struct ref_transaction: add a place for backends to store data Michael Haggerty
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

The old code incremented the packed ref cache reference count when
acquiring the packed-refs lock, and decremented the count when
releasing the lock. This is unnecessary because a locked packed-refs
file cannot be changed, so there is no reason for the cache to become
stale.

Moreover, the extra reference count causes a problem if we
intentionally clear the packed refs cache, as we sometimes need to do
if we change the cache in anticipation of writing a change to disk,
but then the write to disk fails. In that case, `packed_refs_unlock()`
would have no easy way to find the cache whose reference count it
needs to decrement.

This whole issue will soon become moot due to upcoming changes that
avoid changing the in-memory cache as part of updating the packed-refs
on disk, but this change makes that transition easier.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/packed-backend.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 412c85034f..7e348feac3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -525,7 +525,6 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
 				"packed_refs_lock");
 	static int timeout_configured = 0;
 	static int timeout_value = 1000;
-	struct packed_ref_cache *packed_ref_cache;
 
 	if (!timeout_configured) {
 		git_config_get_int("core.packedrefstimeout", &timeout_value);
@@ -560,9 +559,7 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
 	 */
 	validate_packed_ref_cache(refs);
 
-	packed_ref_cache = get_packed_ref_cache(refs);
-	/* Increment the reference count to prevent it from being freed: */
-	acquire_packed_ref_cache(packed_ref_cache);
+	get_packed_ref_cache(refs);
 	return 0;
 }
 
@@ -576,7 +573,6 @@ void packed_refs_unlock(struct ref_store *ref_store)
 	if (!is_lock_file_locked(&refs->lock))
 		die("BUG: packed_refs_unlock() called when not locked");
 	rollback_lock_file(&refs->lock);
-	release_packed_ref_cache(refs->cache);
 }
 
 int packed_refs_is_locked(struct ref_store *ref_store)
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/10] struct ref_transaction: add a place for backends to store data
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
  2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-09-08  7:02   ` Jeff King
  2017-08-29  8:20 ` [PATCH 03/10] packed_ref_store: implement reference transactions Michael Haggerty
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

`packed_ref_store` is going to want to store some transaction-wide
data, so make a place for it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/refs-internal.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index b02dc5a7e3..d7d344de73 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -242,6 +242,7 @@ struct ref_transaction {
 	size_t alloc;
 	size_t nr;
 	enum ref_transaction_state state;
+	void *backend_data;
 };
 
 /*
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/10] packed_ref_store: implement reference transactions
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
  2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
  2017-08-29  8:20 ` [PATCH 02/10] struct ref_transaction: add a place for backends to store data Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-29  8:20 ` [PATCH 04/10] packed_delete_refs(): implement method Michael Haggerty
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Implement the methods needed to support reference transactions for
the packed-refs backend. The new methods are not yet used.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/packed-backend.c | 313 +++++++++++++++++++++++++++++++++++++++++++++++++-
 refs/packed-backend.h |   9 ++
 2 files changed, 319 insertions(+), 3 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 7e348feac3..d19b3bfba5 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -744,25 +744,332 @@ static int packed_init_db(struct ref_store *ref_store, struct strbuf *err)
 	return 0;
 }
 
+/*
+ * Write the packed-refs from the cache to the packed-refs tempfile,
+ * incorporating any changes from `updates`. `updates` must be a
+ * sorted string list whose keys are the refnames and whose util
+ * values are `struct ref_update *`. On error, rollback the tempfile,
+ * write an error message to `err`, and return a nonzero value.
+ *
+ * The packfile must be locked before calling this function and will
+ * remain locked when it is done.
+ */
+static int write_with_updates(struct packed_ref_store *refs,
+			      struct string_list *updates,
+			      struct strbuf *err)
+{
+	struct ref_iterator *iter = NULL;
+	size_t i;
+	int ok;
+	FILE *out;
+	struct strbuf sb = STRBUF_INIT;
+	char *packed_refs_path;
+
+	if (!is_lock_file_locked(&refs->lock))
+		die("BUG: write_with_updates() called while unlocked");
+
+	/*
+	 * If packed-refs is a symlink, we want to overwrite the
+	 * symlinked-to file, not the symlink itself. Also, put the
+	 * staging file next to it:
+	 */
+	packed_refs_path = get_locked_file_path(&refs->lock);
+	strbuf_addf(&sb, "%s.new", packed_refs_path);
+	free(packed_refs_path);
+	if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
+		strbuf_addf(err, "unable to create file %s: %s",
+			    sb.buf, strerror(errno));
+		strbuf_release(&sb);
+		return -1;
+	}
+	strbuf_release(&sb);
+
+	out = fdopen_tempfile(&refs->tempfile, "w");
+	if (!out) {
+		strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
+			    strerror(errno));
+		goto error;
+	}
+
+	if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0)
+		goto write_error;
+
+	/*
+	 * We iterate in parallel through the current list of refs and
+	 * the list of updates, processing an entry from at least one
+	 * of the lists each time through the loop. When the current
+	 * list of refs is exhausted, set iter to NULL. When the list
+	 * of updates is exhausted, leave i set to updates->nr.
+	 */
+	iter = packed_ref_iterator_begin(&refs->base, "",
+					 DO_FOR_EACH_INCLUDE_BROKEN);
+	if ((ok = ref_iterator_advance(iter)) != ITER_OK)
+		iter = NULL;
+
+	i = 0;
+
+	while (iter || i < updates->nr) {
+		struct ref_update *update = NULL;
+		int cmp;
+
+		if (i >= updates->nr) {
+			cmp = -1;
+		} else {
+			update = updates->items[i].util;
+
+			if (!iter)
+				cmp = +1;
+			else
+				cmp = strcmp(iter->refname, update->refname);
+		}
+
+		if (!cmp) {
+			/*
+			 * There is both an old value and an update
+			 * for this reference. Check the old value if
+			 * necessary:
+			 */
+			if ((update->flags & REF_HAVE_OLD)) {
+				if (is_null_oid(&update->old_oid)) {
+					strbuf_addf(err, "cannot update ref '%s': "
+						    "reference already exists",
+						    update->refname);
+					goto error;
+				} else if (oidcmp(&update->old_oid, iter->oid)) {
+					strbuf_addf(err, "cannot update ref '%s': "
+						    "is at %s but expected %s",
+						    update->refname,
+						    oid_to_hex(iter->oid),
+						    oid_to_hex(&update->old_oid));
+					goto error;
+				}
+			}
+
+			/* Now figure out what to use for the new value: */
+			if ((update->flags & REF_HAVE_NEW)) {
+				/*
+				 * The update takes precedence. Skip
+				 * the iterator over the unneeded
+				 * value.
+				 */
+				if ((ok = ref_iterator_advance(iter)) != ITER_OK)
+					iter = NULL;
+				cmp = +1;
+			} else {
+				/*
+				 * The update doesn't actually want to
+				 * change anything. We're done with it.
+				 */
+				i++;
+				cmp = -1;
+			}
+		} else if (cmp > 0) {
+			/*
+			 * There is no old value but there is an
+			 * update for this reference. Make sure that
+			 * the update didn't expect an existing value:
+			 */
+			if ((update->flags & REF_HAVE_OLD) &&
+			    !is_null_oid(&update->old_oid)) {
+				strbuf_addf(err, "cannot update ref '%s': "
+					    "reference is missing but expected %s",
+					    update->refname,
+					    oid_to_hex(&update->old_oid));
+				goto error;
+			}
+		}
+
+		if (cmp < 0) {
+			/* Pass the old reference through. */
+
+			struct object_id peeled;
+			int peel_error = ref_iterator_peel(iter, &peeled);
+
+			if (write_packed_entry(out, iter->refname,
+					       iter->oid->hash,
+					       peel_error ? NULL : peeled.hash))
+				goto write_error;
+
+			if ((ok = ref_iterator_advance(iter)) != ITER_OK)
+				iter = NULL;
+		} else if (is_null_oid(&update->new_oid)) {
+			/*
+			 * The update wants to delete the reference,
+			 * and the reference either didn't exist or we
+			 * have already skipped it. So we're done with
+			 * the update (and don't have to write
+			 * anything).
+			 */
+			i++;
+		} else {
+			struct object_id peeled;
+			int peel_error = peel_object(update->new_oid.hash,
+						     peeled.hash);
+
+			if (write_packed_entry(out, update->refname,
+					       update->new_oid.hash,
+					       peel_error ? NULL : peeled.hash))
+				goto write_error;
+
+			i++;
+		}
+	}
+
+	if (ok != ITER_DONE) {
+		strbuf_addf(err, "unable to write packed-refs file: "
+			    "error iterating over old contents");
+		goto error;
+	}
+
+	if (close_tempfile(&refs->tempfile)) {
+		strbuf_addf(err, "error closing file %s: %s",
+			    get_tempfile_path(&refs->tempfile),
+			    strerror(errno));
+		strbuf_release(&sb);
+		return -1;
+	}
+
+	return 0;
+
+write_error:
+	strbuf_addf(err, "error writing to %s: %s",
+		    get_tempfile_path(&refs->tempfile), strerror(errno));
+
+error:
+	if (iter)
+		ref_iterator_abort(iter);
+
+	delete_tempfile(&refs->tempfile);
+	return -1;
+}
+
+struct packed_transaction_backend_data {
+	/* True iff the transaction owns the packed-refs lock. */
+	int own_lock;
+
+	struct string_list updates;
+};
+
+static void packed_transaction_cleanup(struct packed_ref_store *refs,
+				       struct ref_transaction *transaction)
+{
+	struct packed_transaction_backend_data *data = transaction->backend_data;
+
+	if (data) {
+		string_list_clear(&data->updates, 0);
+
+		if (is_tempfile_active(&refs->tempfile))
+			delete_tempfile(&refs->tempfile);
+
+		if (data->own_lock && is_lock_file_locked(&refs->lock)) {
+			packed_refs_unlock(&refs->base);
+			data->own_lock = 0;
+		}
+
+		free(data);
+		transaction->backend_data = NULL;
+	}
+
+	transaction->state = REF_TRANSACTION_CLOSED;
+}
+
 static int packed_transaction_prepare(struct ref_store *ref_store,
 				      struct ref_transaction *transaction,
 				      struct strbuf *err)
 {
-	die("BUG: not implemented yet");
+	struct packed_ref_store *refs = packed_downcast(
+			ref_store,
+			REF_STORE_READ | REF_STORE_WRITE | REF_STORE_ODB,
+			"ref_transaction_prepare");
+	struct packed_transaction_backend_data *data;
+	size_t i;
+	int ret = TRANSACTION_GENERIC_ERROR;
+
+	/*
+	 * Note that we *don't* skip transactions with zero updates,
+	 * because such a transaction might be executed for the side
+	 * effect of ensuring that all of the references are peeled.
+	 * If the caller wants to optimize away empty transactions, it
+	 * should do so itself.
+	 */
+
+	data = xcalloc(1, sizeof(*data));
+	string_list_init(&data->updates, 0);
+
+	transaction->backend_data = data;
+
+	/*
+	 * Stick the updates in a string list by refname so that we
+	 * can sort them:
+	 */
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct string_list_item *item =
+			string_list_append(&data->updates, update->refname);
+
+		/* Store a pointer to update in item->util: */
+		item->util = update;
+	}
+	string_list_sort(&data->updates);
+
+	if (ref_update_reject_duplicates(&data->updates, err))
+		goto failure;
+
+	if (!is_lock_file_locked(&refs->lock)) {
+		if (packed_refs_lock(ref_store, 0, err))
+			goto failure;
+		data->own_lock = 1;
+	}
+
+	if (write_with_updates(refs, &data->updates, err))
+		goto failure;
+
+	transaction->state = REF_TRANSACTION_PREPARED;
+	return 0;
+
+failure:
+	packed_transaction_cleanup(refs, transaction);
+	return ret;
 }
 
 static int packed_transaction_abort(struct ref_store *ref_store,
 				    struct ref_transaction *transaction,
 				    struct strbuf *err)
 {
-	die("BUG: not implemented yet");
+	struct packed_ref_store *refs = packed_downcast(
+			ref_store,
+			REF_STORE_READ | REF_STORE_WRITE | REF_STORE_ODB,
+			"ref_transaction_abort");
+
+	packed_transaction_cleanup(refs, transaction);
+	return 0;
 }
 
 static int packed_transaction_finish(struct ref_store *ref_store,
 				     struct ref_transaction *transaction,
 				     struct strbuf *err)
 {
-	die("BUG: not implemented yet");
+	struct packed_ref_store *refs = packed_downcast(
+			ref_store,
+			REF_STORE_READ | REF_STORE_WRITE | REF_STORE_ODB,
+			"ref_transaction_finish");
+	int ret = TRANSACTION_GENERIC_ERROR;
+	char *packed_refs_path;
+
+	packed_refs_path = get_locked_file_path(&refs->lock);
+	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
+		strbuf_addf(err, "error replacing %s: %s",
+			    refs->path, strerror(errno));
+		goto cleanup;
+	}
+
+	clear_packed_ref_cache(refs);
+	ret = 0;
+
+cleanup:
+	free(packed_refs_path);
+	packed_transaction_cleanup(refs, transaction);
+	return ret;
 }
 
 static int packed_initial_transaction_commit(struct ref_store *ref_store,
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index 03b7c1de95..7af2897757 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -1,6 +1,15 @@
 #ifndef REFS_PACKED_BACKEND_H
 #define REFS_PACKED_BACKEND_H
 
+/*
+ * Support for storing references in a `packed-refs` file.
+ *
+ * Note that this backend doesn't check for D/F conflicts, because it
+ * doesn't care about them. But usually it should be wrapped in a
+ * `files_ref_store` that prevents D/F conflicts from being created,
+ * even among packed refs.
+ */
+
 struct ref_store *packed_ref_store_create(const char *path,
 					  unsigned int store_flags);
 
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/10] packed_delete_refs(): implement method
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (2 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 03/10] packed_ref_store: implement reference transactions Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-29 18:07   ` Brandon Williams
  2017-08-29  8:20 ` [PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs Michael Haggerty
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Implement `packed_delete_refs()` using a reference transaction. This
means that `files_delete_refs()` can use `refs_delete_refs()` instead
of `repack_without_refs()` to delete any packed references, decreasing
the coupling between the classes.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c  |  2 +-
 refs/packed-backend.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index fccbc24ac4..2c78f63494 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1157,7 +1157,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
 	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
 		goto error;
 
-	if (repack_without_refs(refs->packed_ref_store, refnames, &err)) {
+	if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
 		packed_refs_unlock(refs->packed_ref_store);
 		goto error;
 	}
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index d19b3bfba5..83a088118f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
 static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
 			     struct string_list *refnames, unsigned int flags)
 {
-	die("BUG: not implemented yet");
+	struct packed_ref_store *refs =
+		packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *transaction;
+	struct string_list_item *item;
+	int ret;
+
+	(void)refs; /* We need the check above, but don't use the variable */
+
+	if (!refnames->nr)
+		return 0;
+
+	/*
+	 * Since we don't check the references' old_oids, the
+	 * individual updates can't fail, so we can pack all of the
+	 * updates into a single transaction.
+	 */
+
+	transaction = ref_store_transaction_begin(ref_store, &err);
+	if (!transaction)
+		return -1;
+
+	for_each_string_list_item(item, refnames) {
+		if (ref_transaction_delete(transaction, item->string, NULL,
+					   flags, msg, &err)) {
+			warning(_("could not delete reference %s: %s"),
+				item->string, err.buf);
+			strbuf_reset(&err);
+		}
+	}
+
+	ret = ref_transaction_commit(transaction, &err);
+
+	if (ret) {
+		if (refnames->nr == 1)
+			error(_("could not delete reference %s: %s"),
+			      refnames->items[0].string, err.buf);
+		else
+			error(_("could not delete references: %s"), err.buf);
+	}
+
+	ref_transaction_free(transaction);
+	strbuf_release(&err);
+	return ret;
 }
 
 static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (3 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 04/10] packed_delete_refs(): implement method Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-29  8:20 ` [PATCH 06/10] files_initial_transaction_commit(): use a transaction for " Michael Haggerty
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Now that the packed reference store supports transactions, we can use
a transaction to write the packed versions of references that we want
to pack. This decreases the coupling between `files_ref_store` and
`packed_ref_store`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2c78f63494..0fc44eaca5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1100,9 +1100,14 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 	int ok;
 	struct ref_to_prune *refs_to_prune = NULL;
 	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *transaction;
 
 	packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
+	transaction = ref_store_transaction_begin(refs->packed_ref_store, &err);
+	if (!transaction)
+		return -1;
+
 	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		/*
@@ -1115,12 +1120,14 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 			continue;
 
 		/*
-		 * Create an entry in the packed-refs cache equivalent
-		 * to the one from the loose ref cache, except that
-		 * we don't copy the peeled status, because we want it
-		 * to be re-peeled.
+		 * Add a reference creation for this reference to the
+		 * packed-refs transaction:
 		 */
-		add_packed_ref(refs->packed_ref_store, iter->refname, iter->oid);
+		if (ref_transaction_update(transaction, iter->refname,
+					   iter->oid->hash, NULL,
+					   REF_NODEREF, NULL, &err))
+			die("failure preparing to create packed reference %s: %s",
+			    iter->refname, err.buf);
 
 		/* Schedule the loose reference for pruning if requested. */
 		if ((flags & PACK_REFS_PRUNE)) {
@@ -1134,8 +1141,11 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
 	if (ok != ITER_DONE)
 		die("error while iterating over references");
 
-	if (commit_packed_refs(refs->packed_ref_store, &err))
-		die("unable to overwrite old ref-pack file: %s", err.buf);
+	if (ref_transaction_commit(transaction, &err))
+		die("unable to write new packed-refs: %s", err.buf);
+
+	ref_transaction_free(transaction);
+
 	packed_refs_unlock(refs->packed_ref_store);
 
 	prune_refs(refs, refs_to_prune);
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/10] files_initial_transaction_commit(): use a transaction for packed refs
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (4 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-09-08  7:27   ` Jeff King
  2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Used a `packed_ref_store` transaction in the implementation of
`files_initial_transaction_commit()` rather than using internal
features of the packed ref store. This further decouples
`files_ref_store` from `packed_ref_store`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fc44eaca5..29c7c78602 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2663,6 +2663,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
 	size_t i;
 	int ret = 0;
 	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
+	struct ref_transaction *packed_transaction = NULL;
 
 	assert(err);
 
@@ -2695,6 +2696,12 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
 				 &affected_refnames))
 		die("BUG: initial ref transaction called with existing refs");
 
+	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, err);
+	if (!packed_transaction) {
+		ret = TRANSACTION_GENERIC_ERROR;
+		goto cleanup;
+	}
+
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
@@ -2707,6 +2714,15 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
 			ret = TRANSACTION_NAME_CONFLICT;
 			goto cleanup;
 		}
+
+		/*
+		 * Add a reference creation for this reference to the
+		 * packed-refs transaction:
+		 */
+		ref_transaction_add_update(packed_transaction, update->refname,
+					   update->flags & ~REF_HAVE_OLD,
+					   update->new_oid.hash, update->old_oid.hash,
+					   NULL);
 	}
 
 	if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
@@ -2714,21 +2730,14 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
 		goto cleanup;
 	}
 
-	for (i = 0; i < transaction->nr; i++) {
-		struct ref_update *update = transaction->updates[i];
-
-		if ((update->flags & REF_HAVE_NEW) &&
-		    !is_null_oid(&update->new_oid))
-			add_packed_ref(refs->packed_ref_store, update->refname,
-				       &update->new_oid);
-	}
-
-	if (commit_packed_refs(refs->packed_ref_store, err)) {
+	if (initial_ref_transaction_commit(packed_transaction, err)) {
 		ret = TRANSACTION_GENERIC_ERROR;
 		goto cleanup;
 	}
 
 cleanup:
+	if (packed_transaction)
+		ref_transaction_free(packed_transaction);
 	packed_refs_unlock(refs->packed_ref_store);
 	transaction->state = REF_TRANSACTION_CLOSED;
 	string_list_clear(&affected_refnames, 0);
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (5 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 06/10] files_initial_transaction_commit(): use a transaction for " Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-30 17:21   ` Stefan Beller
  2017-09-08  4:44   ` Junio C Hamano
  2017-08-29  8:20 ` [PATCH 08/10] files_ref_store: use a transaction to update packed refs Michael Haggerty
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Currently, a loose reference is deleted even before locking the
`packed-refs` file, let alone deleting any packed version of the
reference. This leads to two problems, demonstrated by two new tests:

* While a reference is being deleted, other processes might see the
  old, packed value of the reference for a moment before the packed
  version is deleted. Normally this would be hard to observe, but we
  can prolong the window by locking the `packed-refs` file externally
  before running `update-ref`, then unlocking it before `update-ref`'s
  attempt to acquire the lock times out.

* If the `packed-refs` file is locked so long that `update-ref` fails
  to lock it, then the reference can be left permanently in the
  incorrect state described in the previous point.

In a moment, both problems will be fixed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
The first of the new tests is rather involved; it uses two background
processes plus a foreground process that polls the value of a
reference. But despite sensitivity to timing, I think it should be
robust even in this broken state. Once the functionality being tested
is fixed, this test should never produce false positives, though
really bad timing (e.g., if it takes more than a second for
`update-ref` to get going) could still lead to false negatives.

Each of the new tests takes about a second to run because they
simulate lock contention.

If anybody has suggestions for better ways to test these things,
please speak up :-)

 t/t1404-update-ref-errors.sh | 71 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
index c34ece48f5..752f83c377 100755
--- a/t/t1404-update-ref-errors.sh
+++ b/t/t1404-update-ref-errors.sh
@@ -404,4 +404,75 @@ test_expect_success 'broken reference blocks indirect create' '
 	test_cmp expected output.err
 '
 
+test_expect_failure 'no bogus intermediate values during delete' '
+	prefix=refs/slow-transaction &&
+	# Set up a reference with differing loose and packed versions:
+	git update-ref $prefix/foo $C &&
+	git pack-refs --all &&
+	git update-ref $prefix/foo $D &&
+	git for-each-ref $prefix >unchanged &&
+	# Now try to update the reference, but hold the `packed-refs` lock
+	# for a while to see what happens while the process is blocked:
+	: >.git/packed-refs.lock &&
+	test_when_finished "rm -f .git/packed-refs.lock" &&
+	{
+		sleep 1 &&
+		rm -f .git/packed-refs.lock &
+	} &&
+	pid1=$! &&
+	{
+		# Note: the following command is intentionally run in the
+		# background. We extend the timeout so that `update-ref`
+		# tries to acquire the `packed-refs` lock longer than it
+		# takes the background process above to delete it:
+		git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
+	} &&
+	pid2=$! &&
+	ok=true &&
+	while kill -0 $pid2 2>/dev/null
+	do
+		sha1=$(git rev-parse --verify --quiet $prefix/foo || echo undefined) &&
+		case "$sha1" in
+		$D)
+			# This is OK; it just means that nothing has happened yet.
+			: ;;
+		undefined)
+			# This is OK; it means the deletion was successful.
+			: ;;
+		$C)
+			# This value should never be seen. Probably the loose
+			# reference has been deleted but the packed reference
+			# is still there:
+			echo "$prefix/foo incorrectly observed to be C" &&
+			break
+			;;
+		*)
+			# WTF?
+			echo "$prefix/foo unexpected value observed: $sha1" &&
+			break
+			;;
+		esac
+	done >out &&
+	wait $pid1 &&
+	wait $pid2 &&
+	test_must_be_empty out &&
+	test_must_fail git rev-parse --verify --quiet $prefix/foo
+'
+
+test_expect_failure 'delete fails cleanly if packed-refs file is locked' '
+	prefix=refs/locked-packed-refs &&
+	# Set up a reference with differing loose and packed versions:
+	git update-ref $prefix/foo $C &&
+	git pack-refs --all &&
+	git update-ref $prefix/foo $D &&
+	git for-each-ref $prefix >unchanged &&
+	# Now try to delete it while the `packed-refs` lock is held:
+	: >.git/packed-refs.lock &&
+	test_when_finished "rm -f .git/packed-refs.lock" &&
+	test_must_fail git update-ref -d $prefix/foo >out 2>err &&
+	git for-each-ref $prefix >actual &&
+	grep "Unable to create $Q.*packed-refs.lock$Q: File exists" err &&
+	test_cmp unchanged actual
+'
+
 test_done
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/10] files_ref_store: use a transaction to update packed refs
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (6 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-09-08  7:38   ` Jeff King
  2017-08-29  8:20 ` [PATCH 09/10] packed-backend: rip out some now-unused code Michael Haggerty
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

When processing a `files_ref_store` transaction, it is sometimes
necessary to delete some references from the "packed-refs" file. Do
that using a reference transaction conducted against the
`packed_ref_store`.

This change further decouples `files_ref_store` from
`packed_ref_store`. It also fixes multiple problems, including the two
revealed by test cases added in the previous commit.

First, the old code didn't obtain the `packed-refs` lock until
`files_transaction_finish()`. This means that a failure to acquire the
`packed-refs` lock (e.g., due to contention with another process)
wasn't detected until it was too late (problems like this are supposed
to be detected in the "prepare" phase). The new code acquires the
`packed-refs` lock in `files_transaction_prepare()`, the same stage of
the processing when the loose reference locks are being acquired,
removing another reason why the "prepare" phase might succeed and the
"finish" phase might nevertheless fail.

Second, the old code deleted the loose version of a reference before
deleting any packed version of the same reference. This left a moment
when another process might think that the packed version of the
reference is current, which is incorrect. (Even worse, the packed
version of the reference can be arbitrarily old, and might even point
at an object that has since been garbage-collected.)

Third, if a reference deletion fails to acquire the `packed-refs` lock
altogether, then the old code might leave the repository in the
incorrect state (possibly corrupt) described in the previous
paragraph.

Now we activate the new "packed-refs" file (sans any references that
are being deleted) *before* deleting the corresponding loose
references. But we hold the "packed-refs" lock until after the loose
references have been finalized, thus preventing a simultaneous
"pack-refs" process from packing the loose version of the reference in
the time gap, which would otherwise defeat our attempt to delete it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c         | 132 +++++++++++++++++++++++++++++++++----------
 t/t1404-update-ref-errors.sh |   4 +-
 2 files changed, 103 insertions(+), 33 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 29c7c78602..4f4c47b9db 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2391,13 +2391,22 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 	return 0;
 }
 
+struct files_transaction_backend_data {
+	struct ref_transaction *packed_transaction;
+	int packed_refs_locked;
+};
+
 /*
  * Unlock any references in `transaction` that are still locked, and
  * mark the transaction closed.
  */
-static void files_transaction_cleanup(struct ref_transaction *transaction)
+static void files_transaction_cleanup(struct files_ref_store *refs,
+				      struct ref_transaction *transaction)
 {
 	size_t i;
+	struct files_transaction_backend_data *backend_data =
+		transaction->backend_data;
+	struct strbuf err = STRBUF_INIT;
 
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
@@ -2409,6 +2418,17 @@ static void files_transaction_cleanup(struct ref_transaction *transaction)
 		}
 	}
 
+	if (backend_data->packed_transaction &&
+	    ref_transaction_abort(backend_data->packed_transaction, &err)) {
+		error("error aborting transaction: %s", err.buf);
+		strbuf_release(&err);
+	}
+
+	if (backend_data->packed_refs_locked)
+		packed_refs_unlock(refs->packed_ref_store);
+
+	free(backend_data);
+
 	transaction->state = REF_TRANSACTION_CLOSED;
 }
 
@@ -2425,12 +2445,17 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 	char *head_ref = NULL;
 	int head_type;
 	struct object_id head_oid;
+	struct files_transaction_backend_data *backend_data;
+	struct ref_transaction *packed_transaction = NULL;
 
 	assert(err);
 
 	if (!transaction->nr)
 		goto cleanup;
 
+	backend_data = xcalloc(1, sizeof(*backend_data));
+	transaction->backend_data = backend_data;
+
 	/*
 	 * Fail if a refname appears more than once in the
 	 * transaction. (If we end up splitting up any updates using
@@ -2497,6 +2522,41 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &affected_refnames, err);
 		if (ret)
 			break;
+
+		if (update->flags & REF_DELETING &&
+		    !(update->flags & REF_LOG_ONLY) &&
+		    !(update->flags & REF_ISPRUNING)) {
+			/*
+			 * This reference has to be deleted from
+			 * packed-refs if it exists there.
+			 */
+			if (!packed_transaction) {
+				packed_transaction = ref_store_transaction_begin(
+						refs->packed_ref_store, err);
+				if (!packed_transaction) {
+					ret = TRANSACTION_GENERIC_ERROR;
+					goto cleanup;
+				}
+
+				backend_data->packed_transaction =
+					packed_transaction;
+			}
+
+			ref_transaction_add_update(
+					packed_transaction, update->refname,
+					update->flags & ~REF_HAVE_OLD,
+					update->new_oid.hash, update->old_oid.hash,
+					NULL);
+		}
+	}
+
+	if (packed_transaction) {
+		if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto cleanup;
+		}
+		backend_data->packed_refs_locked = 1;
+		ret = ref_transaction_prepare(packed_transaction, err);
 	}
 
 cleanup:
@@ -2504,7 +2564,7 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 	string_list_clear(&affected_refnames, 0);
 
 	if (ret)
-		files_transaction_cleanup(transaction);
+		files_transaction_cleanup(refs, transaction);
 	else
 		transaction->state = REF_TRANSACTION_PREPARED;
 
@@ -2519,9 +2579,10 @@ static int files_transaction_finish(struct ref_store *ref_store,
 		files_downcast(ref_store, 0, "ref_transaction_finish");
 	size_t i;
 	int ret = 0;
-	struct string_list refs_to_delete = STRING_LIST_INIT_NODUP;
-	struct string_list_item *ref_to_delete;
 	struct strbuf sb = STRBUF_INIT;
+	struct files_transaction_backend_data *backend_data;
+	struct ref_transaction *packed_transaction;
+
 
 	assert(err);
 
@@ -2530,6 +2591,9 @@ static int files_transaction_finish(struct ref_store *ref_store,
 		return 0;
 	}
 
+	backend_data = transaction->backend_data;
+	packed_transaction = backend_data->packed_transaction;
+
 	/* Perform updates first so live commits remain referenced */
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
@@ -2565,7 +2629,23 @@ static int files_transaction_finish(struct ref_store *ref_store,
 			}
 		}
 	}
-	/* Perform deletes now that updates are safely completed */
+
+	/*
+	 * Perform deletes now that updates are safely completed.
+	 *
+	 * First delete any packed versions of the references, while
+	 * retaining the packed-refs lock:
+	 */
+	if (packed_transaction) {
+		ret = ref_transaction_commit(packed_transaction, err);
+		ref_transaction_free(packed_transaction);
+		packed_transaction = NULL;
+		backend_data->packed_transaction = NULL;
+		if (ret)
+			goto cleanup;
+	}
+
+	/* Now delete the loose versions of the references: */
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 		struct ref_lock *lock = update->backend_data;
@@ -2583,39 +2663,27 @@ static int files_transaction_finish(struct ref_store *ref_store,
 				}
 				update->flags |= REF_DELETED_LOOSE;
 			}
-
-			if (!(update->flags & REF_ISPRUNING))
-				string_list_append(&refs_to_delete,
-						   lock->ref_name);
 		}
 	}
 
-	if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
-		ret = TRANSACTION_GENERIC_ERROR;
-		goto cleanup;
-	}
-
-	if (repack_without_refs(refs->packed_ref_store, &refs_to_delete, err)) {
-		ret = TRANSACTION_GENERIC_ERROR;
-		packed_refs_unlock(refs->packed_ref_store);
-		goto cleanup;
-	}
-
-	packed_refs_unlock(refs->packed_ref_store);
-
 	/* Delete the reflogs of any references that were deleted: */
-	for_each_string_list_item(ref_to_delete, &refs_to_delete) {
-		strbuf_reset(&sb);
-		files_reflog_path(refs, &sb, ref_to_delete->string);
-		if (!unlink_or_warn(sb.buf))
-			try_remove_empty_parents(refs, ref_to_delete->string,
-						 REMOVE_EMPTY_PARENTS_REFLOG);
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		if (update->flags & REF_DELETING &&
+		    !(update->flags & REF_LOG_ONLY) &&
+		    !(update->flags & REF_ISPRUNING)) {
+			strbuf_reset(&sb);
+			files_reflog_path(refs, &sb, update->refname);
+			if (!unlink_or_warn(sb.buf))
+				try_remove_empty_parents(refs, update->refname,
+							 REMOVE_EMPTY_PARENTS_REFLOG);
+		}
 	}
 
 	clear_loose_ref_cache(refs);
 
 cleanup:
-	files_transaction_cleanup(transaction);
+	files_transaction_cleanup(refs, transaction);
 
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
@@ -2633,7 +2701,6 @@ static int files_transaction_finish(struct ref_store *ref_store,
 	}
 
 	strbuf_release(&sb);
-	string_list_clear(&refs_to_delete, 0);
 	return ret;
 }
 
@@ -2641,7 +2708,10 @@ static int files_transaction_abort(struct ref_store *ref_store,
 				   struct ref_transaction *transaction,
 				   struct strbuf *err)
 {
-	files_transaction_cleanup(transaction);
+	struct files_ref_store *refs =
+		files_downcast(ref_store, 0, "ref_transaction_abort");
+
+	files_transaction_cleanup(refs, transaction);
 	return 0;
 }
 
diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
index 752f83c377..26e5ff6843 100755
--- a/t/t1404-update-ref-errors.sh
+++ b/t/t1404-update-ref-errors.sh
@@ -404,7 +404,7 @@ test_expect_success 'broken reference blocks indirect create' '
 	test_cmp expected output.err
 '
 
-test_expect_failure 'no bogus intermediate values during delete' '
+test_expect_success 'no bogus intermediate values during delete' '
 	prefix=refs/slow-transaction &&
 	# Set up a reference with differing loose and packed versions:
 	git update-ref $prefix/foo $C &&
@@ -459,7 +459,7 @@ test_expect_failure 'no bogus intermediate values during delete' '
 	test_must_fail git rev-parse --verify --quiet $prefix/foo
 '
 
-test_expect_failure 'delete fails cleanly if packed-refs file is locked' '
+test_expect_success 'delete fails cleanly if packed-refs file is locked' '
 	prefix=refs/locked-packed-refs &&
 	# Set up a reference with differing loose and packed versions:
 	git update-ref $prefix/foo $C &&
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/10] packed-backend: rip out some now-unused code
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (7 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 08/10] files_ref_store: use a transaction to update packed refs Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-29 18:24   ` Brandon Williams
  2017-08-29  8:20 ` [PATCH 10/10] files_transaction_finish(): delete reflogs before references Michael Haggerty
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

Now the outside world interacts with the packed ref store only via the
generic refs API plus a few lock-related functions. This allows us to
delete some functions that are no longer used, thereby completing the
encapsulation of the packed ref store.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/packed-backend.c | 193 --------------------------------------------------
 refs/packed-backend.h |   8 ---
 2 files changed, 201 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 83a088118f..90f44c1fbb 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -91,19 +91,6 @@ struct ref_store *packed_ref_store_create(const char *path,
 	return ref_store;
 }
 
-/*
- * Die if refs is not the main ref store. caller is used in any
- * necessary error messages.
- */
-static void packed_assert_main_repository(struct packed_ref_store *refs,
-					  const char *caller)
-{
-	if (refs->store_flags & REF_STORE_MAIN)
-		return;
-
-	die("BUG: operation %s only allowed for main ref store", caller);
-}
-
 /*
  * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is
  * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't
@@ -321,40 +308,6 @@ static struct ref_dir *get_packed_refs(struct packed_ref_store *refs)
 	return get_packed_ref_dir(get_packed_ref_cache(refs));
 }
 
-/*
- * Add or overwrite a reference in the in-memory packed reference
- * cache. This may only be called while the packed-refs file is locked
- * (see packed_refs_lock()). To actually write the packed-refs file,
- * call commit_packed_refs().
- */
-void add_packed_ref(struct ref_store *ref_store,
-		    const char *refname, const struct object_id *oid)
-{
-	struct packed_ref_store *refs =
-		packed_downcast(ref_store, REF_STORE_WRITE,
-				"add_packed_ref");
-	struct ref_dir *packed_refs;
-	struct ref_entry *packed_entry;
-
-	if (!is_lock_file_locked(&refs->lock))
-		die("BUG: packed refs not locked");
-
-	if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
-		die("Reference has invalid format: '%s'", refname);
-
-	packed_refs = get_packed_refs(refs);
-	packed_entry = find_ref_entry(packed_refs, refname);
-	if (packed_entry) {
-		/* Overwrite the existing entry: */
-		oidcpy(&packed_entry->u.value.oid, oid);
-		packed_entry->flag = REF_ISPACKED;
-		oidclr(&packed_entry->u.value.peeled);
-	} else {
-		packed_entry = create_ref_entry(refname, oid, REF_ISPACKED);
-		add_ref_entry(packed_refs, packed_entry);
-	}
-}
-
 /*
  * Return the ref_entry for the given refname from the packed
  * references.  If it does not exist, return NULL.
@@ -592,152 +545,6 @@ int packed_refs_is_locked(struct ref_store *ref_store)
 static const char PACKED_REFS_HEADER[] =
 	"# pack-refs with: peeled fully-peeled \n";
 
-/*
- * Write the current version of the packed refs cache from memory to
- * disk. The packed-refs file must already be locked for writing (see
- * packed_refs_lock()). Return zero on success. On errors, rollback
- * the lockfile, write an error message to `err`, and return a nonzero
- * value.
- */
-int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err)
-{
-	struct packed_ref_store *refs =
-		packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
-				"commit_packed_refs");
-	struct packed_ref_cache *packed_ref_cache =
-		get_packed_ref_cache(refs);
-	int ok;
-	int ret = -1;
-	struct strbuf sb = STRBUF_INIT;
-	FILE *out;
-	struct ref_iterator *iter;
-	char *packed_refs_path;
-
-	if (!is_lock_file_locked(&refs->lock))
-		die("BUG: commit_packed_refs() called when unlocked");
-
-	/*
-	 * If packed-refs is a symlink, we want to overwrite the
-	 * symlinked-to file, not the symlink itself. Also, put the
-	 * staging file next to it:
-	 */
-	packed_refs_path = get_locked_file_path(&refs->lock);
-	strbuf_addf(&sb, "%s.new", packed_refs_path);
-	if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
-		strbuf_addf(err, "unable to create file %s: %s",
-			    sb.buf, strerror(errno));
-		strbuf_release(&sb);
-		goto out;
-	}
-	strbuf_release(&sb);
-
-	out = fdopen_tempfile(&refs->tempfile, "w");
-	if (!out) {
-		strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
-			    strerror(errno));
-		goto error;
-	}
-
-	if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) {
-		strbuf_addf(err, "error writing to %s: %s",
-			    get_tempfile_path(&refs->tempfile), strerror(errno));
-		goto error;
-	}
-
-	iter = cache_ref_iterator_begin(packed_ref_cache->cache, NULL, 0);
-	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
-		struct object_id peeled;
-		int peel_error = ref_iterator_peel(iter, &peeled);
-
-		if (write_packed_entry(out, iter->refname, iter->oid->hash,
-				       peel_error ? NULL : peeled.hash)) {
-			strbuf_addf(err, "error writing to %s: %s",
-				    get_tempfile_path(&refs->tempfile),
-				    strerror(errno));
-			ref_iterator_abort(iter);
-			goto error;
-		}
-	}
-
-	if (ok != ITER_DONE) {
-		strbuf_addf(err, "unable to rewrite packed-refs file: "
-			    "error iterating over old contents");
-		goto error;
-	}
-
-	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
-		strbuf_addf(err, "error replacing %s: %s",
-			    refs->path, strerror(errno));
-		goto out;
-	}
-
-	ret = 0;
-	goto out;
-
-error:
-	delete_tempfile(&refs->tempfile);
-
-out:
-	free(packed_refs_path);
-	return ret;
-}
-
-/*
- * Rewrite the packed-refs file, omitting any refs listed in
- * 'refnames'. On error, leave packed-refs unchanged, write an error
- * message to 'err', and return a nonzero value. The packed refs lock
- * must be held when calling this function; it will still be held when
- * the function returns.
- *
- * The refs in 'refnames' needn't be sorted. `err` must not be NULL.
- */
-int repack_without_refs(struct ref_store *ref_store,
-			struct string_list *refnames, struct strbuf *err)
-{
-	struct packed_ref_store *refs =
-		packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
-				"repack_without_refs");
-	struct ref_dir *packed;
-	struct string_list_item *refname;
-	int needs_repacking = 0, removed = 0;
-
-	packed_assert_main_repository(refs, "repack_without_refs");
-	assert(err);
-
-	if (!is_lock_file_locked(&refs->lock))
-		die("BUG: repack_without_refs called without holding lock");
-
-	/* Look for a packed ref */
-	for_each_string_list_item(refname, refnames) {
-		if (get_packed_ref(refs, refname->string)) {
-			needs_repacking = 1;
-			break;
-		}
-	}
-
-	/* Avoid locking if we have nothing to do */
-	if (!needs_repacking)
-		return 0; /* no refname exists in packed refs */
-
-	packed = get_packed_refs(refs);
-
-	/* Remove refnames from the cache */
-	for_each_string_list_item(refname, refnames)
-		if (remove_entry_from_dir(packed, refname->string) != -1)
-			removed = 1;
-	if (!removed) {
-		/*
-		 * All packed entries disappeared while we were
-		 * acquiring the lock.
-		 */
-		clear_packed_ref_cache(refs);
-		return 0;
-	}
-
-	/* Write what remains */
-	return commit_packed_refs(&refs->base, err);
-}
-
 static int packed_init_db(struct ref_store *ref_store, struct strbuf *err)
 {
 	/* Nothing to do. */
diff --git a/refs/packed-backend.h b/refs/packed-backend.h
index 7af2897757..61687e408a 100644
--- a/refs/packed-backend.h
+++ b/refs/packed-backend.h
@@ -23,12 +23,4 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
 void packed_refs_unlock(struct ref_store *ref_store);
 int packed_refs_is_locked(struct ref_store *ref_store);
 
-void add_packed_ref(struct ref_store *ref_store,
-		    const char *refname, const struct object_id *oid);
-
-int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err);
-
-int repack_without_refs(struct ref_store *ref_store,
-			struct string_list *refnames, struct strbuf *err);
-
 #endif /* REFS_PACKED_BACKEND_H */
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/10] files_transaction_finish(): delete reflogs before references
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (8 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 09/10] packed-backend: rip out some now-unused code Michael Haggerty
@ 2017-08-29  8:20 ` Michael Haggerty
  2017-08-29 18:30 ` [PATCH 00/10] Implement transactions for the packed ref store Brandon Williams
  2017-09-08  7:42 ` Jeff King
  11 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git,
	Michael Haggerty

If the deletion steps unexpectedly fail, it is less bad to leave a
reference without its reflog than it is to leave a reflog without its
reference, since the latter is an invalid repository state.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs/files-backend.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4f4c47b9db..36f81b4f28 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2630,6 +2630,27 @@ static int files_transaction_finish(struct ref_store *ref_store,
 		}
 	}
 
+	/*
+	 * Now that updates are safely completed, we can perform
+	 * deletes. First delete the reflogs of any references that
+	 * will be deleted, since (in the unexpected event of an
+	 * error) leaving a reference without a reflog is less bad
+	 * than leaving a reflog without a reference (the latter is a
+	 * mildly invalid repository state):
+	 */
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		if (update->flags & REF_DELETING &&
+		    !(update->flags & REF_LOG_ONLY) &&
+		    !(update->flags & REF_ISPRUNING)) {
+			strbuf_reset(&sb);
+			files_reflog_path(refs, &sb, update->refname);
+			if (!unlink_or_warn(sb.buf))
+				try_remove_empty_parents(refs, update->refname,
+							 REMOVE_EMPTY_PARENTS_REFLOG);
+		}
+	}
+
 	/*
 	 * Perform deletes now that updates are safely completed.
 	 *
@@ -2666,20 +2687,6 @@ static int files_transaction_finish(struct ref_store *ref_store,
 		}
 	}
 
-	/* Delete the reflogs of any references that were deleted: */
-	for (i = 0; i < transaction->nr; i++) {
-		struct ref_update *update = transaction->updates[i];
-		if (update->flags & REF_DELETING &&
-		    !(update->flags & REF_LOG_ONLY) &&
-		    !(update->flags & REF_ISPRUNING)) {
-			strbuf_reset(&sb);
-			files_reflog_path(refs, &sb, update->refname);
-			if (!unlink_or_warn(sb.buf))
-				try_remove_empty_parents(refs, update->refname,
-							 REMOVE_EMPTY_PARENTS_REFLOG);
-		}
-	}
-
 	clear_loose_ref_cache(refs);
 
 cleanup:
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/10] packed_delete_refs(): implement method
  2017-08-29  8:20 ` [PATCH 04/10] packed_delete_refs(): implement method Michael Haggerty
@ 2017-08-29 18:07   ` Brandon Williams
  2017-08-30  3:00     ` Michael Haggerty
  0 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-29 18:07 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Jeff King, Ævar Arnfjörð Bjarmason,
	git

On 08/29, Michael Haggerty wrote:
> Implement `packed_delete_refs()` using a reference transaction. This
> means that `files_delete_refs()` can use `refs_delete_refs()` instead
> of `repack_without_refs()` to delete any packed references, decreasing
> the coupling between the classes.
> 
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs/files-backend.c  |  2 +-
>  refs/packed-backend.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 45 insertions(+), 2 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index fccbc24ac4..2c78f63494 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1157,7 +1157,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>  	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
>  		goto error;
>  
> -	if (repack_without_refs(refs->packed_ref_store, refnames, &err)) {
> +	if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
>  		packed_refs_unlock(refs->packed_ref_store);
>  		goto error;
>  	}
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index d19b3bfba5..83a088118f 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
>  static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
>  			     struct string_list *refnames, unsigned int flags)
>  {
> -	die("BUG: not implemented yet");
> +	struct packed_ref_store *refs =
> +		packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> +	struct strbuf err = STRBUF_INIT;
> +	struct ref_transaction *transaction;
> +	struct string_list_item *item;
> +	int ret;
> +
> +	(void)refs; /* We need the check above, but don't use the variable */

Can't say I've seen a line like this before, Is the intent to just mark
that this variable won't be used like you mention in the comment?

> +
> +	if (!refnames->nr)
> +		return 0;
> +
> +	/*
> +	 * Since we don't check the references' old_oids, the
> +	 * individual updates can't fail, so we can pack all of the
> +	 * updates into a single transaction.
> +	 */
> +
> +	transaction = ref_store_transaction_begin(ref_store, &err);
> +	if (!transaction)
> +		return -1;
> +
> +	for_each_string_list_item(item, refnames) {
> +		if (ref_transaction_delete(transaction, item->string, NULL,
> +					   flags, msg, &err)) {
> +			warning(_("could not delete reference %s: %s"),
> +				item->string, err.buf);
> +			strbuf_reset(&err);
> +		}
> +	}
> +
> +	ret = ref_transaction_commit(transaction, &err);
> +
> +	if (ret) {
> +		if (refnames->nr == 1)
> +			error(_("could not delete reference %s: %s"),
> +			      refnames->items[0].string, err.buf);
> +		else
> +			error(_("could not delete references: %s"), err.buf);
> +	}
> +
> +	ref_transaction_free(transaction);
> +	strbuf_release(&err);
> +	return ret;
>  }
>  
>  static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
> -- 
> 2.14.1
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 09/10] packed-backend: rip out some now-unused code
  2017-08-29  8:20 ` [PATCH 09/10] packed-backend: rip out some now-unused code Michael Haggerty
@ 2017-08-29 18:24   ` Brandon Williams
  0 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-29 18:24 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Jeff King, Ævar Arnfjörð Bjarmason,
	git

On 08/29, Michael Haggerty wrote:
> Now the outside world interacts with the packed ref store only via the
> generic refs API plus a few lock-related functions. This allows us to
> delete some functions that are no longer used, thereby completing the
> encapsulation of the packed ref store.

Love seeing patches which only remove old code :)

> 
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs/packed-backend.c | 193 --------------------------------------------------
>  refs/packed-backend.h |   8 ---
>  2 files changed, 201 deletions(-)
> 
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 83a088118f..90f44c1fbb 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -91,19 +91,6 @@ struct ref_store *packed_ref_store_create(const char *path,
>  	return ref_store;
>  }
>  
> -/*
> - * Die if refs is not the main ref store. caller is used in any
> - * necessary error messages.
> - */
> -static void packed_assert_main_repository(struct packed_ref_store *refs,
> -					  const char *caller)
> -{
> -	if (refs->store_flags & REF_STORE_MAIN)
> -		return;
> -
> -	die("BUG: operation %s only allowed for main ref store", caller);
> -}
> -
>  /*
>   * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is
>   * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't
> @@ -321,40 +308,6 @@ static struct ref_dir *get_packed_refs(struct packed_ref_store *refs)
>  	return get_packed_ref_dir(get_packed_ref_cache(refs));
>  }
>  
> -/*
> - * Add or overwrite a reference in the in-memory packed reference
> - * cache. This may only be called while the packed-refs file is locked
> - * (see packed_refs_lock()). To actually write the packed-refs file,
> - * call commit_packed_refs().
> - */
> -void add_packed_ref(struct ref_store *ref_store,
> -		    const char *refname, const struct object_id *oid)
> -{
> -	struct packed_ref_store *refs =
> -		packed_downcast(ref_store, REF_STORE_WRITE,
> -				"add_packed_ref");
> -	struct ref_dir *packed_refs;
> -	struct ref_entry *packed_entry;
> -
> -	if (!is_lock_file_locked(&refs->lock))
> -		die("BUG: packed refs not locked");
> -
> -	if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
> -		die("Reference has invalid format: '%s'", refname);
> -
> -	packed_refs = get_packed_refs(refs);
> -	packed_entry = find_ref_entry(packed_refs, refname);
> -	if (packed_entry) {
> -		/* Overwrite the existing entry: */
> -		oidcpy(&packed_entry->u.value.oid, oid);
> -		packed_entry->flag = REF_ISPACKED;
> -		oidclr(&packed_entry->u.value.peeled);
> -	} else {
> -		packed_entry = create_ref_entry(refname, oid, REF_ISPACKED);
> -		add_ref_entry(packed_refs, packed_entry);
> -	}
> -}
> -
>  /*
>   * Return the ref_entry for the given refname from the packed
>   * references.  If it does not exist, return NULL.
> @@ -592,152 +545,6 @@ int packed_refs_is_locked(struct ref_store *ref_store)
>  static const char PACKED_REFS_HEADER[] =
>  	"# pack-refs with: peeled fully-peeled \n";
>  
> -/*
> - * Write the current version of the packed refs cache from memory to
> - * disk. The packed-refs file must already be locked for writing (see
> - * packed_refs_lock()). Return zero on success. On errors, rollback
> - * the lockfile, write an error message to `err`, and return a nonzero
> - * value.
> - */
> -int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err)
> -{
> -	struct packed_ref_store *refs =
> -		packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
> -				"commit_packed_refs");
> -	struct packed_ref_cache *packed_ref_cache =
> -		get_packed_ref_cache(refs);
> -	int ok;
> -	int ret = -1;
> -	struct strbuf sb = STRBUF_INIT;
> -	FILE *out;
> -	struct ref_iterator *iter;
> -	char *packed_refs_path;
> -
> -	if (!is_lock_file_locked(&refs->lock))
> -		die("BUG: commit_packed_refs() called when unlocked");
> -
> -	/*
> -	 * If packed-refs is a symlink, we want to overwrite the
> -	 * symlinked-to file, not the symlink itself. Also, put the
> -	 * staging file next to it:
> -	 */
> -	packed_refs_path = get_locked_file_path(&refs->lock);
> -	strbuf_addf(&sb, "%s.new", packed_refs_path);
> -	if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
> -		strbuf_addf(err, "unable to create file %s: %s",
> -			    sb.buf, strerror(errno));
> -		strbuf_release(&sb);
> -		goto out;
> -	}
> -	strbuf_release(&sb);
> -
> -	out = fdopen_tempfile(&refs->tempfile, "w");
> -	if (!out) {
> -		strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
> -			    strerror(errno));
> -		goto error;
> -	}
> -
> -	if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) {
> -		strbuf_addf(err, "error writing to %s: %s",
> -			    get_tempfile_path(&refs->tempfile), strerror(errno));
> -		goto error;
> -	}
> -
> -	iter = cache_ref_iterator_begin(packed_ref_cache->cache, NULL, 0);
> -	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
> -		struct object_id peeled;
> -		int peel_error = ref_iterator_peel(iter, &peeled);
> -
> -		if (write_packed_entry(out, iter->refname, iter->oid->hash,
> -				       peel_error ? NULL : peeled.hash)) {
> -			strbuf_addf(err, "error writing to %s: %s",
> -				    get_tempfile_path(&refs->tempfile),
> -				    strerror(errno));
> -			ref_iterator_abort(iter);
> -			goto error;
> -		}
> -	}
> -
> -	if (ok != ITER_DONE) {
> -		strbuf_addf(err, "unable to rewrite packed-refs file: "
> -			    "error iterating over old contents");
> -		goto error;
> -	}
> -
> -	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
> -		strbuf_addf(err, "error replacing %s: %s",
> -			    refs->path, strerror(errno));
> -		goto out;
> -	}
> -
> -	ret = 0;
> -	goto out;
> -
> -error:
> -	delete_tempfile(&refs->tempfile);
> -
> -out:
> -	free(packed_refs_path);
> -	return ret;
> -}
> -
> -/*
> - * Rewrite the packed-refs file, omitting any refs listed in
> - * 'refnames'. On error, leave packed-refs unchanged, write an error
> - * message to 'err', and return a nonzero value. The packed refs lock
> - * must be held when calling this function; it will still be held when
> - * the function returns.
> - *
> - * The refs in 'refnames' needn't be sorted. `err` must not be NULL.
> - */
> -int repack_without_refs(struct ref_store *ref_store,
> -			struct string_list *refnames, struct strbuf *err)
> -{
> -	struct packed_ref_store *refs =
> -		packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
> -				"repack_without_refs");
> -	struct ref_dir *packed;
> -	struct string_list_item *refname;
> -	int needs_repacking = 0, removed = 0;
> -
> -	packed_assert_main_repository(refs, "repack_without_refs");
> -	assert(err);
> -
> -	if (!is_lock_file_locked(&refs->lock))
> -		die("BUG: repack_without_refs called without holding lock");
> -
> -	/* Look for a packed ref */
> -	for_each_string_list_item(refname, refnames) {
> -		if (get_packed_ref(refs, refname->string)) {
> -			needs_repacking = 1;
> -			break;
> -		}
> -	}
> -
> -	/* Avoid locking if we have nothing to do */
> -	if (!needs_repacking)
> -		return 0; /* no refname exists in packed refs */
> -
> -	packed = get_packed_refs(refs);
> -
> -	/* Remove refnames from the cache */
> -	for_each_string_list_item(refname, refnames)
> -		if (remove_entry_from_dir(packed, refname->string) != -1)
> -			removed = 1;
> -	if (!removed) {
> -		/*
> -		 * All packed entries disappeared while we were
> -		 * acquiring the lock.
> -		 */
> -		clear_packed_ref_cache(refs);
> -		return 0;
> -	}
> -
> -	/* Write what remains */
> -	return commit_packed_refs(&refs->base, err);
> -}
> -
>  static int packed_init_db(struct ref_store *ref_store, struct strbuf *err)
>  {
>  	/* Nothing to do. */
> diff --git a/refs/packed-backend.h b/refs/packed-backend.h
> index 7af2897757..61687e408a 100644
> --- a/refs/packed-backend.h
> +++ b/refs/packed-backend.h
> @@ -23,12 +23,4 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
>  void packed_refs_unlock(struct ref_store *ref_store);
>  int packed_refs_is_locked(struct ref_store *ref_store);
>  
> -void add_packed_ref(struct ref_store *ref_store,
> -		    const char *refname, const struct object_id *oid);
> -
> -int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err);
> -
> -int repack_without_refs(struct ref_store *ref_store,
> -			struct string_list *refnames, struct strbuf *err);
> -
>  #endif /* REFS_PACKED_BACKEND_H */
> -- 
> 2.14.1
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/10] Implement transactions for the packed ref store
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (9 preceding siblings ...)
  2017-08-29  8:20 ` [PATCH 10/10] files_transaction_finish(): delete reflogs before references Michael Haggerty
@ 2017-08-29 18:30 ` Brandon Williams
  2017-09-08  7:42 ` Jeff King
  11 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-29 18:30 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Jeff King, Ævar Arnfjörð Bjarmason,
	git

On 08/29, Michael Haggerty wrote:
> This should be the second-to-last patch series in the quest for
> mmapped packed-refs.
> 
> Before this patch series, there is still more coupling than necessary
> between `files_ref_store` and `packed_ref_store`:
> 
> * `files_ref_store` adds packed references (e.g., during `git clone`
>   and `git pack-refs`) by inserting them into the `packed_ref_store`'s
>   internal `ref_cache`, then calling `commit_packed_refs()`. But
>   `files_ref_store` shouldn't even *know* that `packed_ref_store` uses
>   a `ref_cache`, let alone muck about in it.
> 
> * `files_ref_store` deletes packed references by calling
>   `repack_without_refs()`.
> 
> Instead, implement reference transactions and `delete_refs()` for
> `packed_ref_store`, and change `files_ref_store` to use these standard
> methods rather than internal `packed_ref_store` methods.
> 
> This patch series finishes up the encapsulation of `packed_ref_store`.
> At the end of the series, the outside world only interacts with it via
> the refs API plus a couple of locking-related functions. That will
> make it easy for the next patch series to change the internal workings
> of `packed_ref_store` without affecting `files_ref_store`.
> 
> Along the way, we fix some longstanding problems with reference
> updates. Quoting from patch [08/10]:
> 
>     First, the old code didn't obtain the `packed-refs` lock until
>     `files_transaction_finish()`. This means that a failure to acquire the
>     `packed-refs` lock (e.g., due to contention with another process)
>     wasn't detected until it was too late (problems like this are supposed
>     to be detected in the "prepare" phase). The new code acquires the
>     `packed-refs` lock in `files_transaction_prepare()`, the same stage of
>     the processing when the loose reference locks are being acquired,
>     removing another reason why the "prepare" phase might succeed and the
>     "finish" phase might nevertheless fail.
> 
>     Second, the old code deleted the loose version of a reference before
>     deleting any packed version of the same reference. This left a moment
>     when another process might think that the packed version of the
>     reference is current, which is incorrect. (Even worse, the packed
>     version of the reference can be arbitrarily old, and might even point
>     at an object that has since been garbage-collected.)
> 
>     Third, if a reference deletion fails to acquire the `packed-refs` lock
>     altogether, then the old code might leave the repository in the
>     incorrect state (possibly corrupt) described in the previous
>     paragraph.
> 
>     Now we activate the new "packed-refs" file (sans any references that
>     are being deleted) *before* deleting the corresponding loose
>     references. But we hold the "packed-refs" lock until after the loose
>     references have been finalized, thus preventing a simultaneous
>     "pack-refs" process from packing the loose version of the reference in
>     the time gap, which would otherwise defeat our attempt to delete it.
> 
> This patch series is also available as branch
> `packed-ref-transactions` in my fork on GitHub [1]. A draft of the
> patch series to change `packed_ref_store` to use mmap and get rid of
> its `ref_cache` is available as branch `mmap-packed-refs` from the
> same source.

Overall the patches look sane to me, though I don't believe I'm
qualified in this area to give you a complete thumbs up since I don't
understand the refs code super well yet.  I do like reading patch from
you as you do a great job of laying out what you are doing in code,
comments and commit messages, something I'm trying to get better at :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/10] packed_delete_refs(): implement method
  2017-08-29 18:07   ` Brandon Williams
@ 2017-08-30  3:00     ` Michael Haggerty
  0 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-30  3:00 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Jeff King, Ævar Arnfjörð Bjarmason,
	git

On 08/29/2017 08:07 PM, Brandon Williams wrote:
> On 08/29, Michael Haggerty wrote:
>> [...]
>> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
>> index d19b3bfba5..83a088118f 100644
>> --- a/refs/packed-backend.c
>> +++ b/refs/packed-backend.c
>> @@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
>>  static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
>>  			     struct string_list *refnames, unsigned int flags)
>>  {
>> -	die("BUG: not implemented yet");
>> +	struct packed_ref_store *refs =
>> +		packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
>> +	struct strbuf err = STRBUF_INIT;
>> +	struct ref_transaction *transaction;
>> +	struct string_list_item *item;
>> +	int ret;
>> +
>> +	(void)refs; /* We need the check above, but don't use the variable */
> 
> Can't say I've seen a line like this before, Is the intent to just mark
> that this variable won't be used like you mention in the comment?


The `(void)refs;` part tells the compiler not to complain about a
variable that is defined but never used.

So why define the variable in the first place? I could instead do

    (void)packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");

My reason (which people might disagree with) is that I like to keep this
boilerplate consistent from method to method. To switch to the second
variant, not only would the line look different, but it would also have
to be moved down several lines, past the other variables' declarations,
to avoid the dreaded "ISO C90 forbids mixed declarations and code".

I'd prefer to be able to mix declarations and code, but the project's
style prohibits it.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
@ 2017-08-30 17:21   ` Stefan Beller
  2017-08-31  3:42     ` Michael Haggerty
  2017-09-08  4:44   ` Junio C Hamano
  1 sibling, 1 reply; 31+ messages in thread
From: Stefan Beller @ 2017-08-30 17:21 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams,
	git@vger.kernel.org

On Tue, Aug 29, 2017 at 1:20 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Currently, a loose reference is deleted even before locking the
> `packed-refs` file, let alone deleting any packed version of the
> reference. This leads to two problems, demonstrated by two new tests:
>
> * While a reference is being deleted, other processes might see the
>   old, packed value of the reference for a moment before the packed
>   version is deleted. Normally this would be hard to observe, but we
>   can prolong the window by locking the `packed-refs` file externally
>   before running `update-ref`, then unlocking it before `update-ref`'s
>   attempt to acquire the lock times out.
>
> * If the `packed-refs` file is locked so long that `update-ref` fails
>   to lock it, then the reference can be left permanently in the
>   incorrect state described in the previous point.
>
> In a moment, both problems will be fixed.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
> The first of the new tests is rather involved; it uses two background
> processes plus a foreground process that polls the value of a
> reference. But despite sensitivity to timing, I think it should be
> robust even in this broken state. Once the functionality being tested
> is fixed, this test should never produce false positives, though
> really bad timing (e.g., if it takes more than a second for
> `update-ref` to get going) could still lead to false negatives.
>
> Each of the new tests takes about a second to run because they
> simulate lock contention.
>
> If anybody has suggestions for better ways to test these things,
> please speak up :-)
>
>  t/t1404-update-ref-errors.sh | 71 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 71 insertions(+)
>
> diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
> index c34ece48f5..752f83c377 100755
> --- a/t/t1404-update-ref-errors.sh
> +++ b/t/t1404-update-ref-errors.sh
> @@ -404,4 +404,75 @@ test_expect_success 'broken reference blocks indirect create' '
>         test_cmp expected output.err
>  '
>
> +test_expect_failure 'no bogus intermediate values during delete' '
> +       prefix=refs/slow-transaction &&
> +       # Set up a reference with differing loose and packed versions:
> +       git update-ref $prefix/foo $C &&
> +       git pack-refs --all &&
> +       git update-ref $prefix/foo $D &&
> +       git for-each-ref $prefix >unchanged &&
> +       # Now try to update the reference, but hold the `packed-refs` lock
> +       # for a while to see what happens while the process is blocked:
> +       : >.git/packed-refs.lock &&
> +       test_when_finished "rm -f .git/packed-refs.lock" &&
> +       {
> +               sleep 1 &&
> +               rm -f .git/packed-refs.lock &
> +       } &&
> +       pid1=$! &&
> +       {
> +               # Note: the following command is intentionally run in the
> +               # background. We extend the timeout so that `update-ref`
> +               # tries to acquire the `packed-refs` lock longer than it
> +               # takes the background process above to delete it:
> +               git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
> +       } &&
> +       pid2=$! &&
> +       ok=true &&
> +       while kill -0 $pid2 2>/dev/null

    If sig is 0, then no signal is sent, but error checking is still
    performed; this can be used to check for the existence of a
    process ID or process group ID.

So the kill -0 is the idiomatic form of "while $pid2 is still alive"?
ignoring errors due to the dev/null redirection?

And due to the nature of this test we have to have a busy
loop, we cannot rate limit the cpu usage inside the loop
via some shorter sleeps, as ideally we want to observe
the ref at any time.

    In an ideal world this test would instruct the kernel to interrupt
    the executing program (update-ref) at certain events such as
    touching/writing/deleting files and in each interrupt we could
    inspect the file system in a read only fashion.


> +       do
> +               sha1=$(git rev-parse --verify --quiet $prefix/foo || echo undefined) &&
> +               case "$sha1" in
> +               $D)
> +                       # This is OK; it just means that nothing has happened yet.
> +                       : ;;
> +               undefined)
> +                       # This is OK; it means the deletion was successful.
> +                       : ;;
> +               $C)
> +                       # This value should never be seen. Probably the loose
> +                       # reference has been deleted but the packed reference
> +                       # is still there:
> +                       echo "$prefix/foo incorrectly observed to be C" &&
> +                       break
> +                       ;;
> +               *)
> +                       # WTF?
> +                       echo "$prefix/foo unexpected value observed: $sha1" &&
> +                       break
> +                       ;;
> +               esac
> +       done >out &&
> +       wait $pid1 &&
> +       wait $pid2 &&

oh, you use explicit pids here to check each exit code.

> If anybody has suggestions for better ways to test these things,
> please speak up :-)

I don't think I'd have a satisfactory answer to that, as the timing is inherent
to the things we test. In other software projects that are less low level, I
would have suggested to use a time/clock mock, which can be stopped
and then inspection can be performed at defined states.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-08-30 17:21   ` Stefan Beller
@ 2017-08-31  3:42     ` Michael Haggerty
  0 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-08-31  3:42 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams,
	git@vger.kernel.org

On 08/30/2017 07:21 PM, Stefan Beller wrote:
> On Tue, Aug 29, 2017 at 1:20 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> [...]
>> +test_expect_failure 'no bogus intermediate values during delete' '
>> +       prefix=refs/slow-transaction &&
>> +       # Set up a reference with differing loose and packed versions:
>> +       git update-ref $prefix/foo $C &&
>> +       git pack-refs --all &&
>> +       git update-ref $prefix/foo $D &&
>> +       git for-each-ref $prefix >unchanged &&
>> +       # Now try to update the reference, but hold the `packed-refs` lock
>> +       # for a while to see what happens while the process is blocked:
>> +       : >.git/packed-refs.lock &&
>> +       test_when_finished "rm -f .git/packed-refs.lock" &&
>> +       {
>> +               sleep 1 &&
>> +               rm -f .git/packed-refs.lock &
>> +       } &&
>> +       pid1=$! &&
>> +       {
>> +               # Note: the following command is intentionally run in the
>> +               # background. We extend the timeout so that `update-ref`
>> +               # tries to acquire the `packed-refs` lock longer than it
>> +               # takes the background process above to delete it:
>> +               git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
>> +       } &&
>> +       pid2=$! &&
>> +       ok=true &&
>> +       while kill -0 $pid2 2>/dev/null
> 
>     If sig is 0, then no signal is sent, but error checking is still
>     performed; this can be used to check for the existence of a
>     process ID or process group ID.
> 
> So the kill -0 is the idiomatic form of "while $pid2 is still alive"?
> ignoring errors due to the dev/null redirection?
> 
> And due to the nature of this test we have to have a busy
> loop, we cannot rate limit the cpu usage inside the loop
> via some shorter sleeps, as ideally we want to observe
> the ref at any time.

Correct on both counts.

I just noticed that there is a stray line `ok=true &&` from an earlier
draft. I'll remove that in v2.

>     In an ideal world this test would instruct the kernel to interrupt
>     the executing program (update-ref) at certain events such as
>     touching/writing/deleting files and in each interrupt we could
>     inspect the file system in a read only fashion.

A tool like `strace` could be used for tests like this, but it would be
terribly non-portable. (But I often use strace manually to check that
the ordering of filesystem events is correct.)

>> +       do
>> +               sha1=$(git rev-parse --verify --quiet $prefix/foo || echo undefined) &&
>> +               case "$sha1" in
>> +               $D)
>> +                       # This is OK; it just means that nothing has happened yet.
>> +                       : ;;
>> +               undefined)
>> +                       # This is OK; it means the deletion was successful.
>> +                       : ;;
>> +               $C)
>> +                       # This value should never be seen. Probably the loose
>> +                       # reference has been deleted but the packed reference
>> +                       # is still there:
>> +                       echo "$prefix/foo incorrectly observed to be C" &&
>> +                       break
>> +                       ;;
>> +               *)
>> +                       # WTF?
>> +                       echo "$prefix/foo unexpected value observed: $sha1" &&
>> +                       break
>> +                       ;;
>> +               esac
>> +       done >out &&
>> +       wait $pid1 &&
>> +       wait $pid2 &&
> 
> oh, you use explicit pids here to check each exit code.
> 
>> If anybody has suggestions for better ways to test these things,
>> please speak up :-)
> 
> I don't think I'd have a satisfactory answer to that, as the timing is inherent
> to the things we test. In other software projects that are less low level, I
> would have suggested to use a time/clock mock, which can be stopped
> and then inspection can be performed at defined states.

I just realized that, given that the main goal here is to check the
value of the reference while `update-ref` is waiting on the
`packed-refs` lock, we can do the test without a busy loop. Instead, we
roughly

	: >.git/packed-refs.lock &&
	{
		git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
	} &&
	pid2=$! &&
	sleep 1 &&
	# Verify that update-ref is still running:
	kill -0 $pid2 &&
	# ...verify that the reference still has its old value...
	rm -f .git/packed-refs.lock &&
	wait $pid2 &&
	# ...verify that the reference is now gone...

It's true that this version wouldn't discover incorrect transitional
values of the reference that happen at other times, but that was very
unlikely anyway given the speed disparity between C and shell. I'll make
this change in v2.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
  2017-08-30 17:21   ` Stefan Beller
@ 2017-09-08  4:44   ` Junio C Hamano
  2017-09-08  7:45     ` Jeff King
  2017-09-08 10:06     ` Michael Haggerty
  1 sibling, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2017-09-08  4:44 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> +	git for-each-ref $prefix >actual &&
> +	grep "Unable to create $Q.*packed-refs.lock$Q: File exists" err &&

I added a squash for doing s/grep/test_i18n&/ here; are there other
issues in the series, or is the series more or less ready to be
polished in 'next'?

Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock
  2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
@ 2017-09-08  6:52   ` Jeff King
  2017-09-08 10:02     ` Michael Haggerty
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-09-08  6:52 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Tue, Aug 29, 2017 at 10:20:25AM +0200, Michael Haggerty wrote:

> The old code incremented the packed ref cache reference count when
> acquiring the packed-refs lock, and decremented the count when
> releasing the lock. This is unnecessary because a locked packed-refs
> file cannot be changed, so there is no reason for the cache to become
> stale.

Hmm, I thought that after your last series, we might hold the lock but
update the packed-refs from a separate tempfile. I.e., 42dfa7ecef
(commit_packed_refs(): use a staging file separate from the lockfile,
2017-06-23).

> Moreover, the extra reference count causes a problem if we
> intentionally clear the packed refs cache, as we sometimes need to do
> if we change the cache in anticipation of writing a change to disk,
> but then the write to disk fails. In that case, `packed_refs_unlock()`
> would have no easy way to find the cache whose reference count it
> needs to decrement.
> 
> This whole issue will soon become moot due to upcoming changes that
> avoid changing the in-memory cache as part of updating the packed-refs
> on disk, but this change makes that transition easier.

All of this makes sense, and I'm happy this complexity is going away in
the long run. But I guess what I'm wondering is in the meantime if we
can have a sequence like:

  1. We hold packed-refs.lock

  2. We update the file without releasing the lock, via 42dfa7ecef.

  3. Still holding the lock, we try to look at packed-refs. The
     stat_validity code says no, we're not up to date.

  4. We discard the old packed_ref_cache and reload it. Because its
     reference count was not incremented during step 1, we actually
     free() it.

  5. We try to look at at the old freed pointer.

There are several steps in there that might be implausible. So I'm
mostly playing devil's advocate here.

I'm wondering if the "don't validate while we hold the lock" logic in
get_packed_refs_cache() means that step 3 is impossible.

> @@ -560,9 +559,7 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
>  	 */
>  	validate_packed_ref_cache(refs);
>  
> -	packed_ref_cache = get_packed_ref_cache(refs);
> -	/* Increment the reference count to prevent it from being freed: */
> -	acquire_packed_ref_cache(packed_ref_cache);
> +	get_packed_ref_cache(refs);

It seems a bit funny to call a "get" function and throw away the return
value. Presumably we care about its side effect of updating refs->cache.
That might be worth a comment (though if this is all going away soon, I
care a lot less about niceties like that).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/10] struct ref_transaction: add a place for backends to store data
  2017-08-29  8:20 ` [PATCH 02/10] struct ref_transaction: add a place for backends to store data Michael Haggerty
@ 2017-09-08  7:02   ` Jeff King
  2017-09-08  8:19     ` Michael Haggerty
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-09-08  7:02 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Tue, Aug 29, 2017 at 10:20:26AM +0200, Michael Haggerty wrote:

> `packed_ref_store` is going to want to store some transaction-wide
> data, so make a place for it.

That makes sense, although...

> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index b02dc5a7e3..d7d344de73 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -242,6 +242,7 @@ struct ref_transaction {
>  	size_t alloc;
>  	size_t nr;
>  	enum ref_transaction_state state;
> +	void *backend_data;
>  };

This is just one pointer. Once we start layering ref backends (and
already we're moving towards a "files" layer which sits atop loose and
packed backends, right?), how do we avoid backends stomping on each
other (or worse, dereferencing somebody else's data as their own
struct)?

I don't know that we necessarily need to answer that question right now,
but I'm worried that this pattern might need adjustment eventually.

I guess the hand-wavy answer is that whatever is doing the layering
would need to manage the pointers. So if you imagine that we had a
"union" backend that took two other arbitrary backends, it would
probably have something like:

  struct union_backend_data {
	void *data_a;
	void *data_b;
  }

and when it forwarded calls to the separate backends, it would give them
a view of the transaction with only their data. Something like:

  void union_backend_foo(void *be, struct ref_transaction *transaction)
  {
	struct union_backend *me = be;
	struct union_backend_data *my_data = transaction->backend_data;

        /* "a" sees only it's data, and we remember any modifications */
	transaction->backend_data = my_data->data_a;
	me->backend_a->foo(me->backend_a, transaction);
	my_data->data_a = transaction->backend_data;

        /* ditto for "b" */
	transaction->backend_data = my_data->data_b;
	me->backend_b->foo(me->backend_b, transaction);
	my_data->data_b = transaction->backend_data;

	/* and then we restore our own view */
	transaction->backend_data = my_data;
  }

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 06/10] files_initial_transaction_commit(): use a transaction for packed refs
  2017-08-29  8:20 ` [PATCH 06/10] files_initial_transaction_commit(): use a transaction for " Michael Haggerty
@ 2017-09-08  7:27   ` Jeff King
  2017-09-08 10:04     ` Michael Haggerty
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-09-08  7:27 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Tue, Aug 29, 2017 at 10:20:30AM +0200, Michael Haggerty wrote:

> Used a `packed_ref_store` transaction in the implementation of
> `files_initial_transaction_commit()` rather than using internal
> features of the packed ref store. This further decouples
> `files_ref_store` from `packed_ref_store`.

Very nice to see these couplings going away.

Minor nit: s/Used/Use/ in the commit message.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] files_ref_store: use a transaction to update packed refs
  2017-08-29  8:20 ` [PATCH 08/10] files_ref_store: use a transaction to update packed refs Michael Haggerty
@ 2017-09-08  7:38   ` Jeff King
  2017-09-08 12:44     ` Michael Haggerty
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-09-08  7:38 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Tue, Aug 29, 2017 at 10:20:32AM +0200, Michael Haggerty wrote:

> First, the old code didn't obtain the `packed-refs` lock until
> `files_transaction_finish()`. This means that a failure to acquire the
> `packed-refs` lock (e.g., due to contention with another process)
> wasn't detected until it was too late (problems like this are supposed
> to be detected in the "prepare" phase). The new code acquires the
> `packed-refs` lock in `files_transaction_prepare()`, the same stage of
> the processing when the loose reference locks are being acquired,
> removing another reason why the "prepare" phase might succeed and the
> "finish" phase might nevertheless fail.

That means we're holding the packed-refs lock for a slightly longer
period. I think this could mean worse lock contention between otherwise
unrelated transactions over the packed-refs file. I wonder if the
lock-retry timeout might need to be increased to accommodate this. On
the other hand, it looks like we take it after getting the individual
locks, which I'd think would be the expensive part.

Are there other callers who take the packed-refs and individual locks in
the reverse order? I think git-pack-refs might. Could we "deadlock" with
pack-refs? There are timeouts so it would resolve itself quickly, but I
wonder if we'd have a case that would deadlock-until-timeout that
otherwise could succeed.

I guess such a deadlock would exist today, as well, if we take the
packed-refs lock before giving up the individual loose ref locks.

> Second, the old code deleted the loose version of a reference before
> deleting any packed version of the same reference. This left a moment
> when another process might think that the packed version of the
> reference is current, which is incorrect. (Even worse, the packed
> version of the reference can be arbitrarily old, and might even point
> at an object that has since been garbage-collected.)
> 
> Third, if a reference deletion fails to acquire the `packed-refs` lock
> altogether, then the old code might leave the repository in the
> incorrect state (possibly corrupt) described in the previous
> paragraph.

And to think I had the hubris to claim a few weeks ago that we had
probably weeded out all of the weird packed-refs inconsistency and
race-condition bugs. :)

Good find.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/10] Implement transactions for the packed ref store
  2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
                   ` (10 preceding siblings ...)
  2017-08-29 18:30 ` [PATCH 00/10] Implement transactions for the packed ref store Brandon Williams
@ 2017-09-08  7:42 ` Jeff King
  11 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2017-09-08  7:42 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Tue, Aug 29, 2017 at 10:20:24AM +0200, Michael Haggerty wrote:

> This should be the second-to-last patch series in the quest for
> mmapped packed-refs.

I just gave this a careful read, and it looks pretty good to me. I had a
few questions, but no show-stoppers. I'll admit to glossing a little bit
over the conversions in patch 3 and 4, after seeing that they're
basically adaptations of the code that goes away in patch 9.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-09-08  4:44   ` Junio C Hamano
@ 2017-09-08  7:45     ` Jeff King
  2017-09-08 10:06     ` Michael Haggerty
  1 sibling, 0 replies; 31+ messages in thread
From: Jeff King @ 2017-09-08  7:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Fri, Sep 08, 2017 at 01:44:30PM +0900, Junio C Hamano wrote:

> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
> > +	git for-each-ref $prefix >actual &&
> > +	grep "Unable to create $Q.*packed-refs.lock$Q: File exists" err &&
> 
> I added a squash for doing s/grep/test_i18n&/ here; are there other
> issues in the series, or is the series more or less ready to be
> polished in 'next'?

I think we'd want the test fix to avoid busy-waiting that was discussed
upthread. I just left a few comments, too. I suspect they will all be
answered without code changes, but I think we'd probably want to wait
for a v2.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/10] struct ref_transaction: add a place for backends to store data
  2017-09-08  7:02   ` Jeff King
@ 2017-09-08  8:19     ` Michael Haggerty
  2017-09-08  8:33       ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-09-08  8:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On 09/08/2017 09:02 AM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 10:20:26AM +0200, Michael Haggerty wrote:
> 
>> `packed_ref_store` is going to want to store some transaction-wide
>> data, so make a place for it.
> 
> That makes sense, although...
> 
>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>> index b02dc5a7e3..d7d344de73 100644
>> --- a/refs/refs-internal.h
>> +++ b/refs/refs-internal.h
>> @@ -242,6 +242,7 @@ struct ref_transaction {
>>  	size_t alloc;
>>  	size_t nr;
>>  	enum ref_transaction_state state;
>> +	void *backend_data;
>>  };
> 
> This is just one pointer. Once we start layering ref backends (and
> already we're moving towards a "files" layer which sits atop loose and
> packed backends, right?), how do we avoid backends stomping on each
> other (or worse, dereferencing somebody else's data as their own
> struct)?

My conception is that layered backends would be separated as much as
possible, and if the "top" layer needs to modify the "bottom" layer, it
would do so via a separate reference transaction on the bottom layer.
That transaction would be owned by the bottom layer, which would be able
to use the corresponding `backend_data` pointers however it likes.

You can see an example of this construct in patch 08.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/10] struct ref_transaction: add a place for backends to store data
  2017-09-08  8:19     ` Michael Haggerty
@ 2017-09-08  8:33       ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2017-09-08  8:33 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Fri, Sep 08, 2017 at 10:19:48AM +0200, Michael Haggerty wrote:

> > This is just one pointer. Once we start layering ref backends (and
> > already we're moving towards a "files" layer which sits atop loose and
> > packed backends, right?), how do we avoid backends stomping on each
> > other (or worse, dereferencing somebody else's data as their own
> > struct)?
> 
> My conception is that layered backends would be separated as much as
> possible, and if the "top" layer needs to modify the "bottom" layer, it
> would do so via a separate reference transaction on the bottom layer.
> That transaction would be owned by the bottom layer, which would be able
> to use the corresponding `backend_data` pointers however it likes.
>
> You can see an example of this construct in patch 08.

OK, that makes sense.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock
  2017-09-08  6:52   ` Jeff King
@ 2017-09-08 10:02     ` Michael Haggerty
  0 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-09-08 10:02 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On 09/08/2017 08:52 AM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 10:20:25AM +0200, Michael Haggerty wrote:
> 
>> The old code incremented the packed ref cache reference count when
>> acquiring the packed-refs lock, and decremented the count when
>> releasing the lock. This is unnecessary because a locked packed-refs
>> file cannot be changed, so there is no reason for the cache to become
>> stale.
> 
> Hmm, I thought that after your last series, we might hold the lock but
> update the packed-refs from a separate tempfile. I.e., 42dfa7ecef
> (commit_packed_refs(): use a staging file separate from the lockfile,
> 2017-06-23).
> 
>> Moreover, the extra reference count causes a problem if we
>> intentionally clear the packed refs cache, as we sometimes need to do
>> if we change the cache in anticipation of writing a change to disk,
>> but then the write to disk fails. In that case, `packed_refs_unlock()`
>> would have no easy way to find the cache whose reference count it
>> needs to decrement.
>>
>> This whole issue will soon become moot due to upcoming changes that
>> avoid changing the in-memory cache as part of updating the packed-refs
>> on disk, but this change makes that transition easier.
> 
> All of this makes sense, and I'm happy this complexity is going away in
> the long run. But I guess what I'm wondering is in the meantime if we
> can have a sequence like:
> 
>   1. We hold packed-refs.lock
> 
>   2. We update the file without releasing the lock, via 42dfa7ecef.
> 
>   3. Still holding the lock, we try to look at packed-refs. The
>      stat_validity code says no, we're not up to date.
> 
>   4. We discard the old packed_ref_cache and reload it. Because its
>      reference count was not incremented during step 1, we actually
>      free() it.
> 
>   5. We try to look at at the old freed pointer.
> 
> There are several steps in there that might be implausible. So I'm
> mostly playing devil's advocate here.
> 
> I'm wondering if the "don't validate while we hold the lock" logic in
> get_packed_refs_cache() means that step 3 is impossible.

That's one of the reasons your scenario can't happen, but that just begs
the question of whether the "don't validate while we hold the lock" code
is wrongheaded.

In fact it's OK. The method by which the packed-refs file on disk is
modified at this point in the patch series is by modifying the packed
ref-cache and then writing the data from the ref-cache to disk. So the
packed ref-cache remains fresh because any changes that we plan to make
to the file are made in the cache first anyway.

I'll explain that a bit better in the log message.

The next question is whether this change interacts badly with changes
later in the patch series, when we cease to modify the ref-cache before
writing the new packed-refs file. Here we're OK, too, because
immediately after `rename_tempfile()` is used to rename the new
packed-refs file into place, we clear the packed ref-cache, so no
subsequent callers of `get_packed_refs_cache()` should see the stale cache.

Which in turn is partly because your step 5 is also implausible: code
shouldn't be holding a pointer to the packed ref-cache across operations
that might change the file. (Code that calls `get_packed_refs_cache()`
is OK because that function would see that `refs->cache` is NULL and
reload it regardless of whether we hold the lock.)

So I think everything is OK, but thanks for making me think through and
explain it better :-)

>> @@ -560,9 +559,7 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
>>  	 */
>>  	validate_packed_ref_cache(refs);
>>  
>> -	packed_ref_cache = get_packed_ref_cache(refs);
>> -	/* Increment the reference count to prevent it from being freed: */
>> -	acquire_packed_ref_cache(packed_ref_cache);
>> +	get_packed_ref_cache(refs);
> 
> It seems a bit funny to call a "get" function and throw away the return
> value. Presumably we care about its side effect of updating refs->cache.
> That might be worth a comment (though if this is all going away soon, I
> care a lot less about niceties like that).

I'm rerolling anyway, so I'll add a comment. Thanks.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 06/10] files_initial_transaction_commit(): use a transaction for packed refs
  2017-09-08  7:27   ` Jeff King
@ 2017-09-08 10:04     ` Michael Haggerty
  0 siblings, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-09-08 10:04 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On 09/08/2017 09:27 AM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 10:20:30AM +0200, Michael Haggerty wrote:
> 
>> Used a `packed_ref_store` transaction in the implementation of
>> `files_initial_transaction_commit()` rather than using internal
>> features of the packed ref store. This further decouples
>> `files_ref_store` from `packed_ref_store`.
> 
> Very nice to see these couplings going away.
> 
> Minor nit: s/Used/Use/ in the commit message.

Thanks; will fix.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
  2017-09-08  4:44   ` Junio C Hamano
  2017-09-08  7:45     ` Jeff King
@ 2017-09-08 10:06     ` Michael Haggerty
  1 sibling, 0 replies; 31+ messages in thread
From: Michael Haggerty @ 2017-09-08 10:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller, Jeff King,
	Ævar Arnfjörð Bjarmason, Brandon Williams, git

On 09/08/2017 06:44 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> +	git for-each-ref $prefix >actual &&
>> +	grep "Unable to create $Q.*packed-refs.lock$Q: File exists" err &&
> 
> I added a squash for doing s/grep/test_i18n&/ here

Thanks for the fix. I always forget that gotcha.

> are there other
> issues in the series, or is the series more or less ready to be
> polished in 'next'?

I'm working on v2 right now.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] files_ref_store: use a transaction to update packed refs
  2017-09-08  7:38   ` Jeff King
@ 2017-09-08 12:44     ` Michael Haggerty
  2017-09-08 12:57       ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Michael Haggerty @ 2017-09-08 12:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On 09/08/2017 09:38 AM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 10:20:32AM +0200, Michael Haggerty wrote:
> 
>> First, the old code didn't obtain the `packed-refs` lock until
>> `files_transaction_finish()`. This means that a failure to acquire the
>> `packed-refs` lock (e.g., due to contention with another process)
>> wasn't detected until it was too late (problems like this are supposed
>> to be detected in the "prepare" phase). The new code acquires the
>> `packed-refs` lock in `files_transaction_prepare()`, the same stage of
>> the processing when the loose reference locks are being acquired,
>> removing another reason why the "prepare" phase might succeed and the
>> "finish" phase might nevertheless fail.
> 
> That means we're holding the packed-refs lock for a slightly longer
> period. I think this could mean worse lock contention between otherwise
> unrelated transactions over the packed-refs file. I wonder if the
> lock-retry timeout might need to be increased to accommodate this. On
> the other hand, it looks like we take it after getting the individual
> locks, which I'd think would be the expensive part.

That was my thinking, yes. While the packed-refs lock is held, the
references being created/updated have their reflogs written and are
renamed into place. I don't see how that can be shortened without
compromising on correctness (in particular, that we want to process
creates/updates before deletions to try to preserve reachability as much
as possible during the transaction). As an added optimization, the
packed-refs lock is not acquired at all if no references are being deleted.

> Are there other callers who take the packed-refs and individual locks in
> the reverse order? I think git-pack-refs might. Could we "deadlock" with
> pack-refs? There are timeouts so it would resolve itself quickly, but I
> wonder if we'd have a case that would deadlock-until-timeout that
> otherwise could succeed.

That's not true. `files_pack_refs()` does the following:

1. Lock the `packed-refs` file.

2. Start a packed ref-store transaction.

3. Iterate over the loose ref cache. For each reference that should
   be packed:
   * add it to the packed-refs transaction as an update to set it
     to the loose value (without specifying an old_sha1)
   * if pruning is on, schedule the loose reference to be pruned
     (in an internal data structure, without locking the file).

4. Commit the packed ref-store transaction.

5. Release the packed-refs lock.

6. For each to-prune loose ref:
   * lock the loose reference
   * verify that it still has the value that was just written to
     the `packed-refs` file
   * if so, delete the loose version of the reference
   * unlock the loose reference

The packed-refs file and loose references are never locked at the same
time during pack-refs, so no deadlock is possible.

But you are right to assume that it *should* be so. The algorithm
written above is a tiny bit unsafe (and has been for years). It is
possible, though admittedly very unlikely, for the following to happen
in the gap between steps 5 and 6:

1. One process updates the value of branch "foo" from A to B.
   (Remember that A is the value that was just written to the
   `packed-refs` file by step 4.)

2. `pack-refs` is run *again* (while the first `pack-refs` is out
   on its lunch break), writes value B to the `packed-refs` file
   for "foo", and deletes the loose version of "foo".

3. Yet *another* process changes "foo" back from B to A. This
   creates a loose version of "foo" with value A, which overwrites
   the packed version with value B.

4. The first `pack-refs` process resumes at step 6, sees that
   "foo" "still" has the value "A", so deletes the loose reference.

This would leave "foo" at the obsolete value "B" (i.e., the value
written to the `packed-refs` file for it by the nested `pack-refs` process.

I think that fixing this problem would require the `packed-refs` lock to
be held while `pack-refs` is pruning the loose references. But given how
unlikely that chain of events seems, and that fixing it would increase
contention on the `packed-refs` file and allow the deadlock that you
described, I lean towards leaving it as-is. Though admittedly,
contention over a loose reference lock could make the race more likely
to be hit.

I also just noticed that the `refs_to_prune` linked list is leaked here
(as has also been the case for many years). I'll fix that separately.

>> Second, the old code deleted the loose version of a reference before
>> deleting any packed version of the same reference. This left a moment
>> when another process might think that the packed version of the
>> reference is current, which is incorrect. (Even worse, the packed
>> version of the reference can be arbitrarily old, and might even point
>> at an object that has since been garbage-collected.)
>>
>> Third, if a reference deletion fails to acquire the `packed-refs` lock
>> altogether, then the old code might leave the repository in the
>> incorrect state (possibly corrupt) described in the previous
>> paragraph.
> 
> And to think I had the hubris to claim a few weeks ago that we had
> probably weeded out all of the weird packed-refs inconsistency and
> race-condition bugs. :)

Haha, job security.

Michael

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] files_ref_store: use a transaction to update packed refs
  2017-09-08 12:44     ` Michael Haggerty
@ 2017-09-08 12:57       ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2017-09-08 12:57 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Stefan Beller, Ævar Arnfjörð Bjarmason,
	Brandon Williams, git

On Fri, Sep 08, 2017 at 02:44:57PM +0200, Michael Haggerty wrote:

> > That means we're holding the packed-refs lock for a slightly longer
> > period. I think this could mean worse lock contention between otherwise
> > unrelated transactions over the packed-refs file. I wonder if the
> > lock-retry timeout might need to be increased to accommodate this. On
> > the other hand, it looks like we take it after getting the individual
> > locks, which I'd think would be the expensive part.
> 
> That was my thinking, yes. While the packed-refs lock is held, the
> references being created/updated have their reflogs written and are
> renamed into place. I don't see how that can be shortened without
> compromising on correctness (in particular, that we want to process
> creates/updates before deletions to try to preserve reachability as much
> as possible during the transaction). As an added optimization, the
> packed-refs lock is not acquired at all if no references are being deleted.

Makes sense.  I guess in theory a process could do significant unrelated
work between the "prepare" and "finish" steps, while holding the lock.
But I don't know why it would, and arguably that itself is a bug.

> The packed-refs file and loose references are never locked at the same
> time during pack-refs, so no deadlock is possible.
> 
> But you are right to assume that it *should* be so. The algorithm
> written above is a tiny bit unsafe (and has been for years). It is
> possible, though admittedly very unlikely, for the following to happen
> in the gap between steps 5 and 6:

Thanks for explaining (and I think that your "should" is why I thought
it was so; we've discussed this race before).

> This would leave "foo" at the obsolete value "B" (i.e., the value
> written to the `packed-refs` file for it by the nested `pack-refs` process.
> 
> I think that fixing this problem would require the `packed-refs` lock to
> be held while `pack-refs` is pruning the loose references. But given how
> unlikely that chain of events seems, and that fixing it would increase
> contention on the `packed-refs` file and allow the deadlock that you
> described, I lean towards leaving it as-is. Though admittedly,
> contention over a loose reference lock could make the race more likely
> to be hit.

Agreed. It's a pretty unlikely sequence of events. And IMHO the real
solution is a new storage format that's easier to reason about.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-09-08 12:57 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
2017-09-08  6:52   ` Jeff King
2017-09-08 10:02     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 02/10] struct ref_transaction: add a place for backends to store data Michael Haggerty
2017-09-08  7:02   ` Jeff King
2017-09-08  8:19     ` Michael Haggerty
2017-09-08  8:33       ` Jeff King
2017-08-29  8:20 ` [PATCH 03/10] packed_ref_store: implement reference transactions Michael Haggerty
2017-08-29  8:20 ` [PATCH 04/10] packed_delete_refs(): implement method Michael Haggerty
2017-08-29 18:07   ` Brandon Williams
2017-08-30  3:00     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs Michael Haggerty
2017-08-29  8:20 ` [PATCH 06/10] files_initial_transaction_commit(): use a transaction for " Michael Haggerty
2017-09-08  7:27   ` Jeff King
2017-09-08 10:04     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
2017-08-30 17:21   ` Stefan Beller
2017-08-31  3:42     ` Michael Haggerty
2017-09-08  4:44   ` Junio C Hamano
2017-09-08  7:45     ` Jeff King
2017-09-08 10:06     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 08/10] files_ref_store: use a transaction to update packed refs Michael Haggerty
2017-09-08  7:38   ` Jeff King
2017-09-08 12:44     ` Michael Haggerty
2017-09-08 12:57       ` Jeff King
2017-08-29  8:20 ` [PATCH 09/10] packed-backend: rip out some now-unused code Michael Haggerty
2017-08-29 18:24   ` Brandon Williams
2017-08-29  8:20 ` [PATCH 10/10] files_transaction_finish(): delete reflogs before references Michael Haggerty
2017-08-29 18:30 ` [PATCH 00/10] Implement transactions for the packed ref store Brandon Williams
2017-09-08  7:42 ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).