git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Stefan Beller <sbeller@google.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Jeff King <peff@peff.net>,
	git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH v2 10/18] refs: check for D/F conflicts among refs created in a transaction
Date: Mon, 11 May 2015 17:25:12 +0200	[thread overview]
Message-ID: <1431357920-25090-11-git-send-email-mhagger@alum.mit.edu> (raw)
In-Reply-To: <1431357920-25090-1-git-send-email-mhagger@alum.mit.edu>

If two references that D/F conflict (e.g., "refs/foo" and
"refs/foo/bar") are created in a single transaction, the old code
discovered the problem only after the "commit" phase of
ref_transaction_commit() had already begun. This could leave some
references updated and others not, which violates the promise of
atomicity.

Instead, check for such conflicts during the "locking" phase:

* Teach is_refname_available() to take an "extras" parameter that can
  contain extra reference names with which the specified refname must
  not conflict.

* Change lock_ref_sha1_basic() to take an "extras" parameter, which it
  passes through to is_refname_available().

* Change ref_transaction_commit() to pass "affected_refnames" to
  lock_ref_sha1_basic() as its "extras" argument.

This change fixes a test case in t1404.

This code is a bit stricter than it needs to be. We could conceivably
allow reference "refs/foo/bar" to be created in the same transaction
as "refs/foo" is deleted (or vice versa). But that would be
complicated to implement, because it is not possible to lock
"refs/foo/bar" while "refs/foo" exists as a loose reference, but on
the other hand we don't want to delete some references before adding
others (because that could leave a gap during which required objects
are unreachable). There is also a complication that reflog files'
paths can conflict.

Any less-strict implementation would probably require tricks like the
packing of all references before the start of the real transaction, or
the use of temporary intermediate reference names.

So for now let's accept too-strict checks. Some reference update
transactions will be rejected unnecessarily, but they will be rejected
in their entirety rather than leaving the repository in an
intermediate state, as would happen now.

Please note that there is still one kind of D/F conflict that is *not*
handled correctly. If two processes are running at the same time, and
one tries to create "refs/foo" at the same time that the other tries
to create "refs/foo/bar", then they can race with each other. Both
processes can obtain their respective locks ("refs/foo.lock" and
"refs/foo/bar.lock"), proceed to the "commit" phase of
ref_transaction_commit(), and then the slower process will discover
that it cannot rename its lockfile into place (after possibly having
committed changes to other references). There appears to be no way to
fix this race without changing the locking policy, which in turn would
require a change to *all* Git clients.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c                             | 156 ++++++++++++++++++++++---------------
 t/t1404-update-ref-df-conflicts.sh |   2 +-
 2 files changed, 95 insertions(+), 63 deletions(-)

diff --git a/refs.c b/refs.c
index 6bb65ab..0fa70fb 100644
--- a/refs.c
+++ b/refs.c
@@ -859,19 +859,22 @@ static int nonmatching_ref_fn(struct ref_entry *entry, void *vdata)
 
 /*
  * Return true iff a reference named refname could be created without
- * conflicting with the name of an existing reference in dir.  If
- * skip is non-NULL, ignore potential conflicts with refs in skip
- * (e.g., because they are scheduled for deletion in the same
- * operation).
+ * conflicting with the name of an existing reference in dir. If
+ * extras is non-NULL, it is a list of additional refnames with which
+ * refname is not allowed to conflict. If skip is non-NULL, ignore
+ * potential conflicts with refs in skip (e.g., because they are
+ * scheduled for deletion in the same operation). Behavior is
+ * undefined if the same name is listed in both extras and skip.
  *
  * Two reference names conflict if one of them exactly matches the
  * leading components of the other; e.g., "refs/foo/bar" conflicts
  * with both "refs/foo" and with "refs/foo/bar/baz" but not with
  * "refs/foo/bar" or "refs/foo/barbados".
  *
- * skip must be sorted.
+ * extras and skip must be sorted.
  */
 static int is_refname_available(const char *refname,
+				const struct string_list *extras,
 				const struct string_list *skip,
 				struct ref_dir *dir)
 {
@@ -895,51 +898,53 @@ static int is_refname_available(const char *refname,
 		 * "refs/foo"; if there is a reference with that name,
 		 * it is a conflict, *unless* it is in skip.
 		 */
-		pos = search_ref_dir(dir, dirname.buf, dirname.len);
-		if (pos >= 0) {
-			/*
-			 * We found a reference whose name is a proper
-			 * prefix of refname; e.g., "refs/foo".
-			 */
-			if (skip && string_list_has_string(skip, dirname.buf)) {
+		if (dir) {
+			pos = search_ref_dir(dir, dirname.buf, dirname.len);
+			if (pos >= 0 &&
+			    (!skip || !string_list_has_string(skip, dirname.buf))) {
 				/*
-				 * The reference we just found, e.g.,
-				 * "refs/foo", is also in skip, so it
-				 * is not considered a conflict.
-				 * Moreover, the fact that "refs/foo"
-				 * exists means that there cannot be
-				 * any references anywhere under the
-				 * "refs/foo/" namespace (because they
-				 * would have conflicted with
-				 * "refs/foo"). So we can stop looking
-				 * now and return true.
+				 * We found a reference whose name is
+				 * a proper prefix of refname; e.g.,
+				 * "refs/foo", and is not in skip.
 				 */
-				ret = 1;
+				error("'%s' exists; cannot create '%s'",
+				      dirname.buf, refname);
 				goto cleanup;
 			}
-			error("'%s' exists; cannot create '%s'", dirname.buf, refname);
-			goto cleanup;
 		}
 
+		if (extras && string_list_has_string(extras, dirname.buf) &&
+		    (!skip || !string_list_has_string(skip, dirname.buf))) {
+			error("cannot process '%s' and '%s' at the same time",
+			      refname, dirname.buf);
+			goto cleanup;
+		}
 
 		/*
 		 * Otherwise, we can try to continue our search with
 		 * the next component. So try to look up the
-		 * directory, e.g., "refs/foo/".
+		 * directory, e.g., "refs/foo/". If we come up empty,
+		 * we know there is nothing under this whole prefix,
+		 * but even in that case we still have to continue the
+		 * search for conflicts with extras.
 		 */
 		strbuf_addch(&dirname, '/');
-		pos = search_ref_dir(dir, dirname.buf, dirname.len);
-		if (pos < 0) {
-			/*
-			 * There was no directory "refs/foo/", so
-			 * there is nothing under this whole prefix,
-			 * and we are OK.
-			 */
-			ret = 1;
-			goto cleanup;
+		if (dir) {
+			pos = search_ref_dir(dir, dirname.buf, dirname.len);
+			if (pos < 0) {
+				/*
+				 * There was no directory "refs/foo/",
+				 * so there is nothing under this
+				 * whole prefix. So there is no need
+				 * to continue looking for conflicting
+				 * references. But we need to continue
+				 * looking for conflicting extras.
+				 */
+				dir = NULL;
+			} else {
+				dir = get_ref_dir(dir->entries[pos]);
+			}
 		}
-
-		dir = get_ref_dir(dir->entries[pos]);
 	}
 
 	/*
@@ -952,30 +957,56 @@ static int is_refname_available(const char *refname,
 	 */
 	strbuf_addstr(&dirname, refname + dirname.len);
 	strbuf_addch(&dirname, '/');
-	pos = search_ref_dir(dir, dirname.buf, dirname.len);
 
-	if (pos >= 0) {
+	if (dir) {
+		pos = search_ref_dir(dir, dirname.buf, dirname.len);
+
+		if (pos >= 0) {
+			/*
+			 * We found a directory named "$refname/"
+			 * (e.g., "refs/foo/bar/"). It is a problem
+			 * iff it contains any ref that is not in
+			 * "skip".
+			 */
+			struct nonmatching_ref_data data;
+
+			data.skip = skip;
+			data.conflicting_refname = NULL;
+			dir = get_ref_dir(dir->entries[pos]);
+			sort_ref_dir(dir);
+			if (do_for_each_entry_in_dir(dir, 0, nonmatching_ref_fn, &data)) {
+				error("'%s' exists; cannot create '%s'",
+				      data.conflicting_refname, refname);
+				goto cleanup;
+			}
+		}
+	}
+
+	if (extras) {
 		/*
-		 * We found a directory named "$refname/" (e.g.,
-		 * "refs/foo/bar/"). It is a problem iff it contains
-		 * any ref that is not in "skip".
+		 * Check for entries in extras that start with
+		 * "$refname/". We do that by looking for the place
+		 * where "$refname/" would be inserted in extras. If
+		 * there is an entry at that position that starts with
+		 * "$refname/" and is not in skip, then we have a
+		 * conflict.
 		 */
-		struct nonmatching_ref_data data;
-		struct ref_entry *entry = dir->entries[pos];
-
-		dir = get_ref_dir(entry);
-		data.skip = skip;
-		sort_ref_dir(dir);
-		if (!do_for_each_entry_in_dir(dir, 0, nonmatching_ref_fn, &data)) {
-			ret = 1;
-			goto cleanup;
-		}
+		for (pos = string_list_find_insert_index(extras, dirname.buf, 0);
+		     pos < extras->nr; pos++) {
+			const char *extra_refname = extras->items[pos].string;
 
-		error("'%s' exists; cannot create '%s'",
-		      data.conflicting_refname, refname);
-		goto cleanup;
+			if (!starts_with(extra_refname, dirname.buf))
+				break;
+
+			if (!skip || !string_list_has_string(skip, extra_refname)) {
+				error("cannot process '%s' and '%s' at the same time",
+				      refname, extra_refname);
+				goto cleanup;
+			}
+		}
 	}
 
+	/* No conflicts were found */
 	ret = 1;
 
 cleanup:
@@ -2296,6 +2327,7 @@ int dwim_log(const char *str, int len, unsigned char *sha1, char **log)
  */
 static struct ref_lock *lock_ref_sha1_basic(const char *refname,
 					    const unsigned char *old_sha1,
+					    const struct string_list *extras,
 					    const struct string_list *skip,
 					    unsigned int flags, int *type_p)
 {
@@ -2351,7 +2383,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname,
 	 * our refname.
 	 */
 	if (is_null_sha1(lock->old_sha1) &&
-	     !is_refname_available(refname, skip, get_packed_refs(&ref_cache))) {
+	    !is_refname_available(refname, extras, skip, get_packed_refs(&ref_cache))) {
 		last_errno = ENOTDIR;
 		goto error_return;
 	}
@@ -2792,8 +2824,8 @@ static int rename_ref_available(const char *oldname, const char *newname)
 	int ret;
 
 	string_list_insert(&skip, oldname);
-	ret = is_refname_available(newname, &skip, get_packed_refs(&ref_cache))
-	    && is_refname_available(newname, &skip, get_loose_refs(&ref_cache));
+	ret = is_refname_available(newname, NULL, &skip, get_packed_refs(&ref_cache))
+		&& is_refname_available(newname, NULL, &skip, get_loose_refs(&ref_cache));
 	string_list_clear(&skip, 0);
 	return ret;
 }
@@ -2851,7 +2883,7 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 
 	logmoved = log;
 
-	lock = lock_ref_sha1_basic(newrefname, NULL, NULL, 0, NULL);
+	lock = lock_ref_sha1_basic(newrefname, NULL, NULL, NULL, 0, NULL);
 	if (!lock) {
 		error("unable to lock %s for update", newrefname);
 		goto rollback;
@@ -2865,7 +2897,7 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 	return 0;
 
  rollback:
-	lock = lock_ref_sha1_basic(oldrefname, NULL, NULL, 0, NULL);
+	lock = lock_ref_sha1_basic(oldrefname, NULL, NULL, NULL, 0, NULL);
 	if (!lock) {
 		error("unable to lock %s for rollback", oldrefname);
 		goto rollbacklog;
@@ -3777,7 +3809,7 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 				update->refname,
 				((update->flags & REF_HAVE_OLD) ?
 				 update->old_sha1 : NULL),
-				NULL,
+				&affected_refnames, NULL,
 				flags,
 				&update->type);
 		if (!update->lock) {
@@ -4054,7 +4086,7 @@ int reflog_expire(const char *refname, const unsigned char *sha1,
 	 * reference itself, plus we might need to update the
 	 * reference if --updateref was specified:
 	 */
-	lock = lock_ref_sha1_basic(refname, sha1, NULL, 0, &type);
+	lock = lock_ref_sha1_basic(refname, sha1, NULL, NULL, 0, &type);
 	if (!lock)
 		return error("cannot lock ref '%s'", refname);
 	if (!reflog_exists(refname)) {
diff --git a/t/t1404-update-ref-df-conflicts.sh b/t/t1404-update-ref-df-conflicts.sh
index ed1b93e..b0bb7d4 100755
--- a/t/t1404-update-ref-df-conflicts.sh
+++ b/t/t1404-update-ref-df-conflicts.sh
@@ -96,7 +96,7 @@ test_expect_success 'new ref is a deeper prefix of existing packed' '
 
 '
 
-test_expect_failure 'one new ref is a simple prefix of another' '
+test_expect_success 'one new ref is a simple prefix of another' '
 
 	prefix=refs/5 &&
 	test_update_rejected $prefix "a e" false "b c c/x d" \
-- 
2.1.4

  parent reply	other threads:[~2015-05-11 15:26 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-11 15:25 [PATCH v2 00/18] Improve handling of D/F conflicts Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 01/18] t1404: new tests of ref D/F conflicts within transactions Michael Haggerty
2015-05-11 18:52   ` Junio C Hamano
2015-05-12  8:45     ` Michael Haggerty
2015-05-11 19:37   ` Junio C Hamano
2015-05-12  8:32     ` Michael Haggerty
2015-05-12 15:45       ` Junio C Hamano
2015-05-13 20:19         ` Michael Haggerty
2015-05-14 17:00           ` Junio C Hamano
2015-05-22 22:47             ` Michael Haggerty
2015-05-23  0:52               ` Junio C Hamano
2015-05-11 15:25 ` [PATCH v2 02/18] is_refname_available(): revamp the comments Michael Haggerty
2015-05-12 21:04   ` Junio C Hamano
2015-05-11 15:25 ` [PATCH v2 03/18] is_refname_available(): avoid shadowing "dir" variable Michael Haggerty
2015-05-12 21:06   ` Junio C Hamano
2015-05-11 15:25 ` [PATCH v2 04/18] is_refname_available(): convert local variable "dirname" to strbuf Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 05/18] entry_matches(): inline function Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 06/18] report_refname_conflict(): " Michael Haggerty
2015-05-12 21:21   ` Junio C Hamano
2015-05-11 15:25 ` [PATCH v2 07/18] struct nonmatching_ref_data: store a refname instead of a ref_entry Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 08/18] is_refname_available(): use dirname in first loop Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 09/18] ref_transaction_commit(): use a string_list for detecting duplicates Michael Haggerty
2015-05-11 15:25 ` Michael Haggerty [this message]
2015-05-11 15:25 ` [PATCH v2 11/18] verify_refname_available(): rename function Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 12/18] verify_refname_available(): report errors via a "struct strbuf *err" Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 13/18] lock_ref_sha1_basic(): " Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 14/18] lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 15/18] rename_ref(): integrate lock_ref_sha1_basic() errors into ours Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 16/18] ref_transaction_commit(): provide better error messages Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 17/18] ref_transaction_commit(): delete extra "the" from error message Michael Haggerty
2015-05-11 15:25 ` [PATCH v2 18/18] reflog_expire(): integrate lock_ref_sha1_basic() errors into ours Michael Haggerty
2015-05-11 17:56 ` [PATCH v2 00/18] Improve handling of D/F conflicts Junio C Hamano
2015-05-11 18:10   ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1431357920-25090-11-git-send-email-mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).