git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org, David Turner <dturner@twopensource.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>
Subject: [PATCH 27/29] lock_ref_for_update(): don't resolve symrefs
Date: Wed, 27 Apr 2016 18:57:44 +0200	[thread overview]
Message-ID: <dc9cd5747320dbbfc328f79b60ba8d6d78764fff.1461768690.git.mhagger@alum.mit.edu> (raw)
In-Reply-To: <cover.1461768689.git.mhagger@alum.mit.edu>

If a transaction includes a non-NODEREF update to a symbolic reference,
we don't have to look it up in lock_ref_for_update(). The reference will
be dereferenced anyway when the split-off update is processed.

This change requires that we store a backpointer from the split-off
update to its parent update, for two reasons:

* We still want to report the original reference name in error messages.
  So if an error occurs when checking the split-off update's old_sha1,
  walk the parent_update pointers back to find the original reference
  name, and report that one.

* We still need to write the old_sha1 of the symref to its reflog. So
  after we read the split-off update's reference value, walk the
  parent_update pointers back and fill in their old_sha1 fields.

Aside from eliminating unnecessary reads, this change fixes a
subtle (though not very serious) race condition: in the old code, the
old_sha1 of the symref was resolved before the reference that it pointed
at was locked. So it was possible that the old_sha1 value logged to the
symref's reflog could be wrong if another process changed the downstream
reference before it was locked.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
There is one remaining race that I know of: the value of HEAD is read
at the start of the transaction to determine whether its referent is
changed in the transaction. If so, the update is also logged in HEAD's
reflog.

The problem is that HEAD is read without acquiring its lock. So in
principle HEAD could change during the time that the transaction is
being processed, leading to inconsistencies in the reflogs.

This problem could be fixed, but it would require HEAD to be locked
for every reference transaction, probably in some kind of an
additional fake ref_update. I don't know that it's worth the effort,
but if it is it can be done within the same framework that I
implemented here.

 refs/files-backend.c | 108 +++++++++++++++++++++++++++++++++++++--------------
 refs/refs-internal.h |  17 ++++++++
 2 files changed, 95 insertions(+), 30 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 66ceb83..40ed157 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3366,8 +3366,15 @@ static int split_symref_update(struct ref_update *update,
 			update->new_sha1, update->old_sha1,
 			update->msg);
 
-	/* Change the symbolic ref update to log only: */
+	new_update->parent_update = update;
+
+	/*
+	 * Change the symbolic ref update to log only. Also, it
+	 * doesn't need to check its old SHA-1 value, as that will be
+	 * done when new_update is processed.
+	 */
 	update->flags |= REF_LOG_ONLY | REF_NODEREF;
+	update->flags &= ~REF_HAVE_OLD;
 
 	item->util = new_update;
 
@@ -3375,6 +3382,17 @@ static int split_symref_update(struct ref_update *update,
 }
 
 /*
+ * Return the refname under which update was originally requested.
+ */
+static const char *original_update_refname(struct ref_update *update)
+{
+	while (update->parent_update)
+		update = update->parent_update;
+
+	return update->refname;
+}
+
+/*
  * Prepare for carrying out update:
  * - Lock the reference referred to by update.
  * - Read the reference under lock.
@@ -3428,44 +3446,74 @@ static int lock_ref_for_update(struct ref_update *update,
 	lock = update->lock;
 
 	if (update->type & REF_ISSYMREF) {
-		if (read_ref_full(update->refname,
-				  mustexist ? RESOLVE_REF_READING : 0,
-				  lock->old_oid.hash, NULL)) {
-			if (update->flags & REF_HAVE_OLD) {
-				strbuf_addf(err, "cannot lock ref '%s': can't resolve old value",
-					    update->refname);
+		if (update->flags & REF_NODEREF) {
+			/*
+			 * We won't be reading the referent as part of
+			 * the transaction, so we have to read it here
+			 * to record and possibly check old_sha1:
+			 */
+			if (read_ref_full(update->refname,
+					  mustexist ? RESOLVE_REF_READING : 0,
+					  lock->old_oid.hash, NULL)) {
+				if (update->flags & REF_HAVE_OLD) {
+					strbuf_addf(err, "cannot lock ref '%s': "
+						    "can't resolve old value",
+						    update->refname);
+					return TRANSACTION_GENERIC_ERROR;
+				} else {
+					hashclr(lock->old_oid.hash);
+				}
+			}
+			if ((update->flags & REF_HAVE_OLD) &&
+			    hashcmp(lock->old_oid.hash, update->old_sha1)) {
+				strbuf_addf(err, "cannot lock ref '%s': "
+					    "is at %s but expected %s",
+					    update->refname,
+					    sha1_to_hex(lock->old_oid.hash),
+					    sha1_to_hex(update->old_sha1));
 				return TRANSACTION_GENERIC_ERROR;
-			} else {
-				hashclr(lock->old_oid.hash);
 			}
-		}
-		if ((update->flags & REF_HAVE_OLD) &&
-		    hashcmp(lock->old_oid.hash, update->old_sha1)) {
-			strbuf_addf(err, "cannot lock ref '%s': is at %s but expected %s",
-				    update->refname,
-				    sha1_to_hex(lock->old_oid.hash),
-				    sha1_to_hex(update->old_sha1));
-			return TRANSACTION_GENERIC_ERROR;
-		}
 
-		if (!(update->flags & REF_NODEREF)) {
+		} else {
+			/*
+			 * Create a new update for the reference this
+			 * symref is pointing at. Also, we will record
+			 * and verify old_sha1 for this update as part
+			 * of processing the split-off update, so we
+			 * don't have to do it here.
+			 */
 			ret = split_symref_update(update, referent.buf, transaction,
 						  affected_refnames, err);
 			if (ret)
 				return ret;
 		}
-	} else if ((update->flags & REF_HAVE_OLD) &&
-		   hashcmp(lock->old_oid.hash, update->old_sha1)) {
-		if (is_null_sha1(update->old_sha1))
-			strbuf_addf(err, "cannot lock ref '%s': reference already exists",
-				    update->refname);
-		else
-			strbuf_addf(err, "cannot lock ref '%s': is at %s but expected %s",
-				    update->refname,
-				    sha1_to_hex(lock->old_oid.hash),
-				    sha1_to_hex(update->old_sha1));
+	} else {
+		struct ref_update *parent_update;
 
-		return TRANSACTION_GENERIC_ERROR;
+		/*
+		 * If this update is happening indirectly because of a
+		 * symref update, record the old SHA-1 in the parent
+		 * update:
+		 */
+		for (parent_update = update->parent_update;
+		     parent_update;
+		     parent_update = parent_update->parent_update) {
+			oidcpy(&parent_update->lock->old_oid, &lock->old_oid);
+		}
+
+		if ((update->flags & REF_HAVE_OLD) &&
+		    hashcmp(lock->old_oid.hash, update->old_sha1)) {
+			if (is_null_sha1(update->old_sha1))
+				strbuf_addf(err, "cannot lock ref '%s': reference already exists",
+					    original_update_refname(update));
+			else
+				strbuf_addf(err, "cannot lock ref '%s': is at %s but expected %s",
+					    original_update_refname(update),
+					    sha1_to_hex(lock->old_oid.hash),
+					    sha1_to_hex(update->old_sha1));
+
+			return TRANSACTION_GENERIC_ERROR;
+		}
 	}
 
 	if ((update->flags & REF_HAVE_NEW) &&
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 9839b07..2355735 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -143,24 +143,41 @@ int should_autocreate_reflog(const char *refname);
  * not exist before update.
  */
 struct ref_update {
+
 	/*
 	 * If (flags & REF_HAVE_NEW), set the reference to this value:
 	 */
 	unsigned char new_sha1[20];
+
 	/*
 	 * If (flags & REF_HAVE_OLD), check that the reference
 	 * previously had this value:
 	 */
 	unsigned char old_sha1[20];
+
 	/*
 	 * One or more of REF_HAVE_NEW, REF_HAVE_OLD, REF_NODEREF,
 	 * REF_DELETING, REF_ISPRUNING, REF_LOG_ONLY, and
 	 * REF_UPDATE_VIA_HEAD:
 	 */
 	unsigned int flags;
+
 	struct ref_lock *lock;
 	unsigned int type;
 	char *msg;
+
+	/*
+	 * If this ref_update was split off of a symref update via
+	 * split_symref_update(), then this member points at that
+	 * update. This is used for two purposes:
+	 * 1. When reporting errors, we report the refname under which
+	 *    the update was originally requested.
+	 * 2. When we read the old value of this reference, we
+	 *    propagate it back to its parent update for recording in
+	 *    the latter's reflog.
+	 */
+	struct ref_update *parent_update;
+
 	const char refname[FLEX_ARRAY];
 };
 
-- 
2.8.1

  parent reply	other threads:[~2016-04-27 16:59 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-27 16:57 [PATCH 00/29] Yet more preparation for reference backends Michael Haggerty
2016-04-27 16:57 ` [PATCH 01/29] safe_create_leading_directories(): improve docstring Michael Haggerty
2016-04-27 16:57 ` [PATCH 02/29] remove_dir_recursively(): add docstring Michael Haggerty
2016-04-27 16:57 ` [PATCH 03/29] refname_is_safe(): use skip_prefix() Michael Haggerty
2016-04-27 16:57 ` [PATCH 04/29] refname_is_safe(): don't allow the empty string Michael Haggerty
2016-04-27 16:57 ` [PATCH 05/29] refname_is_safe(): insist that the refname already be normalized Michael Haggerty
2016-04-27 17:59   ` Junio C Hamano
2016-04-27 20:10     ` David Turner
2016-04-27 20:15       ` Jeff King
2016-04-27 20:34         ` David Turner
2016-04-27 20:37           ` Jeff King
2016-04-27 22:19             ` Jeff King
2016-04-28 17:44             ` David Turner
2016-04-27 16:57 ` [PATCH 06/29] commit_ref_update(): write error message to *err, not stderr Michael Haggerty
2016-04-27 16:57 ` [PATCH 07/29] rename_ref(): remove unneeded local variable Michael Haggerty
2016-04-27 16:57 ` [PATCH 08/29] ref_transaction_commit(): remove local variable n Michael Haggerty
2016-04-27 18:16   ` Junio C Hamano
2016-04-27 20:45     ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 09/29] read_raw_ref(): rename flags argument to type Michael Haggerty
2016-04-27 16:57 ` [PATCH 10/29] read_raw_ref(): clear *type at start of function Michael Haggerty
2016-04-27 16:57 ` [PATCH 11/29] read_raw_ref(): rename symref argument to referent Michael Haggerty
2016-04-27 16:57 ` [PATCH 12/29] read_raw_ref(): improve docstring Michael Haggerty
2016-04-27 18:31   ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 13/29] lock_ref_sha1_basic(): remove unneeded local variable Michael Haggerty
2016-04-27 16:57 ` [PATCH 14/29] refs: make error messages more consistent Michael Haggerty
2016-04-27 16:57 ` [PATCH 15/29] ref_transaction_create(): disallow recursive pruning Michael Haggerty
2016-04-27 18:47   ` Junio C Hamano
2016-04-27 20:23     ` David Turner
2016-04-27 20:45       ` Junio C Hamano
2016-04-27 21:15         ` Junio C Hamano
2016-04-28 17:48           ` David Turner
2016-04-28 20:15             ` Junio C Hamano
2016-04-29  6:56           ` Michael Haggerty
2016-04-29  8:19             ` Junio C Hamano
2016-04-29  8:41             ` Junio C Hamano
2016-04-29 14:29               ` Michael Haggerty
2016-04-27 16:57 ` [PATCH 16/29] ref_transaction_commit(): correctly report close_ref() failure Michael Haggerty
2016-04-27 16:57 ` [PATCH 17/29] delete_branches(): use resolve_refdup() Michael Haggerty
2016-04-27 16:57 ` [PATCH 18/29] refs: allow log-only updates Michael Haggerty
2016-04-27 16:57 ` [PATCH 19/29] refs: don't dereference on rename Michael Haggerty
2016-04-27 18:55   ` Junio C Hamano
2016-04-29  7:38     ` Michael Haggerty
2016-04-29  8:53       ` Junio C Hamano
2016-04-29 10:57         ` Michael Haggerty
2016-04-29 12:12           ` Jeff King
2016-04-29 13:55             ` Michael Haggerty
2016-04-29 14:08               ` Jeff King
2016-04-29 15:29               ` Junio C Hamano
2016-04-29 23:21       ` David Turner
2016-04-30  3:48         ` Michael Haggerty
2016-05-02 17:55           ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 20/29] verify_refname_available(): adjust constness in declaration Michael Haggerty
2016-04-27 16:57 ` [PATCH 21/29] add_update(): initialize the whole ref_update Michael Haggerty
2016-04-27 16:57 ` [PATCH 22/29] lock_ref_for_update(): new function Michael Haggerty
2016-04-27 16:57 ` [PATCH 23/29] unlock_ref(): move definition higher in the file Michael Haggerty
2016-04-27 16:57 ` [PATCH 24/29] ref_transaction_update(): check refname_is_safe() at a minimum Michael Haggerty
2016-04-27 20:14   ` Junio C Hamano
2016-04-29  7:42     ` Michael Haggerty
2016-04-29  8:53       ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 25/29] refs: resolve symbolic refs first Michael Haggerty
2016-04-28 23:40   ` David Turner
2016-04-29  9:51     ` Michael Haggerty
2016-04-29 23:14       ` David Turner
2016-05-02 18:06     ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 26/29] lock_ref_for_update(): don't re-read non-symbolic references Michael Haggerty
2016-04-27 16:57 ` Michael Haggerty [this message]
2016-04-27 16:57 ` [PATCH 28/29] commit_ref_update(): remove the flags parameter Michael Haggerty
2016-04-27 16:57 ` [PATCH 29/29] lock_ref_sha1_basic(): only handle REF_NODEREF mode Michael Haggerty
2016-04-29 15:43   ` Junio C Hamano
2016-04-29  1:14 ` [PATCH 00/29] Yet more preparation for reference backends David Turner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc9cd5747320dbbfc328f79b60ba8d6d78764fff.1461768690.git.mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=ramsay@ramsayjones.plus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).