git@vger.kernel.org mailing list mirror (one of many)
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: =dstolee@microsoft.com, =jrnieder@gmail.com, =gitster@pobox.com,
	=jayconrod@google.com, jonathantanmy@google.com
Subject: [PATCH] commit.c: don't persist substituted parents when unshallowing
Date: Tue, 7 Jul 2020 10:42:22 -0400
Message-ID: <82939831ad88f7750b1d024b2031f688ecdf6755.1594132839.git.me@ttaylorr.com>

In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git learned how to reset stat-validity checks for the
'$GIT_DIR/shallow' file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').

However, 37b9dcabfc does not alter or remove any grafts nor substituted
parents. This produces a problem when unshallowing if another part of
the code relies on the un-grafted and/or non-substituted parentage for
commits after, say, fetching.

This can arise when 'fetch.writeCommitGraph' is true. Ordinarily (and
certainly prior to 37b9dcabfc), 'commit_graph_compatible()' would
return 0, indicating that the repository should not generate
commit-graphs (since at one point in the same process it was shallow).
But with 37b9dcabfc, that check succeeds.

This is where the bug occurs: even though the repository isn't shallow
any longer (that is, we have all of the objects), the in-core
representation of those objects still has munged parents at the shallow
boundaries. If a commit-graph write proceeds, we will use the incorrect
parentage, producing wrong results.

(Prior to this patch, there were two ways of working around this:
either (1) set 'fetch.writeCommitGraph' to 'false', or (2) drop the
commit-graph after unshallowing.)

One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. This can produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, we attach a new bit to the
pool, 'substituted_parent', which indicates if the repository *ever*
stored a commit which had its parents modified (i.e., the shallow
boundary *before* unshallowing).

This bit is sticky, since all subsequent reads after modifying a
commit's parent are unreliable when unshallowing. This patch modifies
the check in 'commit_graph_compatible' to take this bit into account,
and correctly avoid generating commit-graphs in this case.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
This is a follow-up to Jonathan Nieder's recent message; this patch
fixes the persistent-shallow issue originally reported by Jay Conrod in:

  https://lore.kernel.org/git/20200603034213.GB253041@google.com/

Like Jonathan, I am also late in sending this, with -rc0 right around
the corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
fetch.writeCommitGraph is no longer implied by feature.experimental, but
I figure that it is probably better to get this into v2.28.0 since it
fixes the issue once and for all, so long as there is consensus that the
patch is good.

Thanks in advance for a review.

 commit-graph.c           |  3 ++-
 commit.c                 |  2 ++
 object.h                 |  1 +
 t/t5537-fetch-shallow.sh | 14 ++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index fdd1c4fa7c..328ab06fd4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
 	}

 	prepare_commit_graft(r);
-	if (r->parsed_objects && r->parsed_objects->grafts_nr)
+	if (r->parsed_objects &&
+	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
 		return 0;
 	if (is_repository_shallow(r))
 		return 0;
diff --git a/commit.c b/commit.c
index 43d29a800d..7128895c3a 100644
--- a/commit.c
+++ b/commit.c
@@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	pptr = &item->parents;

 	graft = lookup_commit_graft(r, &item->object.oid);
+	if (graft)
+		r->parsed_objects->substituted_parent = 1;
 	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
 		struct commit *new_parent;

diff --git a/object.h b/object.h
index 38dc2d5a6c..96a2105859 100644
--- a/object.h
+++ b/object.h
@@ -25,6 +25,7 @@ struct parsed_object_pool {
 	char *alternate_shallow_file;

 	int commit_graft_prepared;
+	int substituted_parent;

 	struct buffer_slab *buffer_slab;
 };
diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index d427a2d7f7..a55202d2d3 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
 	)
 '

+test_expect_success 'fetch --unshallow from a full clone' '
+	git clone --no-local --depth=2 .git shallow3 &&
+	(
+	cd shallow3 &&
+	git log --format=%s >actual &&
+	test_write_lines 4 3 >expect &&
+	test_cmp expect actual &&
+	git -c fetch.writeCommitGraph fetch --unshallow &&
+	git log origin/master --format=%s >actual &&
+	test_write_lines 4 3 2 1 >expect &&
+	test_cmp expect actual
+	)
+'
+
 test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
 	# the blob "1" is available in .git but hidden by the
 	# shallow2/.git/shallow and it should be resent
--
2.27.0.225.g9fa765a71d

Thread overview: 8+ messages
2020-07-07 14:42 Taylor Blau [this message]
2020-07-07 14:43 ` [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
2020-07-08 21:10   ` [PATCH v2] " Taylor Blau
2020-07-09  1:00     ` Jonathan Nieder
2020-07-09  1:21       ` Junio C Hamano
2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
2020-07-08 20:55   ` Junio C Hamano
2020-07-08 21:22     ` Taylor Blau