git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com, jrnieder@gmail.com,
	gitster@pobox.com, jayconrod@google.com,
	jonathantanmy@google.com
Subject: Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
Date: Tue, 7 Jul 2020 10:43:38 -0400	[thread overview]
Message-ID: <20200707144338.GA26342@syl.lan> (raw)
In-Reply-To: <82939831ad88f7750b1d024b2031f688ecdf6755.1594132839.git.me@ttaylorr.com>

...oops; I was scripting around my git-who script this morning, and
clearly munged a few of the recipient's emails. They are fixed in this
email.

On Tue, Jul 07, 2020 at 10:42:22AM -0400, Taylor Blau wrote:
> In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
> 2020-04-22), Git learned how to reset stat-validity checks for the
> '$GIT_DIR/shallow' file, allowing it to change between a shallow and
> non-shallow state in the same process (e.g., in the case of 'git fetch
> --unshallow').
>
> However, 37b9dcabfc does not alter or remove any grafts nor substituted
> parents. This produces a problem when unshallowing if another part of
> the code relies on the un-grafted and/or non-substituted parentage for
> commits after, say, fetching.
>
> This can arise 'fetch.writeCommitGraph' is true. Ordinarily (and
> certainly previous to 37b9dcabfc), 'commit_graph_compatible()' would
> return 0, indicating that the repository should not generate
> commit-graphs (since at one point in the same process it was shallow).
> But with 37b9dcabfc, that check succeeds.
>
> This is where the bug occurs: even though the repository isn't shallow
> any longer (that is, we have all of the objects), the in-core
> representation of those objects still has munged parents at the shallow
> boundaries. If a commit-graph write proceeds, we will use the incorrect
> parentage, producing wrong results.
>
> (Prior to this patch, there were two ways of fixing this: either (1)
> set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
> after unshallowing).
>
> One way to fix this would be to reset the parsed object pool entirely
> (flushing the cache and thus preventing subsequent reads from modifying
> their parents) after unshallowing. This can produce a problem when
> callers have a now-stale reference to the old pool, and so this patch
> implements a different approach. Instead, we attach a new bit to the
> pool, 'substituted_parent' which indicates if the repository *ever*
> stored a commit which had its parents modified (i.e., the shallow
> boundary *before* unshallowing).
>
> This bit is sticky, since all subsequent reads after modifying a
> commit's parent are unreliable when unshallowing. This patch modifies
> the check in 'commit_graph_compatible' to take this bit into account,
> and correctly avoid generating commit-graphs in this case.
>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Helped-by: Jonathan Nieder <jrnieder@gmail.com>
> Reported-by: Jay Conrod <jayconrod@google.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> This is a follow-up to Jonathan Nieder's recent message; this patch
> fixes the persistent-shallow issue originally reported by Jay Conrod in:
>
>   https://lore.kernel.org/git/20200603034213.GB253041@google.com/
>
> Like Jonathan, I am also late to send this with -rc0 so close around the
> corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
> fetch.writeCommitGraph is no longer implied by feature.experimental, but
> I figure that it is probably better to get this into v2.28.0 since it
> fixes the issue once and for all, so long as there is consensus that the
> patch is good.
>
> Thanks in advance for a review.
>
>  commit-graph.c           |  3 ++-
>  commit.c                 |  2 ++
>  object.h                 |  1 +
>  t/t5537-fetch-shallow.sh | 14 ++++++++++++++
>  4 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index fdd1c4fa7c..328ab06fd4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
>  	}
>
>  	prepare_commit_graft(r);
> -	if (r->parsed_objects && r->parsed_objects->grafts_nr)
> +	if (r->parsed_objects &&
> +	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
>  		return 0;
>  	if (is_repository_shallow(r))
>  		return 0;
> diff --git a/commit.c b/commit.c
> index 43d29a800d..7128895c3a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
>  	pptr = &item->parents;
>
>  	graft = lookup_commit_graft(r, &item->object.oid);
> +	if (graft)
> +		r->parsed_objects->substituted_parent = 1;
>  	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
>  		struct commit *new_parent;
>
> diff --git a/object.h b/object.h
> index 38dc2d5a6c..96a2105859 100644
> --- a/object.h
> +++ b/object.h
> @@ -25,6 +25,7 @@ struct parsed_object_pool {
>  	char *alternate_shallow_file;
>
>  	int commit_graft_prepared;
> +	int substituted_parent;
>
>  	struct buffer_slab *buffer_slab;
>  };
> diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
> index d427a2d7f7..a55202d2d3 100755
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
>  	)
>  '
>
> +test_expect_success 'fetch --unshallow from a full clone' '
> +	git clone --no-local --depth=2 .git shallow3 &&
> +	(
> +	cd shallow3 &&
> +	git log --format=%s >actual &&
> +	test_write_lines 4 3 >expect &&
> +	test_cmp expect actual &&
> +	git -c fetch.writeCommitGraph fetch --unshallow &&
> +	git log origin/master --format=%s >actual &&
> +	test_write_lines 4 3 2 1 >expect &&
> +	test_cmp expect actual
> +	)
> +'
> +
>  test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
>  	# the blob "1" is available in .git but hidden by the
>  	# shallow2/.git/shallow and it should be resent
> --
> 2.27.0.225.g9fa765a71d
Thanks,
Taylor

  reply	other threads:[~2020-07-07 14:43 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07 14:42 [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
2020-07-07 14:43 ` Taylor Blau [this message]
2020-07-08 21:10   ` [PATCH v2] " Taylor Blau
2020-07-09  1:00     ` Jonathan Nieder
2020-07-09  1:21       ` Junio C Hamano
2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
2020-07-08 20:55   ` Junio C Hamano
2020-07-08 21:22     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200707144338.GA26342@syl.lan \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jayconrod@google.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).