From: Taylor Blau <me@ttaylorr.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com, jrnieder@gmail.com,
gitster@pobox.com, jayconrod@google.com,
jonathantanmy@google.com
Subject: Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
Date: Tue, 7 Jul 2020 10:43:38 -0400 [thread overview]
Message-ID: <20200707144338.GA26342@syl.lan> (raw)
In-Reply-To: <82939831ad88f7750b1d024b2031f688ecdf6755.1594132839.git.me@ttaylorr.com>
...oops; I was scripting around my git-who script this morning, and
clearly munged a few of the recipient's emails. They are fixed in this
email.
On Tue, Jul 07, 2020 at 10:42:22AM -0400, Taylor Blau wrote:
> In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
> 2020-04-22), Git learned how to reset stat-validity checks for the
> '$GIT_DIR/shallow' file, allowing it to change between a shallow and
> non-shallow state in the same process (e.g., in the case of 'git fetch
> --unshallow').
>
> However, 37b9dcabfc does not alter or remove any grafts nor substituted
> parents. This produces a problem when unshallowing if another part of
> the code relies on the un-grafted and/or non-substituted parentage for
> commits after, say, fetching.
>
> This can arise 'fetch.writeCommitGraph' is true. Ordinarily (and
> certainly previous to 37b9dcabfc), 'commit_graph_compatible()' would
> return 0, indicating that the repository should not generate
> commit-graphs (since at one point in the same process it was shallow).
> But with 37b9dcabfc, that check succeeds.
>
> This is where the bug occurs: even though the repository isn't shallow
> any longer (that is, we have all of the objects), the in-core
> representation of those objects still has munged parents at the shallow
> boundaries. If a commit-graph write proceeds, we will use the incorrect
> parentage, producing wrong results.
>
> (Prior to this patch, there were two ways of fixing this: either (1)
> set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
> after unshallowing).
>
> One way to fix this would be to reset the parsed object pool entirely
> (flushing the cache and thus preventing subsequent reads from modifying
> their parents) after unshallowing. This can produce a problem when
> callers have a now-stale reference to the old pool, and so this patch
> implements a different approach. Instead, we attach a new bit to the
> pool, 'substituted_parent' which indicates if the repository *ever*
> stored a commit which had its parents modified (i.e., the shallow
> boundary *before* unshallowing).
>
> This bit is sticky, since all subsequent reads after modifying a
> commit's parent are unreliable when unshallowing. This patch modifies
> the check in 'commit_graph_compatible' to take this bit into account,
> and correctly avoid generating commit-graphs in this case.
>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Helped-by: Jonathan Nieder <jrnieder@gmail.com>
> Reported-by: Jay Conrod <jayconrod@google.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> This is a follow-up to Jonathan Nieder's recent message; this patch
> fixes the persistent-shallow issue originally reported by Jay Conrod in:
>
> https://lore.kernel.org/git/20200603034213.GB253041@google.com/
>
> Like Jonathan, I am also late to send this with -rc0 so close around the
> corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
> fetch.writeCommitGraph is no longer implied by feature.experimental, but
> I figure that it is probably better to get this into v2.28.0 since it
> fixes the issue once and for all, so long as there is consensus that the
> patch is good.
>
> Thanks in advance for a review.
>
> commit-graph.c | 3 ++-
> commit.c | 2 ++
> object.h | 1 +
> t/t5537-fetch-shallow.sh | 14 ++++++++++++++
> 4 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index fdd1c4fa7c..328ab06fd4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
> }
>
> prepare_commit_graft(r);
> - if (r->parsed_objects && r->parsed_objects->grafts_nr)
> + if (r->parsed_objects &&
> + (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
> return 0;
> if (is_repository_shallow(r))
> return 0;
> diff --git a/commit.c b/commit.c
> index 43d29a800d..7128895c3a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
> pptr = &item->parents;
>
> graft = lookup_commit_graft(r, &item->object.oid);
> + if (graft)
> + r->parsed_objects->substituted_parent = 1;
> while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
> struct commit *new_parent;
>
> diff --git a/object.h b/object.h
> index 38dc2d5a6c..96a2105859 100644
> --- a/object.h
> +++ b/object.h
> @@ -25,6 +25,7 @@ struct parsed_object_pool {
> char *alternate_shallow_file;
>
> int commit_graft_prepared;
> + int substituted_parent;
>
> struct buffer_slab *buffer_slab;
> };
> diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
> index d427a2d7f7..a55202d2d3 100755
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
> )
> '
>
> +test_expect_success 'fetch --unshallow from a full clone' '
> + git clone --no-local --depth=2 .git shallow3 &&
> + (
> + cd shallow3 &&
> + git log --format=%s >actual &&
> + test_write_lines 4 3 >expect &&
> + test_cmp expect actual &&
> + git -c fetch.writeCommitGraph fetch --unshallow &&
> + git log origin/master --format=%s >actual &&
> + test_write_lines 4 3 2 1 >expect &&
> + test_cmp expect actual
> + )
> +'
> +
> test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
> # the blob "1" is available in .git but hidden by the
> # shallow2/.git/shallow and it should be resent
> --
> 2.27.0.225.g9fa765a71d
Thanks,
Taylor
next prev parent reply other threads:[~2020-07-07 14:43 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-07 14:42 [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
2020-07-07 14:43 ` Taylor Blau [this message]
2020-07-08 21:10 ` [PATCH v2] " Taylor Blau
2020-07-09 1:00 ` Jonathan Nieder
2020-07-09 1:21 ` Junio C Hamano
2020-07-08 5:41 ` [PATCH] " Jonathan Nieder
2020-07-08 20:55 ` Junio C Hamano
2020-07-08 21:22 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200707144338.GA26342@syl.lan \
--to=me@ttaylorr.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jayconrod@google.com \
--cc=jonathantanmy@google.com \
--cc=jrnieder@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).