git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] commit.c: don't persist substituted parents when unshallowing
@ 2020-07-07 14:42 Taylor Blau
  2020-07-07 14:43 ` Taylor Blau
  2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
  0 siblings, 2 replies; 8+ messages in thread
From: Taylor Blau @ 2020-07-07 14:42 UTC (permalink / raw)
  To: git; +Cc: =dstolee, =jrnieder, =gitster, =jayconrod, jonathantanmy

In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git learned how to reset stat-validity checks for the
'$GIT_DIR/shallow' file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').

However, 37b9dcabfc does not alter or remove any grafts nor substituted
parents. This produces a problem when unshallowing if another part of
the code relies on the un-grafted and/or non-substituted parentage for
commits after, say, fetching.

This can arise 'fetch.writeCommitGraph' is true. Ordinarily (and
certainly previous to 37b9dcabfc), 'commit_graph_compatible()' would
return 0, indicating that the repository should not generate
commit-graphs (since at one point in the same process it was shallow).
But with 37b9dcabfc, that check succeeds.

This is where the bug occurs: even though the repository isn't shallow
any longer (that is, we have all of the objects), the in-core
representation of those objects still has munged parents at the shallow
boundaries. If a commit-graph write proceeds, we will use the incorrect
parentage, producing wrong results.

(Prior to this patch, there were two ways of fixing this: either (1)
set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
after unshallowing).

One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. This can produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, we attach a new bit to the
pool, 'substituted_parent' which indicates if the repository *ever*
stored a commit which had its parents modified (i.e., the shallow
boundary *before* unshallowing).

This bit is sticky, since all subsequent reads after modifying a
commit's parent are unreliable when unshallowing. This patch modifies
the check in 'commit_graph_compatible' to take this bit into account,
and correctly avoid generating commit-graphs in this case.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
This is a follow-up to Jonathan Nieder's recent message; this patch
fixes the persistent-shallow issue originally reported by Jay Conrod in:

  https://lore.kernel.org/git/20200603034213.GB253041@google.com/

Like Jonathan, I am also late to send this with -rc0 so close around the
corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
fetch.writeCommitGraph is no longer implied by feature.experimental, but
I figure that it is probably better to get this into v2.28.0 since it
fixes the issue once and for all, so long as there is consensus that the
patch is good.

Thanks in advance for a review.

 commit-graph.c           |  3 ++-
 commit.c                 |  2 ++
 object.h                 |  1 +
 t/t5537-fetch-shallow.sh | 14 ++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index fdd1c4fa7c..328ab06fd4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
 	}

 	prepare_commit_graft(r);
-	if (r->parsed_objects && r->parsed_objects->grafts_nr)
+	if (r->parsed_objects &&
+	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
 		return 0;
 	if (is_repository_shallow(r))
 		return 0;
diff --git a/commit.c b/commit.c
index 43d29a800d..7128895c3a 100644
--- a/commit.c
+++ b/commit.c
@@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	pptr = &item->parents;

 	graft = lookup_commit_graft(r, &item->object.oid);
+	if (graft)
+		r->parsed_objects->substituted_parent = 1;
 	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
 		struct commit *new_parent;

diff --git a/object.h b/object.h
index 38dc2d5a6c..96a2105859 100644
--- a/object.h
+++ b/object.h
@@ -25,6 +25,7 @@ struct parsed_object_pool {
 	char *alternate_shallow_file;

 	int commit_graft_prepared;
+	int substituted_parent;

 	struct buffer_slab *buffer_slab;
 };
diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index d427a2d7f7..a55202d2d3 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
 	)
 '

+test_expect_success 'fetch --unshallow from a full clone' '
+	git clone --no-local --depth=2 .git shallow3 &&
+	(
+	cd shallow3 &&
+	git log --format=%s >actual &&
+	test_write_lines 4 3 >expect &&
+	test_cmp expect actual &&
+	git -c fetch.writeCommitGraph fetch --unshallow &&
+	git log origin/master --format=%s >actual &&
+	test_write_lines 4 3 2 1 >expect &&
+	test_cmp expect actual
+	)
+'
+
 test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
 	# the blob "1" is available in .git but hidden by the
 	# shallow2/.git/shallow and it should be resent
--
2.27.0.225.g9fa765a71d

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
  2020-07-07 14:42 [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
@ 2020-07-07 14:43 ` Taylor Blau
  2020-07-08 21:10   ` [PATCH v2] " Taylor Blau
  2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
  1 sibling, 1 reply; 8+ messages in thread
From: Taylor Blau @ 2020-07-07 14:43 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, jrnieder, gitster, jayconrod, jonathantanmy

...oops; I was scripting around my git-who script this morning, and
clearly munged a few of the recipient's emails. They are fixed in this
email.

On Tue, Jul 07, 2020 at 10:42:22AM -0400, Taylor Blau wrote:
> In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
> 2020-04-22), Git learned how to reset stat-validity checks for the
> '$GIT_DIR/shallow' file, allowing it to change between a shallow and
> non-shallow state in the same process (e.g., in the case of 'git fetch
> --unshallow').
>
> However, 37b9dcabfc does not alter or remove any grafts nor substituted
> parents. This produces a problem when unshallowing if another part of
> the code relies on the un-grafted and/or non-substituted parentage for
> commits after, say, fetching.
>
> This can arise 'fetch.writeCommitGraph' is true. Ordinarily (and
> certainly previous to 37b9dcabfc), 'commit_graph_compatible()' would
> return 0, indicating that the repository should not generate
> commit-graphs (since at one point in the same process it was shallow).
> But with 37b9dcabfc, that check succeeds.
>
> This is where the bug occurs: even though the repository isn't shallow
> any longer (that is, we have all of the objects), the in-core
> representation of those objects still has munged parents at the shallow
> boundaries. If a commit-graph write proceeds, we will use the incorrect
> parentage, producing wrong results.
>
> (Prior to this patch, there were two ways of fixing this: either (1)
> set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
> after unshallowing).
>
> One way to fix this would be to reset the parsed object pool entirely
> (flushing the cache and thus preventing subsequent reads from modifying
> their parents) after unshallowing. This can produce a problem when
> callers have a now-stale reference to the old pool, and so this patch
> implements a different approach. Instead, we attach a new bit to the
> pool, 'substituted_parent' which indicates if the repository *ever*
> stored a commit which had its parents modified (i.e., the shallow
> boundary *before* unshallowing).
>
> This bit is sticky, since all subsequent reads after modifying a
> commit's parent are unreliable when unshallowing. This patch modifies
> the check in 'commit_graph_compatible' to take this bit into account,
> and correctly avoid generating commit-graphs in this case.
>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Helped-by: Jonathan Nieder <jrnieder@gmail.com>
> Reported-by: Jay Conrod <jayconrod@google.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> This is a follow-up to Jonathan Nieder's recent message; this patch
> fixes the persistent-shallow issue originally reported by Jay Conrod in:
>
>   https://lore.kernel.org/git/20200603034213.GB253041@google.com/
>
> Like Jonathan, I am also late to send this with -rc0 so close around the
> corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
> fetch.writeCommitGraph is no longer implied by feature.experimental, but
> I figure that it is probably better to get this into v2.28.0 since it
> fixes the issue once and for all, so long as there is consensus that the
> patch is good.
>
> Thanks in advance for a review.
>
>  commit-graph.c           |  3 ++-
>  commit.c                 |  2 ++
>  object.h                 |  1 +
>  t/t5537-fetch-shallow.sh | 14 ++++++++++++++
>  4 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index fdd1c4fa7c..328ab06fd4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
>  	}
>
>  	prepare_commit_graft(r);
> -	if (r->parsed_objects && r->parsed_objects->grafts_nr)
> +	if (r->parsed_objects &&
> +	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
>  		return 0;
>  	if (is_repository_shallow(r))
>  		return 0;
> diff --git a/commit.c b/commit.c
> index 43d29a800d..7128895c3a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
>  	pptr = &item->parents;
>
>  	graft = lookup_commit_graft(r, &item->object.oid);
> +	if (graft)
> +		r->parsed_objects->substituted_parent = 1;
>  	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
>  		struct commit *new_parent;
>
> diff --git a/object.h b/object.h
> index 38dc2d5a6c..96a2105859 100644
> --- a/object.h
> +++ b/object.h
> @@ -25,6 +25,7 @@ struct parsed_object_pool {
>  	char *alternate_shallow_file;
>
>  	int commit_graft_prepared;
> +	int substituted_parent;
>
>  	struct buffer_slab *buffer_slab;
>  };
> diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
> index d427a2d7f7..a55202d2d3 100755
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
>  	)
>  '
>
> +test_expect_success 'fetch --unshallow from a full clone' '
> +	git clone --no-local --depth=2 .git shallow3 &&
> +	(
> +	cd shallow3 &&
> +	git log --format=%s >actual &&
> +	test_write_lines 4 3 >expect &&
> +	test_cmp expect actual &&
> +	git -c fetch.writeCommitGraph fetch --unshallow &&
> +	git log origin/master --format=%s >actual &&
> +	test_write_lines 4 3 2 1 >expect &&
> +	test_cmp expect actual
> +	)
> +'
> +
>  test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
>  	# the blob "1" is available in .git but hidden by the
>  	# shallow2/.git/shallow and it should be resent
> --
> 2.27.0.225.g9fa765a71d
Thanks,
Taylor

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
  2020-07-07 14:42 [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
  2020-07-07 14:43 ` Taylor Blau
@ 2020-07-08  5:41 ` Jonathan Nieder
  2020-07-08 20:55   ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Nieder @ 2020-07-08  5:41 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, jrnieder, gitster, jayconrod, jonathantanmy

Taylor Blau wrote:

> In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
> 2020-04-22), Git learned how to reset stat-validity checks for the
> '$GIT_DIR/shallow' file, allowing it to change between a shallow and
> non-shallow state in the same process (e.g., in the case of 'git fetch
> --unshallow').
>
> However, 37b9dcabfc does not alter or remove any grafts nor substituted
> parents.
[...]
>                               even though the repository isn't shallow
> any longer (that is, we have all of the objects), the in-core
> representation of those objects still has munged parents at the shallow
> boundaries. If a commit-graph write proceeds, we will use the incorrect
> parentage, producing wrong results.
>
> (Prior to this patch, there were two ways of fixing this: either (1)
> set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
> after unshallowing).

nit: It wasn't obvious to me on first reading which patch "this patch"
refers to --- does it mean 37b9dcabfc or the patch I am reading?

The approach described in Documentation/SubmittingPatches is to treat
the commit message kind of like a bug report:

 . start with the problem the change tries to solve, i.e. what is wrong
   with the current code without the change.

 . request the change, justifying the way it solves the problem, i.e.
   why the result with the change is better.

 . then move on to alternate solutions considered but discarded, if any.

Here, I think that would be similar to what you have written:

 Since 37b9dcabfc (shallow.c: [...]), Git knows how to reset
 stat-validity checks for the $GIT_DIR/shallow file, allowing it to
 [etc].  However, when $GIT_DIR/shallow changes, Git does not alter or
 remove any grafts nor substituted parents in memory.

 This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
 set to true.  Ordinarily in a shallow repository (and before 37b9dcabfc,
 even in this case), commit_graph_compatible() would return false,
 indicating that the repository should not be used to write a
 commit-graphs (since commit-graph files cannot represent a shallow
 history).  But since 37b9dcabfc, in an --unshallow operation that check
 succeeds.

 Thus even though the repository isn't shallow any longer (that is, we
 have all of the objects), the in-core representation of those objects
 [...].  When the commit-graph write proceeds, we use the incorrect
 parentage, producing wrong results.

 There are two ways for a user to work around this: either (1) set
 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
 after unshallowing.

> One way to fix this would be to reset the parsed object pool entirely
> (flushing the cache and thus preventing subsequent reads from modifying
> their parents) after unshallowing. This can produce a problem when

nit: seems easier to read with s/This can/That would/, since it's
describing the road not taken

> callers have a now-stale reference to the old pool, and so this patch
> implements a different approach. Instead, we attach a new bit to the

nit: s/we attach/attach/, to avoid ambiguity with the previous part of
the commit message that describes the current, unpatched behavior.

> pool, 'substituted_parent' which indicates if the repository *ever*
> stored a commit which had its parents modified (i.e., the shallow
> boundary *before* unshallowing).
>
> This bit is sticky, since all subsequent reads after modifying a
> commit's parent are unreliable when unshallowing. This patch modifies
> the check in 'commit_graph_compatible' to take this bit into account,
> and correctly avoid generating commit-graphs in this case.

likewise:

 This bit needs to be sticky because [...]

> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Helped-by: Jonathan Nieder <jrnieder@gmail.com>
> Reported-by: Jay Conrod <jayconrod@google.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> This is a follow-up to Jonathan Nieder's recent message; this patch
> fixes the persistent-shallow issue originally reported by Jay Conrod in:
>
>   https://lore.kernel.org/git/20200603034213.GB253041@google.com/
>
> Like Jonathan, I am also late to send this with -rc0 so close around the
> corner. I think that this *could* wait until v2.28.1 or v2.29.0 since
> fetch.writeCommitGraph is no longer implied by feature.experimental, but
> I figure that it is probably better to get this into v2.28.0 since it
> fixes the issue once and for all, so long as there is consensus that the
> patch is good.

I'm of course inclined to like it, so before reading through the rest my
preference would be to include both patches (this fix and the patch I
sent to reduce user exposure) in -rc0.  But let's see if the patch
changes my opinion. ;-)

[...]
>  commit-graph.c           |  3 ++-
>  commit.c                 |  2 ++
>  object.h                 |  1 +
>  t/t5537-fetch-shallow.sh | 14 ++++++++++++++
>  4 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index fdd1c4fa7c..328ab06fd4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
>  	}
> 
>  	prepare_commit_graft(r);
> -	if (r->parsed_objects && r->parsed_objects->grafts_nr)
> +	if (r->parsed_objects &&
> +	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))

A subtlety: we can't *just* check for substituted parents, because
there could be a graft affecting a commit we haven't parsed yet.  So
we'd have to check for both grafts yet to be applied (grafts_nr) and
grafts already applied (substituted_parent).

Good.

[...]
> --- a/commit.c
> +++ b/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
>  	pptr = &item->parents;
> 
>  	graft = lookup_commit_graft(r, &item->object.oid);
> +	if (graft)
> +		r->parsed_objects->substituted_parent = 1;

This applies right away at the only place in commit parsing where
grafts can be discovered.

The only other callers to lookup_commit_graft are get_shallow_commits
and prepare_shallow_info.  Those are protocol code and they don't
affect the in-core history.

[...]
> --- a/object.h
> +++ b/object.h
> @@ -25,6 +25,7 @@ struct parsed_object_pool {
>  	char *alternate_shallow_file;
> 
>  	int commit_graft_prepared;
> +	int substituted_parent;

parsed_object_pool is zero-initialized on creation (in
parsed_object_pool_new), so this gets correctly initialized to false.

[...]
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
>  	)
>  '
> 
> +test_expect_success 'fetch --unshallow from a full clone' '
> +	git clone --no-local --depth=2 .git shallow3 &&
> +	(
> +	cd shallow3 &&
> +	git log --format=%s >actual &&
> +	test_write_lines 4 3 >expect &&
> +	test_cmp expect actual &&
> +	git -c fetch.writeCommitGraph fetch --unshallow &&
> +	git log origin/master --format=%s >actual &&
> +	test_write_lines 4 3 2 1 >expect &&
> +	test_cmp expect actual
> +	)
> +'
> +

The indentation is odd here, but it's consistent with the rest of the
file.

A fairly straightforward test that demonstrates the bug being fixed and
only relies on what was set up in the initial 'setup' test (4 commits).
Thanks for tying that loose end.

With or without commit message tweaks along the lines described above,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks for your thoughtful work, as always.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
  2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
@ 2020-07-08 20:55   ` Junio C Hamano
  2020-07-08 21:22     ` Taylor Blau
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2020-07-08 20:55 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Taylor Blau, git, dstolee, jayconrod, jonathantanmy

Jonathan Nieder <jrnieder@gmail.com> writes:

> A fairly straightforward test that demonstrates the bug being fixed and
> only relies on what was set up in the initial 'setup' test (4 commits).
> Thanks for tying that loose end.
>
> With or without commit message tweaks along the lines described above,
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
>
> Thanks for your thoughtful work, as always.

Thanks for a nicely written review.  It would be very nice if we can
see a reroll before I have to tag -rc0.

Thanks.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] commit.c: don't persist substituted parents when unshallowing
  2020-07-07 14:43 ` Taylor Blau
@ 2020-07-08 21:10   ` Taylor Blau
  2020-07-09  1:00     ` Jonathan Nieder
  0 siblings, 1 reply; 8+ messages in thread
From: Taylor Blau @ 2020-07-08 21:10 UTC (permalink / raw)
  To: git; +Cc: dstolee, jrnieder, gitster, jayconrod, jonathantanmy

Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git knows how to reset stat-validity checks for the
$GIT_DIR/shallow file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').

However, when $GIT_DIR/shallow changes, Git does not alter or remove any
grafts (nor substituted parents) in memory.

This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
even in this case), commit_graph_compatible() would return false,
indicating that the repository should not be used to write a
commit-graphs (since commit-graph files cannot represent a shallow
history). But since 37b9dcabfc, in an --unshallow operation that check
succeeds.

Thus even though the repository isn't shallow any longer (that is, we
have all of the objects), the in-core representation of those objects
still has munged parents at the shallow boundaries.  When the
commit-graph write proceeds, we use the incorrect parentage, producing
wrong results.

There are two ways for a user to work around this: either (1) set
'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
unshallowing.

One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. That would produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, attach a new bit to the pool,
'substituted_parent', which indicates if the repository *ever* stored a
commit which had its parents modified (i.e., the shallow boundary
prior to unshallowing).

This bit needs to be sticky because all reads subsequent to modifying a
commit's parents are unreliable when unshallowing. Modify the check in
'commit_graph_compatible' to take this bit into account, and correctly
avoid generating commit-graphs in this case, thus solving the bug.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
The patch contents are unchanged from v1, but the patch's message is
wordsmithed with jrn's help. This should be ready for -rc0 now.

Range-diff below for easier review:

Range-diff against v1:
1:  82939831ad ! 1:  1f769bbbb4 commit.c: don't persist substituted parents when unshallowing
    @@ Metadata
      ## Commit message ##
         commit.c: don't persist substituted parents when unshallowing

    -    In 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
    -    2020-04-22), Git learned how to reset stat-validity checks for the
    -    '$GIT_DIR/shallow' file, allowing it to change between a shallow and
    +    Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
    +    2020-04-22), Git knows how to reset stat-validity checks for the
    +    $GIT_DIR/shallow file, allowing it to change between a shallow and
         non-shallow state in the same process (e.g., in the case of 'git fetch
         --unshallow').

    -    However, 37b9dcabfc does not alter or remove any grafts nor substituted
    -    parents. This produces a problem when unshallowing if another part of
    -    the code relies on the un-grafted and/or non-substituted parentage for
    -    commits after, say, fetching.
    +    However, when $GIT_DIR/shallow changes, Git does not alter or remove any
    +    grafts (nor substituted parents) in memory.

    -    This can arise 'fetch.writeCommitGraph' is true. Ordinarily (and
    -    certainly previous to 37b9dcabfc), 'commit_graph_compatible()' would
    -    return 0, indicating that the repository should not generate
    -    commit-graphs (since at one point in the same process it was shallow).
    -    But with 37b9dcabfc, that check succeeds.
    +    This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
    +    set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
    +    even in this case), commit_graph_compatible() would return false,
    +    indicating that the repository should not be used to write a
    +    commit-graphs (since commit-graph files cannot represent a shallow
    +    history). But since 37b9dcabfc, in an --unshallow operation that check
    +    succeeds.

    -    This is where the bug occurs: even though the repository isn't shallow
    -    any longer (that is, we have all of the objects), the in-core
    -    representation of those objects still has munged parents at the shallow
    -    boundaries. If a commit-graph write proceeds, we will use the incorrect
    -    parentage, producing wrong results.
    +    Thus even though the repository isn't shallow any longer (that is, we
    +    have all of the objects), the in-core representation of those objects
    +    still has munged parents at the shallow boundaries.  When the
    +    commit-graph write proceeds, we use the incorrect parentage, producing
    +    wrong results.

    -    (Prior to this patch, there were two ways of fixing this: either (1)
    -    set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph
    -    after unshallowing).
    +    There are two ways for a user to work around this: either (1) set
    +    'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
    +    unshallowing.

         One way to fix this would be to reset the parsed object pool entirely
         (flushing the cache and thus preventing subsequent reads from modifying
    -    their parents) after unshallowing. This can produce a problem when
    +    their parents) after unshallowing. That would produce a problem when
         callers have a now-stale reference to the old pool, and so this patch
    -    implements a different approach. Instead, we attach a new bit to the
    -    pool, 'substituted_parent' which indicates if the repository *ever*
    -    stored a commit which had its parents modified (i.e., the shallow
    -    boundary *before* unshallowing).
    +    implements a different approach. Instead, attach a new bit to the pool,
    +    'substituted_parent', which indicates if the repository *ever* stored a
    +    commit which had its parents modified (i.e., the shallow boundary
    +    prior to unshallowing).

    -    This bit is sticky, since all subsequent reads after modifying a
    -    commit's parent are unreliable when unshallowing. This patch modifies
    -    the check in 'commit_graph_compatible' to take this bit into account,
    -    and correctly avoid generating commit-graphs in this case.
    +    This bit needs to be sticky because all reads subsequent to modifying a
    +    commit's parents are unreliable when unshallowing. Modify the check in
    +    'commit_graph_compatible' to take this bit into account, and correctly
    +    avoid generating commit-graphs in this case, thus solving the bug.

         Helped-by: Derrick Stolee <dstolee@microsoft.com>
         Helped-by: Jonathan Nieder <jrnieder@gmail.com>
         Reported-by: Jay Conrod <jayconrod@google.com>
    +    Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
         Signed-off-by: Taylor Blau <me@ttaylorr.com>

      ## commit-graph.c ##

 commit-graph.c           |  3 ++-
 commit.c                 |  2 ++
 object.h                 |  1 +
 t/t5537-fetch-shallow.sh | 14 ++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index fdd1c4fa7c..328ab06fd4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -203,7 +203,8 @@ static int commit_graph_compatible(struct repository *r)
 	}

 	prepare_commit_graft(r);
-	if (r->parsed_objects && r->parsed_objects->grafts_nr)
+	if (r->parsed_objects &&
+	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
 		return 0;
 	if (is_repository_shallow(r))
 		return 0;
diff --git a/commit.c b/commit.c
index 43d29a800d..7128895c3a 100644
--- a/commit.c
+++ b/commit.c
@@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
 	pptr = &item->parents;

 	graft = lookup_commit_graft(r, &item->object.oid);
+	if (graft)
+		r->parsed_objects->substituted_parent = 1;
 	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
 		struct commit *new_parent;

diff --git a/object.h b/object.h
index 38dc2d5a6c..96a2105859 100644
--- a/object.h
+++ b/object.h
@@ -25,6 +25,7 @@ struct parsed_object_pool {
 	char *alternate_shallow_file;

 	int commit_graft_prepared;
+	int substituted_parent;

 	struct buffer_slab *buffer_slab;
 };
diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index d427a2d7f7..a55202d2d3 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -81,6 +81,20 @@ test_expect_success 'fetch --unshallow from shallow clone' '
 	)
 '

+test_expect_success 'fetch --unshallow from a full clone' '
+	git clone --no-local --depth=2 .git shallow3 &&
+	(
+	cd shallow3 &&
+	git log --format=%s >actual &&
+	test_write_lines 4 3 >expect &&
+	test_cmp expect actual &&
+	git -c fetch.writeCommitGraph fetch --unshallow &&
+	git log origin/master --format=%s >actual &&
+	test_write_lines 4 3 2 1 >expect &&
+	test_cmp expect actual
+	)
+'
+
 test_expect_success 'fetch something upstream has but hidden by clients shallow boundaries' '
 	# the blob "1" is available in .git but hidden by the
 	# shallow2/.git/shallow and it should be resent
--
2.27.0.225.g9fa765a71d

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] commit.c: don't persist substituted parents when unshallowing
  2020-07-08 20:55   ` Junio C Hamano
@ 2020-07-08 21:22     ` Taylor Blau
  0 siblings, 0 replies; 8+ messages in thread
From: Taylor Blau @ 2020-07-08 21:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Taylor Blau, git, dstolee, jayconrod, jonathantanmy

On Wed, Jul 08, 2020 at 01:55:28PM -0700, Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>
> > A fairly straightforward test that demonstrates the bug being fixed and
> > only relies on what was set up in the initial 'setup' test (4 commits).
> > Thanks for tying that loose end.
> >
> > With or without commit message tweaks along the lines described above,
> > Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> >
> > Thanks for your thoughtful work, as always.
>
> Thanks for a nicely written review.  It would be very nice if we can
> see a reroll before I have to tag -rc0.

Of course! I just sent one here[1]. Sorry for the last-minute addition
to -rc0, and thanks for fast-tracking it down.

> Thanks.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/1f769bbbb4f1b4ad67d29ee7b3e5282446e4cc0f.1594242582.git.me@ttaylorr.com/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] commit.c: don't persist substituted parents when unshallowing
  2020-07-08 21:10   ` [PATCH v2] " Taylor Blau
@ 2020-07-09  1:00     ` Jonathan Nieder
  2020-07-09  1:21       ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Nieder @ 2020-07-09  1:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, gitster, jayconrod, jonathantanmy

Taylor Blau wrote:

> Reported-by: Jay Conrod <jayconrod@google.com>
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---

This is indeed
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] commit.c: don't persist substituted parents when unshallowing
  2020-07-09  1:00     ` Jonathan Nieder
@ 2020-07-09  1:21       ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2020-07-09  1:21 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Taylor Blau, git, dstolee, jayconrod, jonathantanmy

Jonathan Nieder <jrnieder@gmail.com> writes:

> Taylor Blau wrote:
>
>> Reported-by: Jay Conrod <jayconrod@google.com>
>> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>> ---
>
> This is indeed
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks, both.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-07-09  1:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-07 14:42 [PATCH] commit.c: don't persist substituted parents when unshallowing Taylor Blau
2020-07-07 14:43 ` Taylor Blau
2020-07-08 21:10   ` [PATCH v2] " Taylor Blau
2020-07-09  1:00     ` Jonathan Nieder
2020-07-09  1:21       ` Junio C Hamano
2020-07-08  5:41 ` [PATCH] " Jonathan Nieder
2020-07-08 20:55   ` Junio C Hamano
2020-07-08 21:22     ` Taylor Blau

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).