git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only
@ 2021-12-07 19:29 Glen Choo
  2021-12-09 22:12 ` Jonathan Tan
  2021-12-17  0:02 ` [PATCH v2] " Glen Choo
  0 siblings, 2 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-07 19:29 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

`git fetch --negotiate-only` does not fetch objects and thus, it should
not perform certain auxiliary tasks like updating submodules, updating
the commit graph, or running gc. Although send_pack() invokes `git fetch
--negotiate-only` correctly, cmd_fetch() also reads config variables,
leading to undesirable behavior, like updating submodules if
`submodule.recurse=true`.

Make cmd_fetch() return early if --negotiate-only was specified so that
these auxiliary tasks are skipped.

Signed-off-by: Glen Choo <chooglen@google.com>
---
`git fetch --negotiate-only` is used during push negotiation to
determine the reachability of commits. As its name implies, only
negotiation is performed, not the actual fetching of objects. However,
cmd_fetch() performs certain tasks with the assumption that objects are
fetched:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense (introduced in [1]).
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true. But according to
  Documentation/config/fetch.txt [2], this should only be done if a
  pack-file is downloaded
* gc is run, but according to [3], we only do this because we expect
  `git fetch` to introduce objects

Instead of disabling these tasks piecemeal, let's just make cmd_fetch()
return early if --negotiate-only was given. To accommodate possible
future options that don't fetch objects, I opted to introduce another
`if` statement instead of putting the early return in the existing
`if (negotiate_only)` block.

[1] 7dce19d374 (fetch/pull: Add the --recurse-submodules option, 2010-11-12)
[2] 50f26bd035 (fetch: add fetch.writeCommitGraph config setting, 2019-09-02)
[3] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

 builtin/fetch.c       | 22 +++++++++++++++++-----
 t/t5516-fetch-push.sh | 12 ++++++++++++
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f7abbc31ff..01865b5c09 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1996,6 +1996,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (negotiate_only) {
+		/*
+		 * --negotiate-only should never recurse into
+		 * submodules, so there is no need to read .gitmodules.
+		 */
+		recurse_submodules = RECURSE_SUBMODULES_OFF;
+		if (!negotiation_tip.nr)
+			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
@@ -2005,9 +2016,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		fetch_config_from_gitmodules(sfjc, rs);
 	}
 
-	if (negotiate_only && !negotiation_tip.nr)
-		die(_("--negotiate-only needs one or more --negotiate-tip=*"));
-
 	if (deepen_relative) {
 		if (deepen_relative < 0)
 			die(_("Negative depth in --deepen is not supported"));
@@ -2112,6 +2120,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_multiple(&list, max_children);
 	}
 
+	string_list_clear(&list, 0);
+
+	/* skip irrelevant tasks if objects were not fetched  */
+	if (negotiate_only)
+		return result;
+
 	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
 		struct strvec options = STRVEC_INIT;
 		int max_children = max_jobs;
@@ -2132,8 +2146,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 8212ca56dc..732031085e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-07 19:29 [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2021-12-09 22:12 ` Jonathan Tan
  2021-12-09 22:36   ` Glen Choo
  2021-12-17  0:02 ` [PATCH v2] " Glen Choo
  1 sibling, 1 reply; 50+ messages in thread
From: Jonathan Tan @ 2021-12-09 22:12 UTC (permalink / raw)
  To: chooglen; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:
> `git fetch --negotiate-only` does not fetch objects and thus, it should
> not perform certain auxiliary tasks like updating submodules, updating
> the commit graph, or running gc. Although send_pack() invokes `git fetch
> --negotiate-only` correctly, cmd_fetch() also reads config variables,
> leading to undesirable behavior, like updating submodules if
> `submodule.recurse=true`.
> 
> Make cmd_fetch() return early if --negotiate-only was specified so that
> these auxiliary tasks are skipped.
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> `git fetch --negotiate-only` is used during push negotiation to
> determine the reachability of commits. As its name implies, only
> negotiation is performed, not the actual fetching of objects. However,
> cmd_fetch() performs certain tasks with the assumption that objects are
> fetched:
> 
> * Submodules are updated if enabled by recurse.submodules=true, but
>   negotiation fetch doesn't actually update the repo, so this doesn't
>   make sense (introduced in [1]).
> * Commit graphs will be written if enabled by
>   fetch.writeCommitGraph=true. But according to
>   Documentation/config/fetch.txt [2], this should only be done if a
>   pack-file is downloaded
> * gc is run, but according to [3], we only do this because we expect
>   `git fetch` to introduce objects
> 
> Instead of disabling these tasks piecemeal, let's just make cmd_fetch()
> return early if --negotiate-only was given. To accommodate possible
> future options that don't fetch objects, I opted to introduce another
> `if` statement instead of putting the early return in the existing
> `if (negotiate_only)` block.

Some of this probably should be in the commit message too.

> +	if (negotiate_only) {
> +		/*
> +		 * --negotiate-only should never recurse into
> +		 * submodules, so there is no need to read .gitmodules.
> +		 */
> +		recurse_submodules = RECURSE_SUBMODULES_OFF;
> +		if (!negotiation_tip.nr)
> +			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
> +	}

Maybe add a check here that --recurse-submodules was not explicitly
given.

> +	/* skip irrelevant tasks if objects were not fetched  */
> +	if (negotiate_only)
> +		return result;

There are other reasons too why objects were not fetched (e.g. if we
already have all of them). Maybe add a NEEDSWORK explaining this.

Besides these code comments and the fact that the commit message could
be improved, this patch looks good to me.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-09 22:12 ` Jonathan Tan
@ 2021-12-09 22:36   ` Glen Choo
  2021-12-13 22:58     ` Jonathan Tan
  0 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2021-12-09 22:36 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Jonathan Tan

Jonathan Tan <jonathantanmy@google.com> writes:

> Glen Choo <chooglen@google.com> writes:
>> `git fetch --negotiate-only` does not fetch objects and thus, it should
>> not perform certain auxiliary tasks like updating submodules, updating
>> the commit graph, or running gc. Although send_pack() invokes `git fetch
>> --negotiate-only` correctly, cmd_fetch() also reads config variables,
>> leading to undesirable behavior, like updating submodules if
>> `submodule.recurse=true`.
>> 
>> Make cmd_fetch() return early if --negotiate-only was specified so that
>> these auxiliary tasks are skipped.
>> 
>> Signed-off-by: Glen Choo <chooglen@google.com>
>> ---
>> `git fetch --negotiate-only` is used during push negotiation to
>> determine the reachability of commits. As its name implies, only
>> negotiation is performed, not the actual fetching of objects. However,
>> cmd_fetch() performs certain tasks with the assumption that objects are
>> fetched:
>> 
>> * Submodules are updated if enabled by recurse.submodules=true, but
>>   negotiation fetch doesn't actually update the repo, so this doesn't
>>   make sense (introduced in [1]).
>> * Commit graphs will be written if enabled by
>>   fetch.writeCommitGraph=true. But according to
>>   Documentation/config/fetch.txt [2], this should only be done if a
>>   pack-file is downloaded
>> * gc is run, but according to [3], we only do this because we expect
>>   `git fetch` to introduce objects
>> 
>> Instead of disabling these tasks piecemeal, let's just make cmd_fetch()
>> return early if --negotiate-only was given. To accommodate possible
>> future options that don't fetch objects, I opted to introduce another
>> `if` statement instead of putting the early return in the existing
>> `if (negotiate_only)` block.
>
> Some of this probably should be in the commit message too.

I suppose you mean the explanation of why the tasks are irrelevant to
negotiation fetch? i.e. 

   * Submodules are updated if enabled by recurse.submodules=true...
   * Commit graphs will be written if enabled by...
   * gc is run, but according to [3]...

>> +	if (negotiate_only) {
>> +		/*
>> +		 * --negotiate-only should never recurse into
>> +		 * submodules, so there is no need to read .gitmodules.
>> +		 */
>> +		recurse_submodules = RECURSE_SUBMODULES_OFF;
>> +		if (!negotiation_tip.nr)
>> +			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
>> +	}
>
> Maybe add a check here that --recurse-submodules was not explicitly
> given.

Hm, that's not a bad idea, but it's not so easy because we don't have
RECURSE_SUBMODULES_EXPLICIT so it's not easy to tell whether or not
submodule recursion was enabled by CLI option or config.

This is the exact same use case I encountered with "branch
--recurse-submodules" [1]. I think this means that we should consider
standardizing the parsing of submodule.recurse + --recurse-submodules. I
haven't done it yet because it's a little tricky and hard to review.

So I'll punt on this check until we get RECURSE_SUBMODULES_EXPLICIT.

>
>> +	/* skip irrelevant tasks if objects were not fetched  */
>> +	if (negotiate_only)
>> +		return result;
>
> There are other reasons too why objects were not fetched (e.g. if we
> already have all of them). Maybe add a NEEDSWORK explaining this.

Good point, the comment doesn't distinguish between cases where we know
that objects won't be fetched beforehand vs cases where we found out
that objects weren't fetched after the fact. I'll add the NEEDSWORK.

[1] https://lore.kernel.org/git/20211209184928.71413-5-chooglen@google.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-09 22:36   ` Glen Choo
@ 2021-12-13 22:58     ` Jonathan Tan
  2021-12-16 18:11       ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Tan @ 2021-12-13 22:58 UTC (permalink / raw)
  To: chooglen; +Cc: jonathantanmy, git

Glen Choo <chooglen@google.com> writes:
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Glen Choo <chooglen@google.com> writes:
> >> `git fetch --negotiate-only` does not fetch objects and thus, it should
> >> not perform certain auxiliary tasks like updating submodules, updating
> >> the commit graph, or running gc. Although send_pack() invokes `git fetch
> >> --negotiate-only` correctly, cmd_fetch() also reads config variables,
> >> leading to undesirable behavior, like updating submodules if
> >> `submodule.recurse=true`.
> >> 
> >> Make cmd_fetch() return early if --negotiate-only was specified so that
> >> these auxiliary tasks are skipped.
> >> 
> >> Signed-off-by: Glen Choo <chooglen@google.com>
> >> ---
> >> `git fetch --negotiate-only` is used during push negotiation to
> >> determine the reachability of commits. As its name implies, only
> >> negotiation is performed, not the actual fetching of objects. However,
> >> cmd_fetch() performs certain tasks with the assumption that objects are
> >> fetched:
> >> 
> >> * Submodules are updated if enabled by recurse.submodules=true, but
> >>   negotiation fetch doesn't actually update the repo, so this doesn't
> >>   make sense (introduced in [1]).
> >> * Commit graphs will be written if enabled by
> >>   fetch.writeCommitGraph=true. But according to
> >>   Documentation/config/fetch.txt [2], this should only be done if a
> >>   pack-file is downloaded
> >> * gc is run, but according to [3], we only do this because we expect
> >>   `git fetch` to introduce objects
> >> 
> >> Instead of disabling these tasks piecemeal, let's just make cmd_fetch()
> >> return early if --negotiate-only was given. To accommodate possible
> >> future options that don't fetch objects, I opted to introduce another
> >> `if` statement instead of putting the early return in the existing
> >> `if (negotiate_only)` block.
> >
> > Some of this probably should be in the commit message too.
> 
> I suppose you mean the explanation of why the tasks are irrelevant to
> negotiation fetch? i.e. 
> 
>    * Submodules are updated if enabled by recurse.submodules=true...
>    * Commit graphs will be written if enabled by...
>    * gc is run, but according to [3]...

Yes - why the behavior is undesirable, and the way you're doing it (by
adding another "if" statement).

After looking at this, more concretely, it might be better to use the
part below the "---" as your commit message. :-) Just note that we're
not just accommodating future options that don't fetch objects - "fetch"
already may not fetch objects (e.g. if the ref we want doesn't exist or
if we already have all the objects).

> > Maybe add a check here that --recurse-submodules was not explicitly
> > given.
> 
> Hm, that's not a bad idea, but it's not so easy because we don't have
> RECURSE_SUBMODULES_EXPLICIT so it's not easy to tell whether or not
> submodule recursion was enabled by CLI option or config.
> 
> This is the exact same use case I encountered with "branch
> --recurse-submodules" [1]. I think this means that we should consider
> standardizing the parsing of submodule.recurse + --recurse-submodules. I
> haven't done it yet because it's a little tricky and hard to review.
> 
> So I'll punt on this check until we get RECURSE_SUBMODULES_EXPLICIT.

Hmm...can we separate out the recurse_submodules variable into one
that's given by config and one that's given by CLI argument?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-13 22:58     ` Jonathan Tan
@ 2021-12-16 18:11       ` Glen Choo
  0 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-16 18:11 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: jonathantanmy, git

Jonathan Tan <jonathantanmy@google.com> writes:

> Glen Choo <chooglen@google.com> writes:
>> Jonathan Tan <jonathantanmy@google.com> writes:
>> 
>> > Glen Choo <chooglen@google.com> writes:
>> >> `git fetch --negotiate-only` does not fetch objects and thus, it should
>> >> not perform certain auxiliary tasks like updating submodules, updating
>> >> the commit graph, or running gc. Although send_pack() invokes `git fetch
>> >> --negotiate-only` correctly, cmd_fetch() also reads config variables,
>> >> leading to undesirable behavior, like updating submodules if
>> >> `submodule.recurse=true`.
>> >> 
>> >> Make cmd_fetch() return early if --negotiate-only was specified so that
>> >> these auxiliary tasks are skipped.
>> >> 
>> >> Signed-off-by: Glen Choo <chooglen@google.com>
>> >> ---
>> >> `git fetch --negotiate-only` is used during push negotiation to
>> >> determine the reachability of commits. As its name implies, only
>> >> negotiation is performed, not the actual fetching of objects. However,
>> >> cmd_fetch() performs certain tasks with the assumption that objects are
>> >> fetched:
>> >> 
>> >> * Submodules are updated if enabled by recurse.submodules=true, but
>> >>   negotiation fetch doesn't actually update the repo, so this doesn't
>> >>   make sense (introduced in [1]).
>> >> * Commit graphs will be written if enabled by
>> >>   fetch.writeCommitGraph=true. But according to
>> >>   Documentation/config/fetch.txt [2], this should only be done if a
>> >>   pack-file is downloaded
>> >> * gc is run, but according to [3], we only do this because we expect
>> >>   `git fetch` to introduce objects
>> >> 
>> >> Instead of disabling these tasks piecemeal, let's just make cmd_fetch()
>> >> return early if --negotiate-only was given. To accommodate possible
>> >> future options that don't fetch objects, I opted to introduce another
>> >> `if` statement instead of putting the early return in the existing
>> >> `if (negotiate_only)` block.
>> >
>> > Some of this probably should be in the commit message too.
>> 
>> I suppose you mean the explanation of why the tasks are irrelevant to
>> negotiation fetch? i.e. 
>> 
>>    * Submodules are updated if enabled by recurse.submodules=true...
>>    * Commit graphs will be written if enabled by...
>>    * gc is run, but according to [3]...
>
> Yes - why the behavior is undesirable, and the way you're doing it (by
> adding another "if" statement).
>
> After looking at this, more concretely, it might be better to use the
> part below the "---" as your commit message. :-) Just note that we're
> not just accommodating future options that don't fetch objects - "fetch"
> already may not fetch objects (e.g. if the ref we want doesn't exist or
> if we already have all the objects).

Good suggestion. I'll clean it up for brevity.

>
>> > Maybe add a check here that --recurse-submodules was not explicitly
>> > given.
>> 
>> Hm, that's not a bad idea, but it's not so easy because we don't have
>> RECURSE_SUBMODULES_EXPLICIT so it's not easy to tell whether or not
>> submodule recursion was enabled by CLI option or config.
>> 
>> This is the exact same use case I encountered with "branch
>> --recurse-submodules" [1]. I think this means that we should consider
>> standardizing the parsing of submodule.recurse + --recurse-submodules. I
>> haven't done it yet because it's a little tricky and hard to review.
>> 
>> So I'll punt on this check until we get RECURSE_SUBMODULES_EXPLICIT.
>
> Hmm...can we separate out the recurse_submodules variable into one
> that's given by config and one that's given by CLI argument?

This would be similar to what I did for branch --recurse-submodules [1],
which, as I noted in my cover letter [2], is easier to understand, but a
little inconsistent with the rest of the --recurse-submodules parsing.

I'm not very enthusiastic about adding more inconsistency, and since
this check isn't critical to this bugfix (and I think there is very
little chance that anyone would run afoul of this check any time soon)
I'd like to hold off on it until RECURSE_SUBMODULES_EXPLICIT.

Unless you think I'm missing something of course :)

[1] https://lore.kernel.org/git/20211216003213.99135-6-chooglen@google.com
[2] https://lore.kernel.org/git/20211216003213.99135-1-chooglen@google.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-07 19:29 [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2021-12-09 22:12 ` Jonathan Tan
@ 2021-12-17  0:02 ` Glen Choo
  2021-12-17 23:35   ` Junio C Hamano
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
  1 sibling, 2 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-17  0:02 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan

cmd_fetch() performs certain tasks with the assumption that objects are
fetched, but `git fetch --negotiate-only` does not fetch objects, as its
name implies. This is results in behavior that is unnecessary at best,
and incorrect at worst:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense (introduced in [1]).
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true. But according to
  Documentation/config/fetch.txt [2], this should only be done if a
  pack-file is downloaded.
* gc is run, but according to [3], we only do this because we expect
  `git fetch` to introduce objects.

Instead of disabling these tasks piecemeal, make cmd_fetch() return
early if we know for certain that objects will not be fetched. We can
return early whenever objects are not fetched, but for now this only
considers --negotiate-only.

[1] 7dce19d374 (fetch/pull: Add the --recurse-submodules option, 2010-11-12)
[2] 50f26bd035 (fetch: add fetch.writeCommitGraph config setting, 2019-09-02)
[3] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

Signed-off-by: Glen Choo <chooglen@google.com>
---
Thanks for the review, Jonathan! 

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment 

Interdiff against v1:
  diff --git a/builtin/fetch.c b/builtin/fetch.c
  index 01865b5c09..85091af99b 100644
  --- a/builtin/fetch.c
  +++ b/builtin/fetch.c
  @@ -2122,7 +2122,14 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
   
   	string_list_clear(&list, 0);
   
  -	/* skip irrelevant tasks if objects were not fetched  */
  +	/*
  +	 * Skip irrelevant tasks because we know objects were not
  +	 * fetched.
  +	 *
  +	 * NEEDSWORK: as a future optimization, we can return early
  +	 * whenever objects were not fetched e.g. if we already have all
  +	 * of them.
  +	 */
   	if (negotiate_only)
   		return result;
   

 builtin/fetch.c       | 29 ++++++++++++++++++++++++-----
 t/t5516-fetch-push.sh | 12 ++++++++++++
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f7abbc31ff..85091af99b 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1996,6 +1996,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (negotiate_only) {
+		/*
+		 * --negotiate-only should never recurse into
+		 * submodules, so there is no need to read .gitmodules.
+		 */
+		recurse_submodules = RECURSE_SUBMODULES_OFF;
+		if (!negotiation_tip.nr)
+			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
@@ -2005,9 +2016,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		fetch_config_from_gitmodules(sfjc, rs);
 	}
 
-	if (negotiate_only && !negotiation_tip.nr)
-		die(_("--negotiate-only needs one or more --negotiate-tip=*"));
-
 	if (deepen_relative) {
 		if (deepen_relative < 0)
 			die(_("Negative depth in --deepen is not supported"));
@@ -2112,6 +2120,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_multiple(&list, max_children);
 	}
 
+	string_list_clear(&list, 0);
+
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		return result;
+
 	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
 		struct strvec options = STRVEC_INIT;
 		int max_children = max_jobs;
@@ -2132,8 +2153,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 8212ca56dc..732031085e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-17  0:02 ` [PATCH v2] " Glen Choo
@ 2021-12-17 23:35   ` Junio C Hamano
  2021-12-20 19:37     ` Glen Choo
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
  1 sibling, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2021-12-17 23:35 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> cmd_fetch() performs certain tasks with the assumption that objects are
> fetched, but `git fetch --negotiate-only` does not fetch objects, as its
> name implies. This is results in behavior that is unnecessary at best,
> and incorrect at worst:
>
> * Submodules are updated if enabled by recurse.submodules=true, but
>   negotiation fetch doesn't actually update the repo, so this doesn't
>   make sense (introduced in [1]).

Hmph.

> * Commit graphs will be written if enabled by
>   fetch.writeCommitGraph=true. But according to
>   Documentation/config/fetch.txt [2], this should only be done if a
>   pack-file is downloaded.

Makes sense, as we haven't changed any reachability, and we have no
need to rewrite the graph file.

> * gc is run, but according to [3], we only do this because we expect
>   `git fetch` to introduce objects.

Makes sense.  As we haven't added any new objects, there is nothing
(other than the passage of time) that adds to the need to collect
garbage.

It makes me wonder if we need to do anything upon "fetch --dry-run".
I know we add to the object store without making anything reachable,
so that the user can do pre-flight checks with the real objects.  We
do not change the reachability so there is no reason to rewrite the
graph file, but we do add cruft to the object store.

Doing something about "--dry-run" is obviously outside the scope of
this topic, but it may make sense to think about it while we are
thinking about "fetch".

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index f7abbc31ff..85091af99b 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1996,6 +1996,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  
>  	argc = parse_options(argc, argv, prefix,
>  			     builtin_fetch_options, builtin_fetch_usage, 0);
> +
> +	if (negotiate_only) {
> +		/*
> +		 * --negotiate-only should never recurse into
> +		 * submodules, so there is no need to read .gitmodules.
> +		 */
> +		recurse_submodules = RECURSE_SUBMODULES_OFF;
> +		if (!negotiation_tip.nr)
> +			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
> +	}
> +

This means "fetch --negotiate-only --recurse-submodules" silently
ignores an explicit wish by the user.

I suspect that this part should be more like this.

	if (negitiate_only &&
	    recurse_submodules != RECURSE_SUBMODULES_OFF) {
		if (recurse_submodules came from the parse_options)
			die(_("'--%s' cannot be used with '--%s'",
			      "recurse-submodules", "negotiate-only"));
		recurse_submodules = RECURSE_SUBMODULES_OFF;
	}

That is, we complain if user gives us a combination we do not
support, but we are OK if the configuration is set to do so and
silently ignore (because we declare that the combination does not
make sense).

By the way, do not move the check about the number of negotiation
tips from the original location.  That check, or its location, have
nothing to do with what you want to do in this patch, which is "do
not gc or update the graph file if we are not fetching".  It is
better to leave unrelated changes out of the patch.

In order to tell if recurse_submodules that is not OFF came from the
call to parse_options(), you may need to capture the value of the
variable before calling parse_options() and compare it with the
current value in the above illustration code snippet I gave.

Having said all that, is it true that recurse-submodules should not
be combined with negotiate-only?  I naively think it would not be
surprising if users expect negotiate-only fetches are done also in
the submodules.  Whatever we decide the right behaviour should be,
we should document it.  With your patch without any of my above
input, I would expect at least something like

    diff --git i/Documentation/fetch-options.txt w/Documentation/fetch-options.txt
    index e967ff1874..baf2e9c50d 100644
    --- i/Documentation/fetch-options.txt
    +++ w/Documentation/fetch-options.txt
    @@ -73,6 +73,9 @@ configuration variables documented in linkgit:git-config[1], and the
     +
     Internally this is used to implement the `push.negotiate` option, see
     linkgit:git-config[1].
    ++
    +Note that this option silently makes various options that do not make
    +sense to be used together with it (e.g. `--recurse-submodules`) ignored.
     
     --dry-run::
            Show what would be done, without making any changes.

to leave wiggling room for us to silently ignore more.  We may know
about --recurse-submodules today, but I would not be surprised if we
find more.

> @@ -2112,6 +2120,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  		result = fetch_multiple(&list, max_children);
>  	}
>  
> +	string_list_clear(&list, 0);
> +
> +	/*
> +	 * Skip irrelevant tasks because we know objects were not
> +	 * fetched.
> +	 *
> +	 * NEEDSWORK: as a future optimization, we can return early
> +	 * whenever objects were not fetched e.g. if we already have all
> +	 * of them.
> +	 */
> +	if (negotiate_only)
> +		return result;
> +

I find it somewhat misleading to have the early return before the
block for recurse_submodules, as we _are_ already forcing it to not
to recurse.  It would be more readable if it went before the place
where we start doing the post-action clean-ups like reachability
graphs and garbage collection.

>  	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
>  		struct strvec options = STRVEC_INIT;
>  		int max_children = max_jobs;
> @@ -2132,8 +2153,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  		strvec_clear(&options);
>  	}
>  
> -	string_list_clear(&list, 0);
> -

Namely, here.

>  	prepare_repo_settings(the_repository);

This is existing code, but I wonder why it can be done _SO_ late in
the sequence.  We've already called the transport API for the
negotiate-only communication at this point, but a call to this
function is the only thing that gives fetch_negotiation_algorithm
member in the_repository its default value, isn't it?

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-17 23:35   ` Junio C Hamano
@ 2021-12-20 19:37     ` Glen Choo
  2021-12-20 19:56       ` Junio C Hamano
  2021-12-21 23:07       ` Glen Choo
  0 siblings, 2 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-20 19:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

>> * gc is run, but according to [3], we only do this because we expect
>>   `git fetch` to introduce objects.
>
> Makes sense.  As we haven't added any new objects, there is nothing
> (other than the passage of time) that adds to the need to collect
> garbage.
>
> It makes me wonder if we need to do anything upon "fetch --dry-run".
> I know we add to the object store without making anything reachable,
> so that the user can do pre-flight checks with the real objects.  We
> do not change the reachability so there is no reason to rewrite the
> graph file, but we do add cruft to the object store.
>
> Doing something about "--dry-run" is obviously outside the scope of
> this topic, but it may make sense to think about it while we are
> thinking about "fetch".

I hadn't considered "--dry-run". I'll think about that while I structure
this patch.

>> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> index f7abbc31ff..85091af99b 100644
>> --- a/builtin/fetch.c
>> +++ b/builtin/fetch.c
>> @@ -1996,6 +1996,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  
>>  	argc = parse_options(argc, argv, prefix,
>>  			     builtin_fetch_options, builtin_fetch_usage, 0);
>> +
>> +	if (negotiate_only) {
>> +		/*
>> +		 * --negotiate-only should never recurse into
>> +		 * submodules, so there is no need to read .gitmodules.
>> +		 */
>> +		recurse_submodules = RECURSE_SUBMODULES_OFF;
>> +		if (!negotiation_tip.nr)
>> +			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
>> +	}
>> +
>
> This means "fetch --negotiate-only --recurse-submodules" silently
> ignores an explicit wish by the user.
>
> I suspect that this part should be more like this.
>
> 	if (negitiate_only &&
> 	    recurse_submodules != RECURSE_SUBMODULES_OFF) {
> 		if (recurse_submodules came from the parse_options)
> 			die(_("'--%s' cannot be used with '--%s'",
> 			      "recurse-submodules", "negotiate-only"));
> 		recurse_submodules = RECURSE_SUBMODULES_OFF;
> 	}
>
> That is, we complain if user gives us a combination we do not
> support, but we are OK if the configuration is set to do so and
> silently ignore (because we declare that the combination does not
> make sense).

Yes, Jonathan proposed this as well. This is identical to the approach
in gc/branch-recurse-submodules, which as I noted in [1], is a bit
inconsistent with submodule parsing in general.

I decided *against* this because I thought that "--negotiate-only" is
internal-only. I don't see why a user would use "--negotiate-only", but
if this is a user journey we want to care about, then adding the
explicit check sounds ok.

> By the way, do not move the check about the number of negotiation
> tips from the original location.  That check, or its location, have
> nothing to do with what you want to do in this patch, which is "do
> not gc or update the graph file if we are not fetching".  It is
> better to leave unrelated changes out of the patch.

Ah, I see that it's not easy to tell whether or not the behavior is
correct after that line is moved. I'll avoid doing this in the future.

I still think that it is cleaner to move the negotiation_tip.nr check.
Should I do this in a follow-up patch?

> In order to tell if recurse_submodules that is not OFF came from the
> call to parse_options(), you may need to capture the value of the
> variable before calling parse_options() and compare it with the
> current value in the above illustration code snippet I gave.
>
> Having said all that, is it true that recurse-submodules should not
> be combined with negotiate-only?  I naively think it would not be
> surprising if users expect negotiate-only fetches are done also in
> the submodules.

In the current form of "--negotiate-only", no it does not make sense for
them to be combined. I think users would have the same expectations as
you if they were invoking "git fetch --negotiate-only" directly, but I'm
not convinced that that they will ever do so, except to debug push
negotiation.

I hope Jonathan can chime in to confirm whether or not users want/need
to invoke "--negotiate-only".

> Whatever we decide the right behaviour should be, we should document
> it. With your patch without any of my above input, I would expect at
> least something like
>
>     diff --git i/Documentation/fetch-options.txt w/Documentation/fetch-options.txt
>     index e967ff1874..baf2e9c50d 100644
>     --- i/Documentation/fetch-options.txt
>     +++ w/Documentation/fetch-options.txt
>     @@ -73,6 +73,9 @@ configuration variables documented in linkgit:git-config[1], and the
>      +
>      Internally this is used to implement the `push.negotiate` option, see
>      linkgit:git-config[1].
>     ++
>     +Note that this option silently makes various options that do not make
>     +sense to be used together with it (e.g. `--recurse-submodules`) ignored.
>      
>      --dry-run::
>             Show what would be done, without making any changes.
>
> to leave wiggling room for us to silently ignore more.  We may know
> about --recurse-submodules today, but I would not be surprised if we
> find more.

Sounds good.

>> @@ -2112,6 +2120,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  		result = fetch_multiple(&list, max_children);
>>  	}
>>  
>> +	string_list_clear(&list, 0);
>> +
>> +	/*
>> +	 * Skip irrelevant tasks because we know objects were not
>> +	 * fetched.
>> +	 *
>> +	 * NEEDSWORK: as a future optimization, we can return early
>> +	 * whenever objects were not fetched e.g. if we already have all
>> +	 * of them.
>> +	 */
>> +	if (negotiate_only)
>> +		return result;
>> +
>
> I find it somewhat misleading to have the early return before the
> block for recurse_submodules, as we _are_ already forcing it to not
> to recurse.  It would be more readable if it went before the place
> where we start doing the post-action clean-ups like reachability
> graphs and garbage collection.

Your feedback is valid, but I think it is true because negotiate_only
and recurse_submodules just happen to have a specifal interaction. As
indicated by the comments, this early return can apply to any situation
where objects were not fetched at all, but we only consider
negotiate_only right now.

This early return should also apply to submodules because if no new
objects were found in the superproject, all of the superproject commits
are referencing known submodule commits. If the submodule commits are
not known, they should be updated with "git submodule update", not "git
fetch".

>>  	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
>>  		struct strvec options = STRVEC_INIT;
>>  		int max_children = max_jobs;
>> @@ -2132,8 +2153,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  		strvec_clear(&options);
>>  	}
>>  
>> -	string_list_clear(&list, 0);
>> -
>
> Namely, here.

Hm, I realize that I could have used a goto instead of moving
string_list_clear()..

>
>>  	prepare_repo_settings(the_repository);
>
> This is existing code, but I wonder why it can be done _SO_ late in
> the sequence.  We've already called the transport API for the
> negotiate-only communication at this point, but a call to this
> function is the only thing that gives fetch_negotiation_algorithm
> member in the_repository its default value, isn't it?

That's right, this looks like it could be a bug. Maybe Jonathan knows
more.

[1] https://lore.kernel.org/git/kl6lwnk4to5x.fsf@chooglen-macbookpro.roam.corp.google.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-20 19:37     ` Glen Choo
@ 2021-12-20 19:56       ` Junio C Hamano
  2021-12-20 20:54         ` Glen Choo
  2021-12-21 23:07       ` Glen Choo
  1 sibling, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2021-12-20 19:56 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

>> By the way, do not move the check about the number of negotiation
>> tips from the original location.  That check, or its location, have
>> nothing to do with what you want to do in this patch, which is "do
>> not gc or update the graph file if we are not fetching".  It is
>> better to leave unrelated changes out of the patch.
>
> Ah, I see that it's not easy to tell whether or not the behavior is
> correct after that line is moved. I'll avoid doing this in the future.
>
> I still think that it is cleaner to move the negotiation_tip.nr check.
> Should I do this in a follow-up patch?

I am not seeing what makes it cleaner to have it early or at the
current position, but with "It is cleaner to do tip.nr check early
because ...", in a separate patch, it may become obvious.  But let's
not do it in this patch.

> I hope Jonathan can chime in to confirm whether or not users want/need
> to invoke "--negotiate-only".

Yeah, I knew that this was "internal implementation detail", but
then perhaps we know that a sane developer who knows what they are
doing will never write combination of it with recurse-submodule
option?  If so, we'd catch a notice developer breaking the system by
having a sanity check by detecting it as an error, no?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-20 19:56       ` Junio C Hamano
@ 2021-12-20 20:54         ` Glen Choo
  2021-12-20 22:12           ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2021-12-20 20:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>>> By the way, do not move the check about the number of negotiation
>>> tips from the original location.  That check, or its location, have
>>> nothing to do with what you want to do in this patch, which is "do
>>> not gc or update the graph file if we are not fetching".  It is
>>> better to leave unrelated changes out of the patch.
>>
>> Ah, I see that it's not easy to tell whether or not the behavior is
>> correct after that line is moved. I'll avoid doing this in the future.
>>
>> I still think that it is cleaner to move the negotiation_tip.nr check.
>> Should I do this in a follow-up patch?
>
> I am not seeing what makes it cleaner to have it early or at the
> current position, but with "It is cleaner to do tip.nr check early
> because ...", in a separate patch, it may become obvious.  But let's
> not do it in this patch.

Noted, thanks!

>> I hope Jonathan can chime in to confirm whether or not users want/need
>> to invoke "--negotiate-only".
>
> Yeah, I knew that this was "internal implementation detail", but
> then perhaps we know that a sane developer who knows what they are
> doing will never write combination of it with recurse-submodule
> option?  If so, we'd catch a notice developer breaking the system by
> having a sanity check by detecting it as an error, no?

Hm, I suppose that is true, though I have some reservations. In
particular, I'm not sure if the argument you are making is intended to
be specific to this combination of options, or all incompatible options
in general.

I think it might make sense to check for this specific combination of
"--negotiate-only" and "--recurse-submodules" because where the stakes
are low and behavior is reasonably simple to understand, though the
payoff is also pretty low.

In the more general case of "Does it *always* make sense to check for an
invalid combination of options?", I find your argument a bit troubling
because it seems to imply that anything we add to the Git CLI should be
treated as if users will depend on it. This seems like a needless burden
(or even an antipattern) if we need to fork a Git process purely as an
implementation detail, but we have to treat these internal CLI options
as user-facing.

But maybe I am misunderstanding how Git treats CLI options in general?
We don't ever really hide CLI options, even if they are only
internal. There is PARSE_OPT_HIDDEN, but those still show up in usage
and documentation [1]. So we actually do want users to know about these
implementation details?

Presumably, the thought process goes something like this: if we add an
option, a user *could* find a use for it (or it might accidentally
conflict with a user's flags), even if we never intended it for end user
consumption. Thus we need to treat all CLI options with care.

Is this a more accurate description of how we think about CLI options?

[1] This statement doesn't apply to the -- commands (like
submodule--helper). Those are 'truly' hidden because they don't have
public docs AFAIK.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-20 20:54         ` Glen Choo
@ 2021-12-20 22:12           ` Junio C Hamano
  2021-12-21  0:18             ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2021-12-20 22:12 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> But maybe I am misunderstanding how Git treats CLI options in general?
> We don't ever really hide CLI options, even if they are only
> internal. There is PARSE_OPT_HIDDEN, but those still show up in usage
> and documentation [1]. So we actually do want users to know about these
> implementation details?

At least they should be told not to use from the command line when
they have no business triggering whatever feature these options
trigger.  

> Presumably, the thought process goes something like this: if we add an
> option, a user *could* find a use for it (or it might accidentally
> conflict with a user's flags), even if we never intended it for end user
> consumption. Thus we need to treat all CLI options with care.

Rather, they can keep both halves ;-)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-20 22:12           ` Junio C Hamano
@ 2021-12-21  0:18             ` Glen Choo
  0 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-21  0:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> But maybe I am misunderstanding how Git treats CLI options in general?
>> We don't ever really hide CLI options, even if they are only
>> internal. There is PARSE_OPT_HIDDEN, but those still show up in usage
>> and documentation [1]. So we actually do want users to know about these
>> implementation details?
>
> At least they should be told not to use from the command line when
> they have no business triggering whatever feature these options
> trigger.  

Makes sense. I can add this to the docs if we think --negotiate-only
isn't meant for end users.

>> Presumably, the thought process goes something like this: if we add an
>> option, a user *could* find a use for it (or it might accidentally
>> conflict with a user's flags), even if we never intended it for end user
>> consumption. Thus we need to treat all CLI options with care.
>
> Rather, they can keep both halves ;-)

Hm, do you mean "both halves" as in, they should be able to choose
whether or not they want to use an internal option?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-20 19:37     ` Glen Choo
  2021-12-20 19:56       ` Junio C Hamano
@ 2021-12-21 23:07       ` Glen Choo
  1 sibling, 0 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-21 23:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

>>>  	prepare_repo_settings(the_repository);
>>
>> This is existing code, but I wonder why it can be done _SO_ late in
>> the sequence.  We've already called the transport API for the
>> negotiate-only communication at this point, but a call to this
>> function is the only thing that gives fetch_negotiation_algorithm
>> member in the_repository its default value, isn't it?
>
> That's right, this looks like it could be a bug. Maybe Jonathan knows
> more.

It seems that fetch negotiation always calls prepare_repo_settings().
Fetch negotiation uses negotiate_using_fetch(), which calls
fetch_negotiator_init(), which calls prepare_repo_settings():

  void fetch_negotiator_init(struct repository *r,
          struct fetch_negotiator *negotiator)
  {
    prepare_repo_settings(r);
    switch(r->settings.fetch_negotiation_algorithm) {
    case FETCH_NEGOTIATION_SKIPPING:
      skipping_negotiator_init(negotiator);
      return;

    case FETCH_NEGOTIATION_NOOP:
      noop_negotiator_init(negotiator);
      return;

    case FETCH_NEGOTIATION_DEFAULT:
      default_negotiator_init(negotiator);
      return;
    }
  }

If anything, this seems safer than calling prepare_repo_settings() in
cmd_fetch() :)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 0/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-17  0:02 ` [PATCH v2] " Glen Choo
  2021-12-17 23:35   ` Junio C Hamano
@ 2021-12-22  0:11   ` Glen Choo
  2021-12-22  0:11     ` [PATCH v3 1/3] builtin/fetch: use goto cleanup in cmd_fetch() Glen Choo
                       ` (3 more replies)
  1 sibling, 4 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-22  0:11 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

cmd_fetch() performs certain tasks with the assumption that objects are
fetched, but `git fetch --negotiate-only` does not fetch objects, as its
name implies. This is results in behavior that is unnecessary at best,
and incorrect at worst:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense (introduced in [1]).
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true. But according to
  Documentation/config/fetch.txt [2], this should only be done if a
  pack-file is downloaded.
* gc is run, but according to [3], we only do this because we expect
  `git fetch` to introduce objects.

Make `git fetch --negotiate-only` handle these tasks more rigorously by
doing the following:

* Make cmd_fetch() return early if we know for certain that objects will
  not be fetched
* Disable submodule recursion and die() if a user explicitly asks for it

[1] 7dce19d374 (fetch/pull: Add the --recurse-submodules option, 2010-11-12)
[2] 50f26bd035 (fetch: add fetch.writeCommitGraph config setting, 2019-09-02)
[3] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

Changes since v2:
* added a prepatory patch that introduces a "goto cleanup"
* drop an unnecessary line move (as suggested by Junio)
* check for user-given --recurse-submodules and die() (as suggested by
  Jonathan and Junio)
* update --negotiate-only's documentation 

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment 

Glen Choo (3):
  builtin/fetch: use goto cleanup in cmd_fetch()
  builtin/fetch: skip unnecessary tasks when using --negotiate-only
  builtin/fetch: die on --negotiate-only and --recurse-submodules

 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 41 +++++++++++++++++++++++++++++----
 t/t5516-fetch-push.sh           | 12 ++++++++++
 t/t5702-protocol-v2.sh          | 17 ++++++++++++++
 4 files changed, 67 insertions(+), 4 deletions(-)

Range-diff against v2:
-:  ---------- > 1:  97e2cba633 builtin/fetch: use goto cleanup in cmd_fetch()
1:  9d18270a41 ! 2:  3a3a04b649 builtin/fetch: skip unnecessary tasks when using --negotiate-only
    @@ Commit message
         return early whenever objects are not fetched, but for now this only
         considers --negotiate-only.
     
    +    A similar optimization would be to skip irrelevant tasks in `git fetch
    +    --dry-run`. This optimization was not done in this commit because a dry
    +    run will actually fetch objects; we'd presumably still want to recurse
    +    into submodules and run gc.
    +
         [1] 7dce19d374 (fetch/pull: Add the --recurse-submodules option, 2010-11-12)
         [2] 50f26bd035 (fetch: add fetch.writeCommitGraph config setting, 2019-09-02)
         [3] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +		 * submodules, so there is no need to read .gitmodules.
     +		 */
     +		recurse_submodules = RECURSE_SUBMODULES_OFF;
    -+		if (!negotiation_tip.nr)
    -+			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
     +	}
     +
      	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
      		int *sfjc = submodule_fetch_jobs_config == -1
      			    ? &submodule_fetch_jobs_config : NULL;
    -@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
    - 		fetch_config_from_gitmodules(sfjc, rs);
    - 	}
    - 
    --	if (negotiate_only && !negotiation_tip.nr)
    --		die(_("--negotiate-only needs one or more --negotiate-tip=*"));
    --
    - 	if (deepen_relative) {
    - 		if (deepen_relative < 0)
    - 			die(_("Negative depth in --deepen is not supported"));
     @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
      		result = fetch_multiple(&list, max_children);
      	}
      
    -+	string_list_clear(&list, 0);
    -+
     +	/*
     +	 * Skip irrelevant tasks because we know objects were not
     +	 * fetched.
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +	 * of them.
     +	 */
     +	if (negotiate_only)
    -+		return result;
    ++		goto cleanup;
     +
      	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
      		struct strvec options = STRVEC_INIT;
      		int max_children = max_jobs;
    -@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
    - 		strvec_clear(&options);
    - 	}
    - 
    --	string_list_clear(&list, 0);
    --
    - 	prepare_repo_settings(the_repository);
    - 	if (fetch_write_commit_graph > 0 ||
    - 	    (fetch_write_commit_graph < 0 &&
     
      ## t/t5516-fetch-push.sh ##
     @@ t/t5516-fetch-push.sh: test_expect_success 'push with negotiation proceeds anyway even if negotiation f
-:  ---------- > 3:  97792b5214 builtin/fetch: die on --negotiate-only and --recurse-submodules

base-commit: abe6bb3905392d5eb6b01fa6e54d7e784e0522aa
-- 
2.33.GIT


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 1/3] builtin/fetch: use goto cleanup in cmd_fetch()
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
@ 2021-12-22  0:11     ` Glen Choo
  2021-12-22  0:11     ` [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-22  0:11 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared.

The string_list_clear() call is purely cleanup; the string_list was not
reused.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f7abbc31ff..eab73d7417 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2076,7 +2076,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			gtransport->smart_options->acked_commits = &acked_commits;
 		} else {
 			warning(_("Protocol does not support --negotiate-only, exiting."));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 		if (server_options.nr)
 			gtransport->server_options = &server_options;
@@ -2132,8 +2133,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
@@ -2151,5 +2150,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (enable_auto_gc)
 		run_auto_maintenance(verbosity < 0);
 
+ cleanup:
+	string_list_clear(&list, 0);
 	return result;
 }
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
  2021-12-22  0:11     ` [PATCH v3 1/3] builtin/fetch: use goto cleanup in cmd_fetch() Glen Choo
@ 2021-12-22  0:11     ` Glen Choo
  2021-12-22  6:42       ` Junio C Hamano
  2021-12-22  0:11     ` [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules Glen Choo
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2021-12-22  0:11 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

cmd_fetch() performs certain tasks with the assumption that objects are
fetched, but `git fetch --negotiate-only` does not fetch objects, as its
name implies. This is results in behavior that is unnecessary at best,
and incorrect at worst:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense (introduced in [1]).
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true. But according to
  Documentation/config/fetch.txt [2], this should only be done if a
  pack-file is downloaded.
* gc is run, but according to [3], we only do this because we expect
  `git fetch` to introduce objects.

Instead of disabling these tasks piecemeal, make cmd_fetch() return
early if we know for certain that objects will not be fetched. We can
return early whenever objects are not fetched, but for now this only
considers --negotiate-only.

A similar optimization would be to skip irrelevant tasks in `git fetch
--dry-run`. This optimization was not done in this commit because a dry
run will actually fetch objects; we'd presumably still want to recurse
into submodules and run gc.

[1] 7dce19d374 (fetch/pull: Add the --recurse-submodules option, 2010-11-12)
[2] 50f26bd035 (fetch: add fetch.writeCommitGraph config setting, 2019-09-02)
[3] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c       | 20 ++++++++++++++++++++
 t/t5516-fetch-push.sh | 12 ++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index eab73d7417..883bb1b10c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1996,6 +1996,15 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (negotiate_only) {
+		/*
+		 * --negotiate-only should never recurse into
+		 * submodules, so there is no need to read .gitmodules.
+		 */
+		recurse_submodules = RECURSE_SUBMODULES_OFF;
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
@@ -2113,6 +2122,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_multiple(&list, max_children);
 	}
 
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		goto cleanup;
+
 	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
 		struct strvec options = STRVEC_INIT;
 		int max_children = max_jobs;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 8212ca56dc..732031085e 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
  2021-12-22  0:11     ` [PATCH v3 1/3] builtin/fetch: use goto cleanup in cmd_fetch() Glen Choo
  2021-12-22  0:11     ` [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2021-12-22  0:11     ` Glen Choo
  2021-12-22  6:46       ` Junio C Hamano
  2021-12-23 19:08       ` Jonathan Tan
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 2 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-22  0:11 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

The previous commit ignores the value of --recurse-submodules if
--negotiate-only is given. Since non "no" values of --recurse-submodules
are not supported with --negotiate-only, make cmd_fetch() check for
this invalid combination and die.

This is unlikely to affect internal usage of --negotiate-only, but it
may be helpful for users. We may also want to discourage users from
using --negotiate-only altogether because it was not intended for them.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 24 ++++++++++++++++++------
 t/t5702-protocol-v2.sh          | 17 +++++++++++++++++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index e967ff1874..81a100d593 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -71,6 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
+This is incompatible with `--recurse-submodules[=yes|on-demand]`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 883bb1b10c..54f970130e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -75,6 +75,7 @@ static struct transport *gtransport;
 static struct transport *gsecondary;
 static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_explicit = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
@@ -166,7 +167,7 @@ static struct option builtin_fetch_options[] = {
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
 		 N_("prune local tags no longer on remote and clobber changed tags")),
-	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_explicit, N_("on-demand"),
 		    N_("control recursive fetching of submodules"),
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
@@ -1997,12 +1998,23 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
 
+	if (recurse_submodules_explicit != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_explicit;
+
 	if (negotiate_only) {
-		/*
-		 * --negotiate-only should never recurse into
-		 * submodules, so there is no need to read .gitmodules.
-		 */
-		recurse_submodules = RECURSE_SUBMODULES_OFF;
+		switch (recurse_submodules_explicit) {
+		case RECURSE_SUBMODULES_OFF:
+		case RECURSE_SUBMODULES_DEFAULT: {
+			/*
+			 * --negotiate-only should never recurse into
+			 * submodules, so there is no need to read .gitmodules.
+			 */
+			recurse_submodules = RECURSE_SUBMODULES_OFF;
+			break;
+		}
+		default:
+			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+		}
 	}
 
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49..d099c46499 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -628,6 +628,23 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
 	test_cmp err.expect err.actual
 '
 
+test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
+	SERVER="server" &&
+	URI="file://$(pwd)/server" &&
+
+	setup_negotiate_only "$SERVER" "$URI" &&
+
+	cat >err.expect <<-\EOF &&
+	fatal: --negotiate-only and --recurse-submodules cannot be used together
+	EOF
+
+	test_must_fail git -c protocol.version=2 -C client fetch \
+		--negotiate-only \
+		--recurse-submodules \
+		origin 2>err.actual &&
+	test_cmp err.expect err.actual
+'
+
 test_expect_success 'file:// --negotiate-only' '
 	SERVER="server" &&
 	URI="file://$(pwd)/server" &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22  0:11     ` [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2021-12-22  6:42       ` Junio C Hamano
  2021-12-22 17:28         ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2021-12-22  6:42 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> @@ -2113,6 +2122,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  		result = fetch_multiple(&list, max_children);
>  	}
>  
> +	/*
> +	 * Skip irrelevant tasks because we know objects were not
> +	 * fetched.
> +	 *
> +	 * NEEDSWORK: as a future optimization, we can return early
> +	 * whenever objects were not fetched e.g. if we already have all
> +	 * of them.
> +	 */
> +	if (negotiate_only)
> +		goto cleanup;

Sorry if I did not mention this in the review of the earlier round,
but I think the location this patch places the jump is wrong,
especially with the NEEDSWORK comment.

When we are not under negotiate_only, if our earlier call to
transport_fetch_refs() learns to tell us that that we did not add
any new objects, we would be able to jump to the clean-up label,
making the above code to:

	if (negotiate_only || !num_fetched_objects)
		goto cleanup;

But such a future enhancement is made harder by having this jump
here---the optimization the NEEDSWORK comment alludes to has no
reason to be incompatible with "--recurse-submodules".

If the above block is placed _after_ the "if the main fetch was
successful, and we are not told not to recurse into submodules, then
do this" block we see below, then

 (1) this patch still achieves its goal, as we have manually
     and unconditionally turned recursion off;

 (2) such a future enhancement will not be forbidden from working
     with recurse-submodules feature.

>  	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
>  		struct strvec options = STRVEC_INIT;
>  		int max_children = max_jobs;
> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
> index 8212ca56dc..732031085e 100755
> --- a/t/t5516-fetch-push.sh
> +++ b/t/t5516-fetch-push.sh
> @@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
>  	test_i18ngrep "push negotiation failed" err
>  '
>  
> +test_expect_success 'push with negotiation does not attempt to fetch submodules' '
> +	mk_empty submodule_upstream &&
> +	test_commit -C submodule_upstream submodule_commit &&
> +	git submodule add ./submodule_upstream submodule &&
> +	mk_empty testrepo &&
> +	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
> +	test_commit -C testrepo unrelated_commit &&
> +	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
> +	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
> +	! grep "Fetching submodule" err
> +'
> +
>  test_expect_success 'push without wildcard' '
>  	mk_empty testrepo &&

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules
  2021-12-22  0:11     ` [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules Glen Choo
@ 2021-12-22  6:46       ` Junio C Hamano
  2021-12-23 19:08       ` Jonathan Tan
  1 sibling, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2021-12-22  6:46 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> +This is incompatible with `--recurse-submodules[=yes|on-demand]`.
>  Internally this is used to implement the `push.negotiate` option, see
>  linkgit:git-config[1].

Makes sense.

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 883bb1b10c..54f970130e 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -75,6 +75,7 @@ static struct transport *gtransport;
>  static struct transport *gsecondary;
>  static const char *submodule_prefix = "";
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> +static int recurse_submodules_explicit = RECURSE_SUBMODULES_DEFAULT;
>  static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
>  static int shown_url = 0;
>  static struct refspec refmap = REFSPEC_INIT_FETCH;
> @@ -166,7 +167,7 @@ static struct option builtin_fetch_options[] = {
>  		 N_("prune remote-tracking branches no longer on remote")),
>  	OPT_BOOL('P', "prune-tags", &prune_tags,
>  		 N_("prune local tags no longer on remote and clobber changed tags")),
> -	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
> +	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_explicit, N_("on-demand"),
>  		    N_("control recursive fetching of submodules"),
>  		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
>  	OPT_BOOL(0, "dry-run", &dry_run,
> @@ -1997,12 +1998,23 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix,
>  			     builtin_fetch_options, builtin_fetch_usage, 0);
>  
> +	if (recurse_submodules_explicit != RECURSE_SUBMODULES_DEFAULT)
> +		recurse_submodules = recurse_submodules_explicit;
> +
>  	if (negotiate_only) {
> -		/*
> -		 * --negotiate-only should never recurse into
> -		 * submodules, so there is no need to read .gitmodules.
> -		 */
> -		recurse_submodules = RECURSE_SUBMODULES_OFF;
> +		switch (recurse_submodules_explicit) {
> +		case RECURSE_SUBMODULES_OFF:
> +		case RECURSE_SUBMODULES_DEFAULT: {
> +			/*
> +			 * --negotiate-only should never recurse into
> +			 * submodules, so there is no need to read .gitmodules.
> +			 */
> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
> +			break;
> +		}
> +		default:
> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +		}
>  	}

Nice. Very cleanly done.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22  6:42       ` Junio C Hamano
@ 2021-12-22 17:28         ` Glen Choo
  2021-12-22 19:29           ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2021-12-22 17:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> @@ -2113,6 +2122,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  		result = fetch_multiple(&list, max_children);
>>  	}
>>  
>> +	/*
>> +	 * Skip irrelevant tasks because we know objects were not
>> +	 * fetched.
>> +	 *
>> +	 * NEEDSWORK: as a future optimization, we can return early
>> +	 * whenever objects were not fetched e.g. if we already have all
>> +	 * of them.
>> +	 */
>> +	if (negotiate_only)
>> +		goto cleanup;
>
> Sorry if I did not mention this in the review of the earlier round,
> but I think the location this patch places the jump is wrong,
> especially with the NEEDSWORK comment.
>
> When we are not under negotiate_only, if our earlier call to
> transport_fetch_refs() learns to tell us that that we did not add
> any new objects, we would be able to jump to the clean-up label,
> making the above code to:
>
> 	if (negotiate_only || !num_fetched_objects)
> 		goto cleanup;

Thanks for the clarification. Yes, we agree that the location of the
jump in this patch should be the same as the location of the jump after
the future optimization.

> But such a future enhancement is made harder by having this jump
> here---the optimization the NEEDSWORK comment alludes to has no
> reason to be incompatible with "--recurse-submodules".
>
> If the above block is placed _after_ the "if the main fetch was
> successful, and we are not told not to recurse into submodules, then
> do this" block we see below, then
>
>  (1) this patch still achieves its goal, as we have manually
>      and unconditionally turned recursion off;
>
>  (2) such a future enhancement will not be forbidden from working
>      with recurse-submodules feature.
>

I would have come to same conclusion if I agreed that we should recurse
into submodules even if no objects are fetched. When I first wrote this
patch, I was convinced that "no new objects" implies "no need to update
submodules" (see my response at [1]), but I'm not sure any more and I'd
like to check my understanding.

The way "fetch --recurse-submodules" works is that the changed
submodules are calcuated from the newly updated tips fetched from the
remote. If no objects were fetched, we already have all of the
superproject commits.

In ~99% of the time, no objects were fetched because the remote doesn't
have any info we do not know about - there are no new commits and no
refs were updated. In this scenario, 'git fetch' can avoid recursing
into submodules because there's no need to. But if we choose to recurse,
the worst thing that happens is that we do some file I/O and realize
that there are no changed submodules - essentially a no-op given that
fetch is slow.

(This is where my understanding of objects vs refs needs to be checked)
In the other ~1%, we might already have all commits, but a remote ref
might still have moved, albeit to a known commit. In this case, the
submodule would need to be updated because it might have changed.

If my understanding is correct, then my patch produces the wrong
behavior in that ~1%. But even if my understanding is wrong, and we
don't need to worry about that edge case, I see that there's unnecessary
risk in trying to be too be clever in my reasoning and skipping what is
essentially a no-op.

Is my understanding accurate? At any rate, I'm somewhat convinced to
move the jump to just after the "if main fetch was successful, and we
are not told not to recurse into submodules" block, i.e. before the "if
we should write the commit graph" block.

[1] https://lore.kernel.org/git/kl6ltuf3ysnw.fsf@chooglen-macbookpro.roam.corp.google.com

>>  	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
>>  		struct strvec options = STRVEC_INIT;
>>  		int max_children = max_jobs;
>> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
>> index 8212ca56dc..732031085e 100755
>> --- a/t/t5516-fetch-push.sh
>> +++ b/t/t5516-fetch-push.sh
>> @@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
>>  	test_i18ngrep "push negotiation failed" err
>>  '
>>  
>> +test_expect_success 'push with negotiation does not attempt to fetch submodules' '
>> +	mk_empty submodule_upstream &&
>> +	test_commit -C submodule_upstream submodule_commit &&
>> +	git submodule add ./submodule_upstream submodule &&
>> +	mk_empty testrepo &&
>> +	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
>> +	test_commit -C testrepo unrelated_commit &&
>> +	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
>> +	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
>> +	! grep "Fetching submodule" err
>> +'
>> +
>>  test_expect_success 'push without wildcard' '
>>  	mk_empty testrepo &&

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22 17:28         ` Glen Choo
@ 2021-12-22 19:29           ` Junio C Hamano
  2021-12-22 20:27             ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2021-12-22 19:29 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> I would have come to same conclusion if I agreed that we should recurse
> into submodules even if no objects are fetched. When I first wrote this
> patch, I was convinced that "no new objects" implies "no need to update
> submodules" (see my response at [1]), but I'm not sure any more and I'd
> like to check my understanding.

For example, there is a "everything_local()" optimization in the
fetch-pack codepath where the following steps happen:

 (1) we look at the current value of the remote refs we are fetching
     from their ls_refs output (let's call it "new tips");

 (2) we notice that all of these objects happen to exist in our
     object store.

 (3) we make sure that we do not see any "missing links" if we run a
     reachability traversal that starts from these objects and our
     existing refs, and stops when the traversal intersect.

When the last step finds that all objects necessary to point at
these "new tips" with our refs safely, then we have no reason to
perform physical transfer of objects.  Yet, we'd update our refs
to the "new tips".

This can happen in a number of ways.  

Imagine that you have a clone of https://github.com/git/git/ for
only its 'main' branch (i.e. a single-branch clone).  If you then
say "git fetch origin maint:maint", we'll learn that the tip of
their 'maint' branch points at a commit, we look into our object
store, find that there is no missing object to reach from it to the
part of the object graph that is reachable from our refs (i.e. my
'maint' is always an ancestor of my 'main'), and we find that there
is no reason to transfer any object.  Yet we will carete a new ref
and point at the commit.

Or if you did "git branch -d" locally, making objects unreachable in
your object store, and then fetch from your upstream, which had fast
forwarded to the contents of the branch you just deleted.

Or they rewound and rebuilt their branches since you fetched the
last time, and then they realized their mistake and now their refs
point at a commit that you have already seen but are different from
what your remote-tracking branches point at now.

Or you are using Derrick's "prefetch" (in "git maintenance run") and
a background process already downloaded the objects needed for the
branch you are fetching in the past.

Depending on what happened when these objects were pre-fetched, such
a real fetch that did not have to perform an object transfer may
likely to need to adjust things in the submodule repository.
"prefetch" is designed not to disrupt and to be invisible to the
normal operation as much as possible, so I would expect that it
won't do any priming of the submodules based on what it prefetched
for the superproject, for example.

So in short, physical object transfer can be optimized out, even
when the external world view, i.e. where in the history graph the
refs point at, changes and makes it necessary to check in the
submodule repositories.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22 19:29           ` Junio C Hamano
@ 2021-12-22 20:27             ` Glen Choo
  0 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2021-12-22 20:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> I would have come to same conclusion if I agreed that we should recurse
>> into submodules even if no objects are fetched. When I first wrote this
>> patch, I was convinced that "no new objects" implies "no need to update
>> submodules" (see my response at [1]), but I'm not sure any more and I'd
>> like to check my understanding.
>
> For example, there is a "everything_local()" optimization in the
> fetch-pack codepath where the following steps happen:
>
>  (1) we look at the current value of the remote refs we are fetching
>      from their ls_refs output (let's call it "new tips");
>
>  (2) we notice that all of these objects happen to exist in our
>      object store.
>
>  (3) we make sure that we do not see any "missing links" if we run a
>      reachability traversal that starts from these objects and our
>      existing refs, and stops when the traversal intersect.
>
> When the last step finds that all objects necessary to point at
> these "new tips" with our refs safely, then we have no reason to
> perform physical transfer of objects.  Yet, we'd update our refs
> to the "new tips".
>
> This can happen in a number of ways.  
>
> Imagine that you have a clone of https://github.com/git/git/ for
> only its 'main' branch (i.e. a single-branch clone).  If you then
> say "git fetch origin maint:maint", we'll learn that the tip of
> their 'maint' branch points at a commit, we look into our object
> store, find that there is no missing object to reach from it to the
> part of the object graph that is reachable from our refs (i.e. my
> 'maint' is always an ancestor of my 'main'), and we find that there
> is no reason to transfer any object.  Yet we will carete a new ref
> and point at the commit.
>
> Or if you did "git branch -d" locally, making objects unreachable in
> your object store, and then fetch from your upstream, which had fast
> forwarded to the contents of the branch you just deleted.
>
> Or they rewound and rebuilt their branches since you fetched the
> last time, and then they realized their mistake and now their refs
> point at a commit that you have already seen but are different from
> what your remote-tracking branches point at now.
>
> Or you are using Derrick's "prefetch" (in "git maintenance run") and
> a background process already downloaded the objects needed for the
> branch you are fetching in the past.
>
> Depending on what happened when these objects were pre-fetched, such
> a real fetch that did not have to perform an object transfer may
> likely to need to adjust things in the submodule repository.
> "prefetch" is designed not to disrupt and to be invisible to the
> normal operation as much as possible, so I would expect that it
> won't do any priming of the submodules based on what it prefetched
> for the superproject, for example.

Thanks for providing concrete examples; that's a lot more convincing
than my hypothetical.

> So in short, physical object transfer can be optimized out, even
> when the external world view, i.e. where in the history graph the
> refs point at, changes and makes it necessary to check in the
> submodule repositories.

Yes, you're right. I'll move the jump accordingly :)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules
  2021-12-22  0:11     ` [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules Glen Choo
  2021-12-22  6:46       ` Junio C Hamano
@ 2021-12-23 19:08       ` Jonathan Tan
  1 sibling, 0 replies; 50+ messages in thread
From: Jonathan Tan @ 2021-12-23 19:08 UTC (permalink / raw)
  To: chooglen; +Cc: git, jonathantanmy, gitster

Glen Choo <chooglen@google.com> writes:
> The previous commit ignores the value of --recurse-submodules if
> --negotiate-only is given. Since non "no" values of --recurse-submodules
> are not supported with --negotiate-only, make cmd_fetch() check for
> this invalid combination and die.
> 
> This is unlikely to affect internal usage of --negotiate-only, but it
> may be helpful for users. We may also want to discourage users from
> using --negotiate-only altogether because it was not intended for them.

All 3 patches look good, and I'm happy if they were merged as-is.

Having said that, I would delete the last sentence - I envision that
other tools may want to use --negotiate-only to be able to better query
a Git server.

> @@ -75,6 +75,7 @@ static struct transport *gtransport;
>  static struct transport *gsecondary;
>  static const char *submodule_prefix = "";
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> +static int recurse_submodules_explicit = RECURSE_SUBMODULES_DEFAULT;

I would call this "recurse_submodules_cli" as a config variable could
also be considered explicit.

But again, you can consider all these suggestions as optional.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
                       ` (2 preceding siblings ...)
  2021-12-22  0:11     ` [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules Glen Choo
@ 2022-01-13  0:44     ` Glen Choo
  2022-01-13  0:44       ` [PATCH v4 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
                         ` (3 more replies)
  3 siblings, 4 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-13  0:44 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

`git fetch --negotiate-only` is used internally by push negotation and
it behaves very differently from other uses of `git fetch`, e.g. it does
not update refs or fetch objects. But because of how cmd_fetch() is
written, `git fetch --negotiate-only` performs tasks that it shouldn't.
This is results in behavior that is unnecessary at best, and incorrect
at worst:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense.
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true, but this is unnecessary because no
  objects are fetched [1]. 
* gc is run, but according to the commit message in [2], we only do this
  because we expect `git fetch` to introduce objects.

Make `git fetch --negotiate-only` handle these tasks more rigorously by
doing the following:

* Make cmd_fetch() skip irrelevant tasks if we know for certain that
objects will not be fetched
* Disable submodule recursion and die() if a user explicitly asks for it

[1] This is also confirmed by Documentation/config/fetch.txt, which
  states that Git should only write commit graphs if a pack-file is
  downloaded.
[2] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

Changes since v3:
* change commit message subject: builtin/fetch -> fetch --negotiate-only
* move the 'goto cleanup' to _after_ the submodule updating task because
  we may want to update submodules even if objects were not fetched (as
  pointed out by Junio, thanks!)
* disable submodule recursion in the patch that checks for
  --negotiate-only + --recurse-submodules, so we never silently ignore
  --recurse-submodules.
* incorporate some of Jonathan's suggestions (thanks!)

Changes since v2:
* added a prepatory patch that introduces a "goto cleanup"
* drop an unnecessary line move (as suggested by Junio)
* check for user-given --recurse-submodules and die() (as suggested by
  Jonathan and Junio)
* update --negotiate-only's documentation

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment

Glen Choo (3):
  fetch: use goto cleanup in cmd_fetch()
  fetch: skip tasks related to fetching objects
  fetch --negotiate-only: do not update submodules

 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 40 ++++++++++++++++++++++++++++++---
 t/t5516-fetch-push.sh           | 12 ++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++
 4 files changed, 62 insertions(+), 3 deletions(-)

Range-diff against v3:
1:  97e2cba633 ! 1:  ffa1a24109 builtin/fetch: use goto cleanup in cmd_fetch()
    @@ Metadata
     Author: Glen Choo <chooglen@google.com>
     
      ## Commit message ##
    -    builtin/fetch: use goto cleanup in cmd_fetch()
    +    fetch: use goto cleanup in cmd_fetch()
     
         Replace an early return with 'goto cleanup' in cmd_fetch() so that the
    -    string_list is always cleared.
    -
    -    The string_list_clear() call is purely cleanup; the string_list was not
    -    reused.
    +    string_list is always cleared (the string_list_clear() call is purely
    +    cleanup; the string_list is not reused). This makes cleanup consistent
    +    so that a subsequent commit can use 'goto cleanup' to bail out early.
     
         Signed-off-by: Glen Choo <chooglen@google.com>
     
    @@ builtin/fetch.c
     @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
      			gtransport->smart_options->acked_commits = &acked_commits;
      		} else {
    - 			warning(_("Protocol does not support --negotiate-only, exiting."));
    + 			warning(_("protocol does not support --negotiate-only, exiting"));
     -			return 1;
     +			result = 1;
     +			goto cleanup;
2:  3a3a04b649 < -:  ---------- builtin/fetch: skip unnecessary tasks when using --negotiate-only
-:  ---------- > 2:  b0c73e8135 fetch: skip tasks related to fetching objects
3:  97792b5214 ! 3:  914d30866f builtin/fetch: die on --negotiate-only and --recurse-submodules
    @@ Metadata
     Author: Glen Choo <chooglen@google.com>
     
      ## Commit message ##
    -    builtin/fetch: die on --negotiate-only and --recurse-submodules
    +    fetch --negotiate-only: do not update submodules
     
    -    The previous commit ignores the value of --recurse-submodules if
    -    --negotiate-only is given. Since non "no" values of --recurse-submodules
    -    are not supported with --negotiate-only, make cmd_fetch() check for
    -    this invalid combination and die.
    +    `git fetch --negotiate-only` is an implementation detail of push
    +    negotiation and, unlike most `git fetch` invocations, does not actually
    +    update the main repository. Thus it should not update submodules even
    +    if submodule recursion is enabled.
     
    -    This is unlikely to affect internal usage of --negotiate-only, but it
    -    may be helpful for users. We may also want to discourage users from
    -    using --negotiate-only altogether because it was not intended for them.
    +    This is not just slow, it is wrong e.g. push negotiation with
    +    "submodule.recurse=true" will cause submodules to be updated because it
    +    invokes `git fetch --negotiate-only`.
    +
    +    Fix this by disabling submodule recursion if --negotiate-only was given.
    +    Since this makes --negotiate-only and --recurse-submodules incompatible,
    +    check for this invalid combination and die.
    +
    +    This does not use the "goto cleanup" introduced in the previous commit
    +    because we want to recurse through submodules whenever a ref is fetched,
    +    and this can happen without introducing new objects.
     
         Signed-off-by: Glen Choo <chooglen@google.com>
     
    @@ Documentation/fetch-options.txt: configuration variables documented in linkgit:g
      	ancestors of the provided `--negotiation-tip=*` arguments,
      	which we have in common with the server.
      +
    -+This is incompatible with `--recurse-submodules[=yes|on-demand]`.
    ++This is incompatible with `--recurse-submodules=[yes|on-demand]`.
      Internally this is used to implement the `push.negotiate` option, see
      linkgit:git-config[1].
      
    @@ builtin/fetch.c: static struct transport *gtransport;
      static struct transport *gsecondary;
      static const char *submodule_prefix = "";
      static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
    -+static int recurse_submodules_explicit = RECURSE_SUBMODULES_DEFAULT;
    ++static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
      static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
      static int shown_url = 0;
      static struct refspec refmap = REFSPEC_INIT_FETCH;
    @@ builtin/fetch.c: static struct option builtin_fetch_options[] = {
      	OPT_BOOL('P', "prune-tags", &prune_tags,
      		 N_("prune local tags no longer on remote and clobber changed tags")),
     -	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
    -+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_explicit, N_("on-demand"),
    ++	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
      		    N_("control recursive fetching of submodules"),
      		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
      	OPT_BOOL(0, "dry-run", &dry_run,
     @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
    + 
      	argc = parse_options(argc, argv, prefix,
      			     builtin_fetch_options, builtin_fetch_usage, 0);
    - 
    -+	if (recurse_submodules_explicit != RECURSE_SUBMODULES_DEFAULT)
    -+		recurse_submodules = recurse_submodules_explicit;
     +
    - 	if (negotiate_only) {
    --		/*
    --		 * --negotiate-only should never recurse into
    --		 * submodules, so there is no need to read .gitmodules.
    --		 */
    --		recurse_submodules = RECURSE_SUBMODULES_OFF;
    -+		switch (recurse_submodules_explicit) {
    ++	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
    ++		recurse_submodules = recurse_submodules_cli;
    ++
    ++	if (negotiate_only) {
    ++		switch (recurse_submodules_cli) {
     +		case RECURSE_SUBMODULES_OFF:
     +		case RECURSE_SUBMODULES_DEFAULT: {
     +			/*
     +			 * --negotiate-only should never recurse into
    -+			 * submodules, so there is no need to read .gitmodules.
    ++			 * submodules. Skip it by setting recurse_submodules to
    ++			 * RECURSE_SUBMODULES_OFF.
     +			 */
     +			recurse_submodules = RECURSE_SUBMODULES_OFF;
     +			break;
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +		default:
     +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
     +		}
    - 	}
    - 
    ++	}
    ++
      	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
    + 		int *sfjc = submodule_fetch_jobs_config == -1
    + 			    ? &submodule_fetch_jobs_config : NULL;
    +
    + ## t/t5516-fetch-push.sh ##
    +@@ t/t5516-fetch-push.sh: test_expect_success 'push with negotiation proceeds anyway even if negotiation f
    + 	test_i18ngrep "push negotiation failed" err
    + '
    + 
    ++test_expect_success 'push with negotiation does not attempt to fetch submodules' '
    ++	mk_empty submodule_upstream &&
    ++	test_commit -C submodule_upstream submodule_commit &&
    ++	git submodule add ./submodule_upstream submodule &&
    ++	mk_empty testrepo &&
    ++	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
    ++	test_commit -C testrepo unrelated_commit &&
    ++	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
    ++	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
    ++	! grep "Fetching submodule" err
    ++'
    ++
    + test_expect_success 'push without wildcard' '
    + 	mk_empty testrepo &&
    + 
     
      ## t/t5702-protocol-v2.sh ##
     @@ t/t5702-protocol-v2.sh: test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
    @@ t/t5702-protocol-v2.sh: test_expect_success 'usage: --negotiate-only without --n
      '
      
     +test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
    -+	SERVER="server" &&
    -+	URI="file://$(pwd)/server" &&
    -+
    -+	setup_negotiate_only "$SERVER" "$URI" &&
    -+
     +	cat >err.expect <<-\EOF &&
     +	fatal: --negotiate-only and --recurse-submodules cannot be used together
     +	EOF

base-commit: 90d242d36e248acfae0033274b524bfa55a947fd
-- 
2.33.GIT


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 1/3] fetch: use goto cleanup in cmd_fetch()
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2022-01-13  0:44       ` Glen Choo
  2022-01-13  0:45       ` [PATCH v4 2/3] fetch: skip tasks related to fetching objects Glen Choo
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-13  0:44 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared (the string_list_clear() call is purely
cleanup; the string_list is not reused). This makes cleanup consistent
so that a subsequent commit can use 'goto cleanup' to bail out early.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index eaab8056bf..0a625cb137 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2095,7 +2095,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			gtransport->smart_options->acked_commits = &acked_commits;
 		} else {
 			warning(_("protocol does not support --negotiate-only, exiting"));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 		if (server_options.nr)
 			gtransport->server_options = &server_options;
@@ -2151,8 +2152,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
@@ -2170,5 +2169,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (enable_auto_gc)
 		run_auto_maintenance(verbosity < 0);
 
+ cleanup:
+	string_list_clear(&list, 0);
 	return result;
 }
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 2/3] fetch: skip tasks related to fetching objects
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-13  0:44       ` [PATCH v4 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
@ 2022-01-13  0:45       ` Glen Choo
  2022-01-13  0:45       ` [PATCH v4 3/3] fetch --negotiate-only: do not update submodules Glen Choo
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-13  0:45 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

cmd_fetch() does the following with the assumption that objects are
fetched:

* Run gc
* Write commit graphs (if enabled by fetch.writeCommitGraph=true)

However, neither of these tasks makes sense if objects are not fetched
e.g. `git fetch --negotiate-only` never fetches objects.

Speed up cmd_fetch() by bailing out early if we know for certain that
objects will not be fetched. cmd_fetch() can bail out early whenever
objects are not fetched, but for now this only considers
--negotiate-only.

The same optimization does not apply to `git fetch --dry-run` because
that actually fetches objects; the dry run refers to not updating refs.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0a625cb137..7bbff5a029 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2152,6 +2152,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		goto cleanup;
+
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 3/3] fetch --negotiate-only: do not update submodules
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-13  0:44       ` [PATCH v4 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
  2022-01-13  0:45       ` [PATCH v4 2/3] fetch: skip tasks related to fetching objects Glen Choo
@ 2022-01-13  0:45       ` Glen Choo
  2022-01-13  1:16         ` Junio C Hamano
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2022-01-13  0:45 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

`git fetch --negotiate-only` is an implementation detail of push
negotiation and, unlike most `git fetch` invocations, does not actually
update the main repository. Thus it should not update submodules even
if submodule recursion is enabled.

This is not just slow, it is wrong e.g. push negotiation with
"submodule.recurse=true" will cause submodules to be updated because it
invokes `git fetch --negotiate-only`.

Fix this by disabling submodule recursion if --negotiate-only was given.
Since this makes --negotiate-only and --recurse-submodules incompatible,
check for this invalid combination and die.

This does not use the "goto cleanup" introduced in the previous commit
because we want to recurse through submodules whenever a ref is fetched,
and this can happen without introducing new objects.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 24 +++++++++++++++++++++++-
 t/t5516-fetch-push.sh           | 12 ++++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index e967ff1874..f903683189 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -71,6 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
+This is incompatible with `--recurse-submodules=[yes|on-demand]`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbff5a029..8b8bde8809 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -76,6 +76,7 @@ static struct transport *gtransport;
 static struct transport *gsecondary;
 static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
@@ -167,7 +168,7 @@ static struct option builtin_fetch_options[] = {
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
 		 N_("prune local tags no longer on remote and clobber changed tags")),
-	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
 		    N_("control recursive fetching of submodules"),
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
@@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_cli;
+
+	if (negotiate_only) {
+		switch (recurse_submodules_cli) {
+		case RECURSE_SUBMODULES_OFF:
+		case RECURSE_SUBMODULES_DEFAULT: {
+			/*
+			 * --negotiate-only should never recurse into
+			 * submodules. Skip it by setting recurse_submodules to
+			 * RECURSE_SUBMODULES_OFF.
+			 */
+			recurse_submodules = RECURSE_SUBMODULES_OFF;
+			break;
+		}
+		default:
+			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+		}
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 2f04cf9a1c..87881726ed 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 710f33e2aa..1b9023d3f0 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
 	test_cmp err.expect err.actual
 '
 
+test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
+	cat >err.expect <<-\EOF &&
+	fatal: --negotiate-only and --recurse-submodules cannot be used together
+	EOF
+
+	test_must_fail git -c protocol.version=2 -C client fetch \
+		--negotiate-only \
+		--recurse-submodules \
+		origin 2>err.actual &&
+	test_cmp err.expect err.actual
+'
+
 test_expect_success 'file:// --negotiate-only' '
 	SERVER="server" &&
 	URI="file://$(pwd)/server" &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 3/3] fetch --negotiate-only: do not update submodules
  2022-01-13  0:45       ` [PATCH v4 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-13  1:16         ` Junio C Hamano
  0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-01-13  1:16 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 7bbff5a029..8b8bde8809 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -76,6 +76,7 @@ static struct transport *gtransport;
>  static struct transport *gsecondary;
>  static const char *submodule_prefix = "";
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> +static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
>  static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
>  static int shown_url = 0;
>  static struct refspec refmap = REFSPEC_INIT_FETCH;
> @@ -167,7 +168,7 @@ static struct option builtin_fetch_options[] = {
>  		 N_("prune remote-tracking branches no longer on remote")),
>  	OPT_BOOL('P', "prune-tags", &prune_tags,
>  		 N_("prune local tags no longer on remote and clobber changed tags")),
> -	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
> +	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
>  		    N_("control recursive fetching of submodules"),
>  		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
>  	OPT_BOOL(0, "dry-run", &dry_run,
> @@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  
>  	argc = parse_options(argc, argv, prefix,
>  			     builtin_fetch_options, builtin_fetch_usage, 0);
> +
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
> +		recurse_submodules = recurse_submodules_cli;

This made me wonder what should happen if the command line option
was given and explicitly told us to use the default, but after
following the option_fetch_parse_recurse_submodules() codeflow, I
realized that it will never return RECURSE_SUBMODULES_DEFAULT, so it
is OK.  It is a bit misleading that _DEFAULT does not mean "use the
default settings" here---it merely means "this variable was left
untouched".  But I suppose that it is in line with all the other
uses of RECURSE_SUBMODULES_DEFAULT, in which case it is OK for now.

> +	if (negotiate_only) {
> +		switch (recurse_submodules_cli) {
> +		case RECURSE_SUBMODULES_OFF:
> +		case RECURSE_SUBMODULES_DEFAULT: {
> +			/*
> +			 * --negotiate-only should never recurse into
> +			 * submodules. Skip it by setting recurse_submodules to
> +			 * RECURSE_SUBMODULES_OFF.
> +			 */
> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
> +			break;
> +		}

It is not immediately obvious to me why we need an extra block
here.  If there is no reason, let's not have it---there is no reason
to puzzle readers into wondering if anything funny is going on if
there is nothing unusual.

> +		default:
> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +		}
> +	}
> +

Other than that, looking good.

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
                         ` (2 preceding siblings ...)
  2022-01-13  0:45       ` [PATCH v4 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-18 18:54       ` Glen Choo
  2022-01-18 18:54         ` [PATCH v5 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
                           ` (3 more replies)
  3 siblings, 4 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-18 18:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

`git fetch --negotiate-only` is used internally by push negotation and
it behaves very differently from other uses of `git fetch`, e.g. it does
not update refs or fetch objects. But because of how cmd_fetch() is
written, `git fetch --negotiate-only` performs tasks that it shouldn't.
This is results in behavior that is unnecessary at best, and incorrect
at worst:

* Submodules are updated if enabled by recurse.submodules=true, but
  negotiation fetch doesn't actually update the repo, so this doesn't
  make sense.
* Commit graphs will be written if enabled by
  fetch.writeCommitGraph=true, but this is unnecessary because no
  objects are fetched [1]. 
* gc is run, but according to the commit message in [2], we only do this
  because we expect `git fetch` to introduce objects.

Make `git fetch --negotiate-only` handle these tasks more rigorously by
doing the following:

* Make cmd_fetch() skip irrelevant tasks if we know for certain that
objects will not be fetched
* Disable submodule recursion and die() if a user explicitly asks for it

[1] This is also confirmed by Documentation/config/fetch.txt, which
  states that Git should only write commit graphs if a pack-file is
  downloaded.
[2] 131b8fcbfb (fetch: run gc --auto after fetching, 2013-01-26)

Changes since v4:
* drop an unnecessary block (thanks Junio!)

Changes since v3:
* change commit message subject: builtin/fetch -> fetch --negotiate-only
* move the 'goto cleanup' to _after_ the submodule updating task because
  we may want to update submodules even if objects were not fetched (as
  pointed out by Junio, thanks!)
* disable submodule recursion in the patch that checks for
  --negotiate-only + --recurse-submodules, so we never silently ignore
  --recurse-submodules.
* incorporate some of Jonathan's suggestions (thanks!)

Changes since v2:
* added a prepatory patch that introduces a "goto cleanup"
* drop an unnecessary line move (as suggested by Junio)
* check for user-given --recurse-submodules and die() (as suggested by
  Jonathan and Junio)
* update --negotiate-only's documentation

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment

Glen Choo (3):
  fetch: use goto cleanup in cmd_fetch()
  fetch: skip tasks related to fetching objects
  fetch --negotiate-only: do not update submodules

 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 40 ++++++++++++++++++++++++++++++---
 t/t5516-fetch-push.sh           | 12 ++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++
 4 files changed, 62 insertions(+), 3 deletions(-)

Range-diff against v4:
1:  ffa1a24109 = 1:  ffa1a24109 fetch: use goto cleanup in cmd_fetch()
2:  b0c73e8135 = 2:  b0c73e8135 fetch: skip tasks related to fetching objects
3:  914d30866f ! 3:  f929297961 fetch --negotiate-only: do not update submodules
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
     +		recurse_submodules = recurse_submodules_cli;
     +
    -+	if (negotiate_only) {
    ++	if (negotiate_only)
     +		switch (recurse_submodules_cli) {
     +		case RECURSE_SUBMODULES_OFF:
     +		case RECURSE_SUBMODULES_DEFAULT: {
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +		default:
     +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
     +		}
    -+	}
    ++
     +
      	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
      		int *sfjc = submodule_fetch_jobs_config == -1

base-commit: 90d242d36e248acfae0033274b524bfa55a947fd
-- 
2.33.GIT


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v5 1/3] fetch: use goto cleanup in cmd_fetch()
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2022-01-18 18:54         ` Glen Choo
  2022-01-18 18:54         ` [PATCH v5 2/3] fetch: skip tasks related to fetching objects Glen Choo
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-18 18:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared (the string_list_clear() call is purely
cleanup; the string_list is not reused). This makes cleanup consistent
so that a subsequent commit can use 'goto cleanup' to bail out early.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index eaab8056bf..0a625cb137 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2095,7 +2095,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			gtransport->smart_options->acked_commits = &acked_commits;
 		} else {
 			warning(_("protocol does not support --negotiate-only, exiting"));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 		if (server_options.nr)
 			gtransport->server_options = &server_options;
@@ -2151,8 +2152,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
@@ -2170,5 +2169,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (enable_auto_gc)
 		run_auto_maintenance(verbosity < 0);
 
+ cleanup:
+	string_list_clear(&list, 0);
 	return result;
 }
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 2/3] fetch: skip tasks related to fetching objects
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-18 18:54         ` [PATCH v5 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
@ 2022-01-18 18:54         ` Glen Choo
  2022-01-18 18:54         ` [PATCH v5 3/3] fetch --negotiate-only: do not update submodules Glen Choo
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-18 18:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

cmd_fetch() does the following with the assumption that objects are
fetched:

* Run gc
* Write commit graphs (if enabled by fetch.writeCommitGraph=true)

However, neither of these tasks makes sense if objects are not fetched
e.g. `git fetch --negotiate-only` never fetches objects.

Speed up cmd_fetch() by bailing out early if we know for certain that
objects will not be fetched. cmd_fetch() can bail out early whenever
objects are not fetched, but for now this only considers
--negotiate-only.

The same optimization does not apply to `git fetch --dry-run` because
that actually fetches objects; the dry run refers to not updating refs.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0a625cb137..7bbff5a029 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2152,6 +2152,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		goto cleanup;
+
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 3/3] fetch --negotiate-only: do not update submodules
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-18 18:54         ` [PATCH v5 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
  2022-01-18 18:54         ` [PATCH v5 2/3] fetch: skip tasks related to fetching objects Glen Choo
@ 2022-01-18 18:54         ` Glen Choo
  2022-01-18 22:05           ` Junio C Hamano
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2022-01-18 18:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

`git fetch --negotiate-only` is an implementation detail of push
negotiation and, unlike most `git fetch` invocations, does not actually
update the main repository. Thus it should not update submodules even
if submodule recursion is enabled.

This is not just slow, it is wrong e.g. push negotiation with
"submodule.recurse=true" will cause submodules to be updated because it
invokes `git fetch --negotiate-only`.

Fix this by disabling submodule recursion if --negotiate-only was given.
Since this makes --negotiate-only and --recurse-submodules incompatible,
check for this invalid combination and die.

This does not use the "goto cleanup" introduced in the previous commit
because we want to recurse through submodules whenever a ref is fetched,
and this can happen without introducing new objects.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 24 +++++++++++++++++++++++-
 t/t5516-fetch-push.sh           | 12 ++++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index e967ff1874..f903683189 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -71,6 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
+This is incompatible with `--recurse-submodules=[yes|on-demand]`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbff5a029..63f74776a2 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -76,6 +76,7 @@ static struct transport *gtransport;
 static struct transport *gsecondary;
 static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
@@ -167,7 +168,7 @@ static struct option builtin_fetch_options[] = {
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
 		 N_("prune local tags no longer on remote and clobber changed tags")),
-	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
 		    N_("control recursive fetching of submodules"),
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
@@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_cli;
+
+	if (negotiate_only)
+		switch (recurse_submodules_cli) {
+		case RECURSE_SUBMODULES_OFF:
+		case RECURSE_SUBMODULES_DEFAULT: {
+			/*
+			 * --negotiate-only should never recurse into
+			 * submodules. Skip it by setting recurse_submodules to
+			 * RECURSE_SUBMODULES_OFF.
+			 */
+			recurse_submodules = RECURSE_SUBMODULES_OFF;
+			break;
+		}
+		default:
+			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+		}
+
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 2f04cf9a1c..87881726ed 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 710f33e2aa..1b9023d3f0 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
 	test_cmp err.expect err.actual
 '
 
+test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
+	cat >err.expect <<-\EOF &&
+	fatal: --negotiate-only and --recurse-submodules cannot be used together
+	EOF
+
+	test_must_fail git -c protocol.version=2 -C client fetch \
+		--negotiate-only \
+		--recurse-submodules \
+		origin 2>err.actual &&
+	test_cmp err.expect err.actual
+'
+
 test_expect_success 'file:// --negotiate-only' '
 	SERVER="server" &&
 	URI="file://$(pwd)/server" &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/3] fetch --negotiate-only: do not update submodules
  2022-01-18 18:54         ` [PATCH v5 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-18 22:05           ` Junio C Hamano
  2022-01-18 23:41             ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-01-18 22:05 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> +	if (negotiate_only)
> +		switch (recurse_submodules_cli) {
> +		case RECURSE_SUBMODULES_OFF:
> +		case RECURSE_SUBMODULES_DEFAULT: {
> +			/*
> +			 * --negotiate-only should never recurse into
> +			 * submodules. Skip it by setting recurse_submodules to
> +			 * RECURSE_SUBMODULES_OFF.
> +			 */
> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
> +			break;
> +		}
> +		default:
> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +		}

I think that this part has the only difference since the previous
round, but I am puzzled about it.  Everything else looks as good as
the previous round was.

I did not (and I do not) mind the block for the body of this if
statement.  Even though technically a single switch() statement
makes a single statement block that does not need {} around, it is
large enough that extra {} around (which you had in the previous
round) may make it clear where the body begins and ends.

But do we really need the extra block _inside_ the switch statement?
IOW I would have expected to see this instead:

		switch (recurse_submodules_cli) {
		case RECURSE_SUBMODULES_OFF:
		case RECURSE_SUBMODULES_DEFAULT:
			/*
			 * --negotiate-only should never recurse into
			 * submodules. Skip it by setting recurse_submodules to
			 * RECURSE_SUBMODULES_OFF.
			 */
			recurse_submodules = RECURSE_SUBMODULES_OFF;
			break;
		default:
			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
		}

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/3] fetch --negotiate-only: do not update submodules
  2022-01-18 22:05           ` Junio C Hamano
@ 2022-01-18 23:41             ` Glen Choo
  2022-01-19  0:26               ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2022-01-18 23:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> +	if (negotiate_only)
>> +		switch (recurse_submodules_cli) {
>> +		case RECURSE_SUBMODULES_OFF:
>> +		case RECURSE_SUBMODULES_DEFAULT: {
>> +			/*
>> +			 * --negotiate-only should never recurse into
>> +			 * submodules. Skip it by setting recurse_submodules to
>> +			 * RECURSE_SUBMODULES_OFF.
>> +			 */
>> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
>> +			break;
>> +		}
>> +		default:
>> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
>> +		}
>
> I think that this part has the only difference since the previous
> round, but I am puzzled about it.  Everything else looks as good as
> the previous round was.
>
> I did not (and I do not) mind the block for the body of this if
> statement.  Even though technically a single switch() statement
> makes a single statement block that does not need {} around, it is
> large enough that extra {} around (which you had in the previous
> round) may make it clear where the body begins and ends.

Yes, that makes sense. This was why I added it initially.

> But do we really need the extra block _inside_ the switch statement?
> IOW I would have expected to see this instead:
>
> 		switch (recurse_submodules_cli) {
> 		case RECURSE_SUBMODULES_OFF:
> 		case RECURSE_SUBMODULES_DEFAULT:
> 			/*
> 			 * --negotiate-only should never recurse into
> 			 * submodules. Skip it by setting recurse_submodules to
> 			 * RECURSE_SUBMODULES_OFF.
> 			 */
> 			recurse_submodules = RECURSE_SUBMODULES_OFF;
> 			break;
> 		default:
> 			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> 		}
>
> Thanks.

Ah, I misunderstood. I'll blame the fact that I'm recovering from
COVID ;) I'll remove the block from the "case".

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
                           ` (2 preceding siblings ...)
  2022-01-18 18:54         ` [PATCH v5 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-19  0:00         ` Glen Choo
  2022-01-19  0:00           ` [PATCH v6 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
                             ` (3 more replies)
  3 siblings, 4 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-19  0:00 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

Changes since v5:
* revert v4
* drop the unnecessary block that Junio actually meant

Changes since v4:
* drop an unnecessary block (thanks Junio!)

Changes since v3:
* change commit message subject: builtin/fetch -> fetch --negotiate-only
* move the 'goto cleanup' to _after_ the submodule updating task because
  we may want to update submodules even if objects were not fetched (as
  pointed out by Junio, thanks!)
* disable submodule recursion in the patch that checks for
  --negotiate-only + --recurse-submodules, so we never silently ignore
  --recurse-submodules.
* incorporate some of Jonathan's suggestions (thanks!)

Changes since v2:
* added a prepatory patch that introduces a "goto cleanup"
* drop an unnecessary line move (as suggested by Junio)
* check for user-given --recurse-submodules and die() (as suggested by
  Jonathan and Junio)
* update --negotiate-only's documentation

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment

Glen Choo (3):
  fetch: use goto cleanup in cmd_fetch()
  fetch: skip tasks related to fetching objects
  fetch --negotiate-only: do not update submodules

 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 40 ++++++++++++++++++++++++++++++---
 t/t5516-fetch-push.sh           | 12 ++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++
 4 files changed, 62 insertions(+), 3 deletions(-)

Range-diff against v5:
1:  ffa1a24109 = 1:  ffa1a24109 fetch: use goto cleanup in cmd_fetch()
2:  b0c73e8135 = 2:  b0c73e8135 fetch: skip tasks related to fetching objects
3:  f929297961 ! 3:  239101e752 fetch --negotiate-only: do not update submodules
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
     +		recurse_submodules = recurse_submodules_cli;
     +
    -+	if (negotiate_only)
    ++	if (negotiate_only) {
     +		switch (recurse_submodules_cli) {
     +		case RECURSE_SUBMODULES_OFF:
    -+		case RECURSE_SUBMODULES_DEFAULT: {
    ++		case RECURSE_SUBMODULES_DEFAULT:
     +			/*
     +			 * --negotiate-only should never recurse into
     +			 * submodules. Skip it by setting recurse_submodules to
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +			 */
     +			recurse_submodules = RECURSE_SUBMODULES_OFF;
     +			break;
    -+		}
    ++
     +		default:
     +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
     +		}
    -+
    ++	}
     +
      	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
      		int *sfjc = submodule_fetch_jobs_config == -1

base-commit: 90d242d36e248acfae0033274b524bfa55a947fd
-- 
2.33.GIT


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v6 1/3] fetch: use goto cleanup in cmd_fetch()
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2022-01-19  0:00           ` Glen Choo
  2022-01-19  0:00           ` [PATCH v6 2/3] fetch: skip tasks related to fetching objects Glen Choo
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-19  0:00 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared (the string_list_clear() call is purely
cleanup; the string_list is not reused). This makes cleanup consistent
so that a subsequent commit can use 'goto cleanup' to bail out early.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index eaab8056bf..0a625cb137 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2095,7 +2095,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			gtransport->smart_options->acked_commits = &acked_commits;
 		} else {
 			warning(_("protocol does not support --negotiate-only, exiting"));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 		if (server_options.nr)
 			gtransport->server_options = &server_options;
@@ -2151,8 +2152,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
@@ -2170,5 +2169,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (enable_auto_gc)
 		run_auto_maintenance(verbosity < 0);
 
+ cleanup:
+	string_list_clear(&list, 0);
 	return result;
 }
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 2/3] fetch: skip tasks related to fetching objects
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-19  0:00           ` [PATCH v6 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
@ 2022-01-19  0:00           ` Glen Choo
  2022-01-19  0:00           ` [PATCH v6 3/3] fetch --negotiate-only: do not update submodules Glen Choo
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-19  0:00 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

cmd_fetch() does the following with the assumption that objects are
fetched:

* Run gc
* Write commit graphs (if enabled by fetch.writeCommitGraph=true)

However, neither of these tasks makes sense if objects are not fetched
e.g. `git fetch --negotiate-only` never fetches objects.

Speed up cmd_fetch() by bailing out early if we know for certain that
objects will not be fetched. cmd_fetch() can bail out early whenever
objects are not fetched, but for now this only considers
--negotiate-only.

The same optimization does not apply to `git fetch --dry-run` because
that actually fetches objects; the dry run refers to not updating refs.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0a625cb137..7bbff5a029 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2152,6 +2152,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		goto cleanup;
+
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 3/3] fetch --negotiate-only: do not update submodules
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-19  0:00           ` [PATCH v6 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
  2022-01-19  0:00           ` [PATCH v6 2/3] fetch: skip tasks related to fetching objects Glen Choo
@ 2022-01-19  0:00           ` Glen Choo
  2022-01-20  2:38             ` Jiang Xin
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  3 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2022-01-19  0:00 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano

`git fetch --negotiate-only` is an implementation detail of push
negotiation and, unlike most `git fetch` invocations, does not actually
update the main repository. Thus it should not update submodules even
if submodule recursion is enabled.

This is not just slow, it is wrong e.g. push negotiation with
"submodule.recurse=true" will cause submodules to be updated because it
invokes `git fetch --negotiate-only`.

Fix this by disabling submodule recursion if --negotiate-only was given.
Since this makes --negotiate-only and --recurse-submodules incompatible,
check for this invalid combination and die.

This does not use the "goto cleanup" introduced in the previous commit
because we want to recurse through submodules whenever a ref is fetched,
and this can happen without introducing new objects.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 24 +++++++++++++++++++++++-
 t/t5516-fetch-push.sh           | 12 ++++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index e967ff1874..f903683189 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -71,6 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
+This is incompatible with `--recurse-submodules=[yes|on-demand]`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbff5a029..fef4332fb1 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -76,6 +76,7 @@ static struct transport *gtransport;
 static struct transport *gsecondary;
 static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
@@ -167,7 +168,7 @@ static struct option builtin_fetch_options[] = {
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
 		 N_("prune local tags no longer on remote and clobber changed tags")),
-	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
 		    N_("control recursive fetching of submodules"),
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
@@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_cli;
+
+	if (negotiate_only) {
+		switch (recurse_submodules_cli) {
+		case RECURSE_SUBMODULES_OFF:
+		case RECURSE_SUBMODULES_DEFAULT:
+			/*
+			 * --negotiate-only should never recurse into
+			 * submodules. Skip it by setting recurse_submodules to
+			 * RECURSE_SUBMODULES_OFF.
+			 */
+			recurse_submodules = RECURSE_SUBMODULES_OFF;
+			break;
+
+		default:
+			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+		}
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 2f04cf9a1c..87881726ed 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 710f33e2aa..1b9023d3f0 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
 	test_cmp err.expect err.actual
 '
 
+test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
+	cat >err.expect <<-\EOF &&
+	fatal: --negotiate-only and --recurse-submodules cannot be used together
+	EOF
+
+	test_must_fail git -c protocol.version=2 -C client fetch \
+		--negotiate-only \
+		--recurse-submodules \
+		origin 2>err.actual &&
+	test_cmp err.expect err.actual
+'
+
 test_expect_success 'file:// --negotiate-only' '
 	SERVER="server" &&
 	URI="file://$(pwd)/server" &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/3] fetch --negotiate-only: do not update submodules
  2022-01-18 23:41             ` Glen Choo
@ 2022-01-19  0:26               ` Junio C Hamano
  0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-01-19  0:26 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan

Glen Choo <chooglen@google.com> writes:

> Ah, I misunderstood. I'll blame the fact that I'm recovering from
> COVID ;) I'll remove the block from the "case".

Thanks.  Take it easy and be safe.

Will queue the updated one.  It looks good to me otherwise; unless I
hear otherwise from others, let's have it in 'next' soon, and
include it in the first batch after the current cycle.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v6 3/3] fetch --negotiate-only: do not update submodules
  2022-01-19  0:00           ` [PATCH v6 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-20  2:38             ` Jiang Xin
  2022-01-20 17:40               ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Jiang Xin @ 2022-01-20  2:38 UTC (permalink / raw)
  To: Glen Choo, Git List
  Cc: Jiang Xin, Junio C Hamano, Jean-Noël Avila, Jonathan Tan

On Tue, 18 Jan 2022 at 16:00:56, Glen Choo wrote:

> @@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  
>  	argc = parse_options(argc, argv, prefix,
>  			     builtin_fetch_options, builtin_fetch_usage, 0);
> +
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
> +		recurse_submodules = recurse_submodules_cli;
> +
> +	if (negotiate_only) {
> +		switch (recurse_submodules_cli) {
> +		case RECURSE_SUBMODULES_OFF:
> +		case RECURSE_SUBMODULES_DEFAULT:
> +			/*
> +			 * --negotiate-only should never recurse into
> +			 * submodules. Skip it by setting recurse_submodules to
> +			 * RECURSE_SUBMODULES_OFF.
> +			 */
> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
> +			break;
> +
> +		default:
> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));

Found this new l10n message in "po/git.pot" in the next branch[1]. To be
l10n-friendly, it's better to follow the standard Jean-Noël provided in
commit 246cac8505 (i18n: turn even more messages into "cannot be used
together" ones, 2022-01-05). And rewrite the l10n messageas as:

-			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+			die(_("options '%s' and '%s' cannot be used together"),
+			    "--negotiate-only", "--recurse-submodules");

[1]: https://github.com/git-l10n/git-po/blob/pot/next/2022-01-19.diff#L23

--
Jiang Xin


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v6 3/3] fetch --negotiate-only: do not update submodules
  2022-01-20  2:38             ` Jiang Xin
@ 2022-01-20 17:40               ` Glen Choo
  0 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 17:40 UTC (permalink / raw)
  To: Jiang Xin, Git List
  Cc: Jiang Xin, Junio C Hamano, Jean-Noël Avila, Jonathan Tan

Jiang Xin <worldhello.net@gmail.com> writes:

> On Tue, 18 Jan 2022 at 16:00:56, Glen Choo wrote:
>
>> @@ -2014,6 +2015,27 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  
>>  	argc = parse_options(argc, argv, prefix,
>>  			     builtin_fetch_options, builtin_fetch_usage, 0);
>> +
>> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
>> +		recurse_submodules = recurse_submodules_cli;
>> +
>> +	if (negotiate_only) {
>> +		switch (recurse_submodules_cli) {
>> +		case RECURSE_SUBMODULES_OFF:
>> +		case RECURSE_SUBMODULES_DEFAULT:
>> +			/*
>> +			 * --negotiate-only should never recurse into
>> +			 * submodules. Skip it by setting recurse_submodules to
>> +			 * RECURSE_SUBMODULES_OFF.
>> +			 */
>> +			recurse_submodules = RECURSE_SUBMODULES_OFF;
>> +			break;
>> +
>> +		default:
>> +			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
>
> Found this new l10n message in "po/git.pot" in the next branch[1]. To be
> l10n-friendly, it's better to follow the standard Jean-Noël provided in
> commit 246cac8505 (i18n: turn even more messages into "cannot be used
> together" ones, 2022-01-05). And rewrite the l10n messageas as:
>
> -			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +			die(_("options '%s' and '%s' cannot be used together"),
> +			    "--negotiate-only", "--recurse-submodules");
>
> [1]: https://github.com/git-l10n/git-po/blob/pot/next/2022-01-19.diff#L23
>
> --
> Jiang Xin

Thanks for spotting this! I didn't realize we had standardized this.
I'll send a reroll soon.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
                             ` (2 preceding siblings ...)
  2022-01-19  0:00           ` [PATCH v6 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-20 17:49           ` Glen Choo
  2022-01-20 17:49             ` [PATCH v7 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
                               ` (3 more replies)
  3 siblings, 4 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 17:49 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano, Jiang Xin

Changes since v6:
* use standard message format introduced in 246cac8505 (i18n: turn even
  more messages into "cannot be used together" ones, 2022-01-05) (thanks
  Jiang Xin!)

Changes since v5:
* revert v4
* drop the unnecessary block that Junio actually meant

Changes since v4:
* drop an unnecessary block (thanks Junio!)

Changes since v3:
* change commit message subject: builtin/fetch -> fetch --negotiate-only
* move the 'goto cleanup' to _after_ the submodule updating task because
  we may want to update submodules even if objects were not fetched (as
  pointed out by Junio, thanks!)
* disable submodule recursion in the patch that checks for
  --negotiate-only + --recurse-submodules, so we never silently ignore
  --recurse-submodules.
* incorporate some of Jonathan's suggestions (thanks!)

Changes since v2:
* added a prepatory patch that introduces a "goto cleanup"
* drop an unnecessary line move (as suggested by Junio)
* check for user-given --recurse-submodules and die() (as suggested by
  Jonathan and Junio)
* update --negotiate-only's documentation

Changes since v1:
* added more context to commit message
* added a NEEDSWORK comment

Glen Choo (3):
  fetch: use goto cleanup in cmd_fetch()
  fetch: skip tasks related to fetching objects
  fetch --negotiate-only: do not update submodules

 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 41 ++++++++++++++++++++++++++++++---
 t/t5516-fetch-push.sh           | 12 ++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++
 4 files changed, 63 insertions(+), 3 deletions(-)

Range-diff against v6:
1:  ffa1a24109 = 1:  ffa1a24109 fetch: use goto cleanup in cmd_fetch()
2:  b0c73e8135 = 2:  b0c73e8135 fetch: skip tasks related to fetching objects
3:  239101e752 ! 3:  7e0de7232b fetch --negotiate-only: do not update submodules
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
     +			break;
     +
     +		default:
    -+			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
    ++			die(_("options '%s' and '%s' cannot be used together"),
    ++			    "--negotiate-only", "--recurse-submodules");
     +		}
     +	}
     +

base-commit: 90d242d36e248acfae0033274b524bfa55a947fd
-- 
2.33.GIT


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v7 1/3] fetch: use goto cleanup in cmd_fetch()
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
@ 2022-01-20 17:49             ` Glen Choo
  2022-01-20 17:49             ` [PATCH v7 2/3] fetch: skip tasks related to fetching objects Glen Choo
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 17:49 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano, Jiang Xin

Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared (the string_list_clear() call is purely
cleanup; the string_list is not reused). This makes cleanup consistent
so that a subsequent commit can use 'goto cleanup' to bail out early.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index eaab8056bf..0a625cb137 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2095,7 +2095,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			gtransport->smart_options->acked_commits = &acked_commits;
 		} else {
 			warning(_("protocol does not support --negotiate-only, exiting"));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 		if (server_options.nr)
 			gtransport->server_options = &server_options;
@@ -2151,8 +2152,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
-	string_list_clear(&list, 0);
-
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
@@ -2170,5 +2169,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (enable_auto_gc)
 		run_auto_maintenance(verbosity < 0);
 
+ cleanup:
+	string_list_clear(&list, 0);
 	return result;
 }
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v7 2/3] fetch: skip tasks related to fetching objects
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-20 17:49             ` [PATCH v7 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
@ 2022-01-20 17:49             ` Glen Choo
  2022-01-20 17:49             ` [PATCH v7 3/3] fetch --negotiate-only: do not update submodules Glen Choo
  2022-01-20 21:58             ` Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Junio C Hamano
  3 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 17:49 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano, Jiang Xin

cmd_fetch() does the following with the assumption that objects are
fetched:

* Run gc
* Write commit graphs (if enabled by fetch.writeCommitGraph=true)

However, neither of these tasks makes sense if objects are not fetched
e.g. `git fetch --negotiate-only` never fetches objects.

Speed up cmd_fetch() by bailing out early if we know for certain that
objects will not be fetched. cmd_fetch() can bail out early whenever
objects are not fetched, but for now this only considers
--negotiate-only.

The same optimization does not apply to `git fetch --dry-run` because
that actually fetches objects; the dry run refers to not updating refs.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fetch.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0a625cb137..7bbff5a029 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2152,6 +2152,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		strvec_clear(&options);
 	}
 
+	/*
+	 * Skip irrelevant tasks because we know objects were not
+	 * fetched.
+	 *
+	 * NEEDSWORK: as a future optimization, we can return early
+	 * whenever objects were not fetched e.g. if we already have all
+	 * of them.
+	 */
+	if (negotiate_only)
+		goto cleanup;
+
 	prepare_repo_settings(the_repository);
 	if (fetch_write_commit_graph > 0 ||
 	    (fetch_write_commit_graph < 0 &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v7 3/3] fetch --negotiate-only: do not update submodules
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
  2022-01-20 17:49             ` [PATCH v7 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
  2022-01-20 17:49             ` [PATCH v7 2/3] fetch: skip tasks related to fetching objects Glen Choo
@ 2022-01-20 17:49             ` Glen Choo
  2022-01-20 23:08               ` Junio C Hamano
  2022-01-20 21:58             ` Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Junio C Hamano
  3 siblings, 1 reply; 50+ messages in thread
From: Glen Choo @ 2022-01-20 17:49 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Jonathan Tan, Junio C Hamano, Jiang Xin

`git fetch --negotiate-only` is an implementation detail of push
negotiation and, unlike most `git fetch` invocations, does not actually
update the main repository. Thus it should not update submodules even
if submodule recursion is enabled.

This is not just slow, it is wrong e.g. push negotiation with
"submodule.recurse=true" will cause submodules to be updated because it
invokes `git fetch --negotiate-only`.

Fix this by disabling submodule recursion if --negotiate-only was given.
Since this makes --negotiate-only and --recurse-submodules incompatible,
check for this invalid combination and die.

This does not use the "goto cleanup" introduced in the previous commit
because we want to recurse through submodules whenever a ref is fetched,
and this can happen without introducing new objects.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/fetch-options.txt |  1 +
 builtin/fetch.c                 | 25 ++++++++++++++++++++++++-
 t/t5516-fetch-push.sh           | 12 ++++++++++++
 t/t5702-protocol-v2.sh          | 12 ++++++++++++
 4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index e967ff1874..f903683189 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -71,6 +71,7 @@ configuration variables documented in linkgit:git-config[1], and the
 	ancestors of the provided `--negotiation-tip=*` arguments,
 	which we have in common with the server.
 +
+This is incompatible with `--recurse-submodules=[yes|on-demand]`.
 Internally this is used to implement the `push.negotiate` option, see
 linkgit:git-config[1].
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbff5a029..efd1c9bb41 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -76,6 +76,7 @@ static struct transport *gtransport;
 static struct transport *gsecondary;
 static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
@@ -167,7 +168,7 @@ static struct option builtin_fetch_options[] = {
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
 		 N_("prune local tags no longer on remote and clobber changed tags")),
-	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules, N_("on-demand"),
+	OPT_CALLBACK_F(0, "recurse-submodules", &recurse_submodules_cli, N_("on-demand"),
 		    N_("control recursive fetching of submodules"),
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
@@ -2014,6 +2015,28 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_cli;
+
+	if (negotiate_only) {
+		switch (recurse_submodules_cli) {
+		case RECURSE_SUBMODULES_OFF:
+		case RECURSE_SUBMODULES_DEFAULT:
+			/*
+			 * --negotiate-only should never recurse into
+			 * submodules. Skip it by setting recurse_submodules to
+			 * RECURSE_SUBMODULES_OFF.
+			 */
+			recurse_submodules = RECURSE_SUBMODULES_OFF;
+			break;
+
+		default:
+			die(_("options '%s' and '%s' cannot be used together"),
+			    "--negotiate-only", "--recurse-submodules");
+		}
+	}
+
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
 		int *sfjc = submodule_fetch_jobs_config == -1
 			    ? &submodule_fetch_jobs_config : NULL;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 2f04cf9a1c..87881726ed 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -229,6 +229,18 @@ test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	test_i18ngrep "push negotiation failed" err
 '
 
+test_expect_success 'push with negotiation does not attempt to fetch submodules' '
+	mk_empty submodule_upstream &&
+	test_commit -C submodule_upstream submodule_commit &&
+	git submodule add ./submodule_upstream submodule &&
+	mk_empty testrepo &&
+	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
+	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
+	git -c submodule.recurse=true -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
+	! grep "Fetching submodule" err
+'
+
 test_expect_success 'push without wildcard' '
 	mk_empty testrepo &&
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 710f33e2aa..1b9023d3f0 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
 	test_cmp err.expect err.actual
 '
 
+test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
+	cat >err.expect <<-\EOF &&
+	fatal: --negotiate-only and --recurse-submodules cannot be used together
+	EOF
+
+	test_must_fail git -c protocol.version=2 -C client fetch \
+		--negotiate-only \
+		--recurse-submodules \
+		origin 2>err.actual &&
+	test_cmp err.expect err.actual
+'
+
 test_expect_success 'file:// --negotiate-only' '
 	SERVER="server" &&
 	URI="file://$(pwd)/server" &&
-- 
2.33.GIT


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
                               ` (2 preceding siblings ...)
  2022-01-20 17:49             ` [PATCH v7 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-20 21:58             ` Junio C Hamano
  2022-01-20 23:15               ` Glen Choo
  2022-01-21  2:17               ` Jiang Xin
  3 siblings, 2 replies; 50+ messages in thread
From: Junio C Hamano @ 2022-01-20 21:58 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan, Jiang Xin

Glen Choo <chooglen@google.com> writes:

> Changes since v6:
> * use standard message format introduced in 246cac8505 (i18n: turn even
>   more messages into "cannot be used together" ones, 2022-01-05) (thanks
>   Jiang Xin!)

As v6 is already in 'next' since yesterday, let's make it an
incremental update.  It would give us a place to spell out why we
consider this change is desirable.

This is a tangent, but I recall there was a talk about "reviewer
checklist".  Things like:

 - check if we can reuse existing and identical message to reduce
   load on translators

 - when we are creating a file in a subdirectory of $GIT_DIR, be
   prepared to see any directories other than $GIT_DIR itself
   missing and create them as necessary

 - use safe_create_leading_directories() and adjust_shared_perm() on
   things under $GIT_DIR but not in the working tree


may belong there.

I am not sure if it is feasible to create and maintain such a list,
though.

----- >8 --------- >8 --------- >8 --------- >8 -----
Subject: [PATCH] fetch: help translators by reusing the same message template

Follow the example set by 12909b6b (i18n: turn "options are
incompatible" into "cannot be used together", 2022-01-05) and use
the same message string to reduce the need for translation.

Reported-by: Jiang Xin <worldhello.net@gmail.com>
Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git c/builtin/fetch.c w/builtin/fetch.c
index dc6e637428..5c329f9835 100644
--- c/builtin/fetch.c
+++ w/builtin/fetch.c
@@ -2014,7 +2014,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			break;
 
 		default:
-			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
+			die(_("options '%s' and '%s' cannot be used together"),
+			    "--negotiate-only", "--recurse-submodules");
 		}
 	}
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v7 3/3] fetch --negotiate-only: do not update submodules
  2022-01-20 17:49             ` [PATCH v7 3/3] fetch --negotiate-only: do not update submodules Glen Choo
@ 2022-01-20 23:08               ` Junio C Hamano
  2022-01-20 23:16                 ` Glen Choo
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2022-01-20 23:08 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Jonathan Tan, Jiang Xin

Glen Choo <chooglen@google.com> writes:

> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index 710f33e2aa..1b9023d3f0 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
>  	test_cmp err.expect err.actual
>  '
>  
> +test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
> +	cat >err.expect <<-\EOF &&
> +	fatal: --negotiate-only and --recurse-submodules cannot be used together

This also needs to be updated to match.  I've fixed up locally.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-20 21:58             ` Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Junio C Hamano
@ 2022-01-20 23:15               ` Glen Choo
  2022-01-21  2:17               ` Jiang Xin
  1 sibling, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 23:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, Jiang Xin

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> Changes since v6:
>> * use standard message format introduced in 246cac8505 (i18n: turn even
>>   more messages into "cannot be used together" ones, 2022-01-05) (thanks
>>   Jiang Xin!)
>
> As v6 is already in 'next' since yesterday, let's make it an
> incremental update.  It would give us a place to spell out why we
> consider this change is desirable.

Ah, yes. I will take note of this for the future.

>
> This is a tangent, but I recall there was a talk about "reviewer
> checklist".  Things like:
>
>  - check if we can reuse existing and identical message to reduce
>    load on translators
>
>  - when we are creating a file in a subdirectory of $GIT_DIR, be
>    prepared to see any directories other than $GIT_DIR itself
>    missing and create them as necessary
>
>  - use safe_create_leading_directories() and adjust_shared_perm() on
>    things under $GIT_DIR but not in the working tree
>
>
> may belong there.
>
> I am not sure if it is feasible to create and maintain such a list,
> though.

This sounds like a combination of low-hanging fruit things to check when
submitting/reviewing. I think that even a minimal list is preferable to
the toil of spotting and fixing the same mistakes over and over.

A ReviewingPatches doc has been discussed internally for a while, but I
don't recall if this checklist was part of it.

> ----- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: [PATCH] fetch: help translators by reusing the same message template
>
> Follow the example set by 12909b6b (i18n: turn "options are
> incompatible" into "cannot be used together", 2022-01-05) and use
> the same message string to reduce the need for translation.
>
> Reported-by: Jiang Xin <worldhello.net@gmail.com>
> Helped-by: Glen Choo <chooglen@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/fetch.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git c/builtin/fetch.c w/builtin/fetch.c
> index dc6e637428..5c329f9835 100644
> --- c/builtin/fetch.c
> +++ w/builtin/fetch.c
> @@ -2014,7 +2014,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  			break;
>  
>  		default:
> -			die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +			die(_("options '%s' and '%s' cannot be used together"),
> +			    "--negotiate-only", "--recurse-submodules");
>  		}
>  	}
>  

The diff looks good. Thanks!

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v7 3/3] fetch --negotiate-only: do not update submodules
  2022-01-20 23:08               ` Junio C Hamano
@ 2022-01-20 23:16                 ` Glen Choo
  0 siblings, 0 replies; 50+ messages in thread
From: Glen Choo @ 2022-01-20 23:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, Jiang Xin

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
>> index 710f33e2aa..1b9023d3f0 100755
>> --- a/t/t5702-protocol-v2.sh
>> +++ b/t/t5702-protocol-v2.sh
>> @@ -628,6 +628,18 @@ test_expect_success 'usage: --negotiate-only without --negotiation-tip' '
>>  	test_cmp err.expect err.actual
>>  '
>>  
>> +test_expect_success 'usage: --negotiate-only with --recurse-submodules' '
>> +	cat >err.expect <<-\EOF &&
>> +	fatal: --negotiate-only and --recurse-submodules cannot be used together
>
> This also needs to be updated to match.  I've fixed up locally.

Ah, my mistake. Thanks for the catch :)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only
  2022-01-20 21:58             ` Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Junio C Hamano
  2022-01-20 23:15               ` Glen Choo
@ 2022-01-21  2:17               ` Jiang Xin
  1 sibling, 0 replies; 50+ messages in thread
From: Jiang Xin @ 2022-01-21  2:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Glen Choo, Git List, Jonathan Tan, Alexander Shopov, Jordi Mas,
	Matthias Rüster, Jimmy Angelakos, Christopher Díaz,
	Jean-Noël Avila, Bagas Sanjaya, Alessandro Menti,
	Gwan-gyeong Mun, Arusekk, Daniel Santos, Dimitriy Ryazantcev,
	Peter Krefting, Emir SARI, Trần Ngọc Quân,
	Fangyi Zhou, Yi-Jyun Pan

On Fri, Jan 21, 2022 at 5:58 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Glen Choo <chooglen@google.com> writes:
>
> > Changes since v6:
> > * use standard message format introduced in 246cac8505 (i18n: turn even
> >   more messages into "cannot be used together" ones, 2022-01-05) (thanks
> >   Jiang Xin!)
>
> As v6 is already in 'next' since yesterday, let's make it an
> incremental update.  It would give us a place to spell out why we
> consider this change is desirable.
>
> This is a tangent, but I recall there was a talk about "reviewer
> checklist".  Things like:
>
>  - check if we can reuse existing and identical message to reduce
>    load on translators

How to spot typos in l10n messages in time, instead of waiting until
the very end of the release cycle, has always puzzled me for a long
time. Until two months ago, by introducing a new GitHub Actions[1] in
the "git-l10n/git-po" repository[2], it was possible to generate
temporary "po/git.pot" file based on new changes in "git.git" in time,
and create incremental diffs of "po/git.pot" as separate files in the
"pot/master", "pot/next", and "pot/seen" branches in the "git-po"
repository. By checking the diffs daily, several i18n issues were
found in recent months.

We can make some improvements to the workflow:

1. Send each incremental diff of "po/git.pot" to active l10n team
leaders, and also to this mailing list. Git contributors can find
"side effects" introduced by their patches.

2. Just like using the "git-po-helper" program[3] to check "msgstr" in
l10n translations, develop a new helper to check "msgid". E.g.: check
if we can reuse existing and identical message to reduce load on
translators.

This workflow only works when patches go into the branch "seen" of
git.git repository. Lagging behind code reviews.

[1]: https://github.com/git-l10n/git-po/blob/pot/CI/.github/workflows/git-pot.yml
[2]: https://github.com/git-l10n/git-po
[3]: https://github.com/git-l10n/git-po-helper

>  - when we are creating a file in a subdirectory of $GIT_DIR, be
>    prepared to see any directories other than $GIT_DIR itself
>    missing and create them as necessary
>
>  - use safe_create_leading_directories() and adjust_shared_perm() on
>    things under $GIT_DIR but not in the working tree
>
>
> may belong there.
>
> I am not sure if it is feasible to create and maintain such a list,
> though.
>
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: [PATCH] fetch: help translators by reusing the same message template
>
> Follow the example set by 12909b6b (i18n: turn "options are
> incompatible" into "cannot be used together", 2022-01-05) and use
> the same message string to reduce the need for translation.
>
> Reported-by: Jiang Xin <worldhello.net@gmail.com>
> Helped-by: Glen Choo <chooglen@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/fetch.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git c/builtin/fetch.c w/builtin/fetch.c
> index dc6e637428..5c329f9835 100644
> --- c/builtin/fetch.c
> +++ w/builtin/fetch.c
> @@ -2014,7 +2014,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>                         break;
>
>                 default:
> -                       die(_("--negotiate-only and --recurse-submodules cannot be used together"));
> +                       die(_("options '%s' and '%s' cannot be used together"),
> +                           "--negotiate-only", "--recurse-submodules");
>                 }
>         }
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-01-21  2:18 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-07 19:29 [PATCH] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2021-12-09 22:12 ` Jonathan Tan
2021-12-09 22:36   ` Glen Choo
2021-12-13 22:58     ` Jonathan Tan
2021-12-16 18:11       ` Glen Choo
2021-12-17  0:02 ` [PATCH v2] " Glen Choo
2021-12-17 23:35   ` Junio C Hamano
2021-12-20 19:37     ` Glen Choo
2021-12-20 19:56       ` Junio C Hamano
2021-12-20 20:54         ` Glen Choo
2021-12-20 22:12           ` Junio C Hamano
2021-12-21  0:18             ` Glen Choo
2021-12-21 23:07       ` Glen Choo
2021-12-22  0:11   ` [PATCH v3 0/3] " Glen Choo
2021-12-22  0:11     ` [PATCH v3 1/3] builtin/fetch: use goto cleanup in cmd_fetch() Glen Choo
2021-12-22  0:11     ` [PATCH v3 2/3] builtin/fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2021-12-22  6:42       ` Junio C Hamano
2021-12-22 17:28         ` Glen Choo
2021-12-22 19:29           ` Junio C Hamano
2021-12-22 20:27             ` Glen Choo
2021-12-22  0:11     ` [PATCH v3 3/3] builtin/fetch: die on --negotiate-only and --recurse-submodules Glen Choo
2021-12-22  6:46       ` Junio C Hamano
2021-12-23 19:08       ` Jonathan Tan
2022-01-13  0:44     ` [PATCH v4 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2022-01-13  0:44       ` [PATCH v4 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
2022-01-13  0:45       ` [PATCH v4 2/3] fetch: skip tasks related to fetching objects Glen Choo
2022-01-13  0:45       ` [PATCH v4 3/3] fetch --negotiate-only: do not update submodules Glen Choo
2022-01-13  1:16         ` Junio C Hamano
2022-01-18 18:54       ` [PATCH v5 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2022-01-18 18:54         ` [PATCH v5 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
2022-01-18 18:54         ` [PATCH v5 2/3] fetch: skip tasks related to fetching objects Glen Choo
2022-01-18 18:54         ` [PATCH v5 3/3] fetch --negotiate-only: do not update submodules Glen Choo
2022-01-18 22:05           ` Junio C Hamano
2022-01-18 23:41             ` Glen Choo
2022-01-19  0:26               ` Junio C Hamano
2022-01-19  0:00         ` [PATCH v6 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2022-01-19  0:00           ` [PATCH v6 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
2022-01-19  0:00           ` [PATCH v6 2/3] fetch: skip tasks related to fetching objects Glen Choo
2022-01-19  0:00           ` [PATCH v6 3/3] fetch --negotiate-only: do not update submodules Glen Choo
2022-01-20  2:38             ` Jiang Xin
2022-01-20 17:40               ` Glen Choo
2022-01-20 17:49           ` [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Glen Choo
2022-01-20 17:49             ` [PATCH v7 1/3] fetch: use goto cleanup in cmd_fetch() Glen Choo
2022-01-20 17:49             ` [PATCH v7 2/3] fetch: skip tasks related to fetching objects Glen Choo
2022-01-20 17:49             ` [PATCH v7 3/3] fetch --negotiate-only: do not update submodules Glen Choo
2022-01-20 23:08               ` Junio C Hamano
2022-01-20 23:16                 ` Glen Choo
2022-01-20 21:58             ` Re* [PATCH v7 0/3] fetch: skip unnecessary tasks when using --negotiate-only Junio C Hamano
2022-01-20 23:15               ` Glen Choo
2022-01-21  2:17               ` Jiang Xin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).