From: Glen Choo <chooglen@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH v2] builtin/fetch: skip unnecessary tasks when using --negotiate-only
Date: Mon, 20 Dec 2021 11:37:07 -0800
Message-ID: <kl6ltuf3ysnw.fsf@chooglen-macbookpro.roam.corp.google.com>
In-Reply-To: <xmqqilvm24bb.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

>> * gc is run, but according to [3], we only do this because we expect
>>   `git fetch` to introduce objects.
>
> Makes sense.  As we haven't added any new objects, there is nothing
> (other than the passage of time) that adds to the need to collect
> garbage.
>
> It makes me wonder if we need to do anything upon "fetch --dry-run".
> I know we add to the object store without making anything reachable,
> so that the user can do pre-flight checks with the real objects.  We
> do not change the reachability so there is no reason to rewrite the
> graph file, but we do add cruft to the object store.
>
> Doing something about "--dry-run" is obviously outside the scope of
> this topic, but it may make sense to think about it while we are
> thinking about "fetch".

I hadn't considered "--dry-run". I'll think about that while I structure
this patch.

>> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> index f7abbc31ff..85091af99b 100644
>> --- a/builtin/fetch.c
>> +++ b/builtin/fetch.c
>> @@ -1996,6 +1996,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  
>>  	argc = parse_options(argc, argv, prefix,
>>  			     builtin_fetch_options, builtin_fetch_usage, 0);
>> +
>> +	if (negotiate_only) {
>> +		/*
>> +		 * --negotiate-only should never recurse into
>> +		 * submodules, so there is no need to read .gitmodules.
>> +		 */
>> +		recurse_submodules = RECURSE_SUBMODULES_OFF;
>> +		if (!negotiation_tip.nr)
>> +			die(_("--negotiate-only needs one or more --negotiate-tip=*"));
>> +	}
>> +
>
> This means "fetch --negotiate-only --recurse-submodules" silently
> ignores an explicit wish by the user.
>
> I suspect that this part should be more like this.
>
> 	if (negotiate_only &&
> 	    recurse_submodules != RECURSE_SUBMODULES_OFF) {
> 		if (recurse_submodules came from the parse_options)
> 			die(_("'--%s' cannot be used with '--%s'"),
> 			      "recurse-submodules", "negotiate-only");
> 		recurse_submodules = RECURSE_SUBMODULES_OFF;
> 	}
>
> That is, we complain if user gives us a combination we do not
> support, but we are OK if the configuration is set to do so and
> silently ignore (because we declare that the combination does not
> make sense).

Yes, Jonathan proposed this as well. This is identical to the approach
in gc/branch-recurse-submodules, which as I noted in [1], is a bit
inconsistent with submodule parsing in general.

I decided *against* this because I thought that "--negotiate-only" is
internal-only. I don't see why a user would use "--negotiate-only", but
if this is a user journey we want to care about, then adding the
explicit check sounds ok.

> By the way, do not move the check about the number of negotiation
> tips from the original location.  That check, or its location, has
> nothing to do with what you want to do in this patch, which is "do
> not gc or update the graph file if we are not fetching".  It is
> better to leave unrelated changes out of the patch.

Ah, I see that it's not easy to tell whether or not the behavior is
correct after that line is moved. I'll avoid doing this in the future.

I still think that it is cleaner to move the negotiation_tip.nr check.
Should I do this in a follow-up patch?

> In order to tell if recurse_submodules that is not OFF came from the
> call to parse_options(), you may need to capture the value of the
> variable before calling parse_options() and compare it with the
> current value in the above illustration code snippet I gave.
>
> Having said all that, is it true that recurse-submodules should not
> be combined with negotiate-only?  I naively think it would not be
> surprising if users expect negotiate-only fetches are done also in
> the submodules.

In the current form of "--negotiate-only", no it does not make sense for
them to be combined. I think users would have the same expectations as
you if they were invoking "git fetch --negotiate-only" directly, but I'm
not convinced that they will ever do so, except to debug push
negotiation.

I hope Jonathan can chime in to confirm whether or not users want/need
to invoke "--negotiate-only".
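
For reference while we decide, the capture-and-compare idea from the
quoted paragraph could look roughly like the standalone sketch below.
Everything in it is a hypothetical stand-in, not code from
builtin/fetch.c: `enum recurse_mode` replaces the RECURSE_SUBMODULES_*
constants, `parse_args()` is a toy replacement for parse_options(), and
`resolve_recurse()` is a made-up helper name. It only shows the shape
of the check.

```c
#include <string.h>

/* Hypothetical stand-ins for git's RECURSE_SUBMODULES_* values. */
enum recurse_mode { RECURSE_OFF = 0, RECURSE_ON = 1 };

enum check_result { CHECK_OK, CHECK_CONFLICT };

/*
 * Toy replacement for parse_options(): it only understands the two
 * flags discussed in this thread.
 */
static void parse_args(int argc, const char **argv,
		       int *negotiate_only, enum recurse_mode *recurse)
{
	for (int i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--negotiate-only"))
			*negotiate_only = 1;
		else if (!strcmp(argv[i], "--recurse-submodules"))
			*recurse = RECURSE_ON;
	}
}

/*
 * Returns CHECK_CONFLICT when the command line itself turned recursion
 * on together with --negotiate-only. A value that only came from
 * configuration (from_config) is silently forced off instead.
 */
enum check_result resolve_recurse(int argc, const char **argv,
				  enum recurse_mode from_config,
				  enum recurse_mode *out)
{
	int negotiate_only = 0;
	enum recurse_mode recurse = from_config;
	/* Capture the value before the command line is parsed. */
	enum recurse_mode before = recurse;

	parse_args(argc, argv, &negotiate_only, &recurse);

	if (negotiate_only && recurse != RECURSE_OFF) {
		/* The value changed, so it came from the command line. */
		if (recurse != before)
			return CHECK_CONFLICT;
		recurse = RECURSE_OFF;
	}

	*out = recurse;
	return CHECK_OK;
}
```

One limitation of this sketch: if the configuration already enables
recursion and the user also passes "--recurse-submodules", the value
does not change across parsing, so the explicit request goes
undetected. A real implementation would probably need a separate "was
given on the command line" signal for that case.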

> Whatever we decide the right behaviour should be, we should document
> it. With your patch without any of my above input, I would expect at
> least something like
>
>     diff --git i/Documentation/fetch-options.txt w/Documentation/fetch-options.txt
>     index e967ff1874..baf2e9c50d 100644
>     --- i/Documentation/fetch-options.txt
>     +++ w/Documentation/fetch-options.txt
>     @@ -73,6 +73,9 @@ configuration variables documented in linkgit:git-config[1], and the
>      +
>      Internally this is used to implement the `push.negotiate` option, see
>      linkgit:git-config[1].
>     ++
>     +Note that this option silently makes various options that do not make
>     +sense to be used together with it (e.g. `--recurse-submodules`) ignored.
>      
>      --dry-run::
>             Show what would be done, without making any changes.
>
> to leave wiggling room for us to silently ignore more.  We may know
> about --recurse-submodules today, but I would not be surprised if we
> find more.

Sounds good.

>> @@ -2112,6 +2120,19 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  		result = fetch_multiple(&list, max_children);
>>  	}
>>  
>> +	string_list_clear(&list, 0);
>> +
>> +	/*
>> +	 * Skip irrelevant tasks because we know objects were not
>> +	 * fetched.
>> +	 *
>> +	 * NEEDSWORK: as a future optimization, we can return early
>> +	 * whenever objects were not fetched e.g. if we already have all
>> +	 * of them.
>> +	 */
>> +	if (negotiate_only)
>> +		return result;
>> +
>
> I find it somewhat misleading to have the early return before the
> block for recurse_submodules, as we _are_ already forcing it not
> to recurse.  It would be more readable if it went before the place
> where we start doing the post-action clean-ups like reachability
> graphs and garbage collection.

Your feedback is valid, but I think it is true because negotiate_only
and recurse_submodules just happen to have a special interaction. As
indicated by the comments, this early return can apply to any situation
where objects were not fetched at all, but we only consider
negotiate_only right now.

This early return should also apply to submodules because if no new
objects were found in the superproject, all of the superproject commits
are referencing known submodule commits. If the submodule commits are
not known, they should be updated with "git submodule update", not "git
fetch".

>>  	if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
>>  		struct strvec options = STRVEC_INIT;
>>  		int max_children = max_jobs;
>> @@ -2132,8 +2153,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>>  		strvec_clear(&options);
>>  	}
>>  
>> -	string_list_clear(&list, 0);
>> -
>
> Namely, here.

Hm, I realize that I could have used a goto instead of moving
string_list_clear().
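
To make the goto alternative concrete, here is a minimal standalone
sketch of the pattern, assuming a single cleanup label at the end of
the function. The names are hypothetical: `struct remote_list` and
`remote_list_clear()` stand in for string_list and
string_list_clear(), and `fetch_sketch()` is not cmd_fetch(). The two
out-parameters exist only so the behavior can be observed.

```c
#include <stdlib.h>

/* Hypothetical stand-in for git's struct string_list. */
struct remote_list {
	char **items;
	size_t nr;
};

static void remote_list_clear(struct remote_list *list)
{
	free(list->items);
	list->items = NULL;
	list->nr = 0;
}

/*
 * Sketch of a fetch-like function with one cleanup label. The list is
 * released in exactly one place, so the early exit does not require
 * moving the clear call around.
 */
int fetch_sketch(int negotiate_only, int *ran_post_tasks, int *list_freed)
{
	struct remote_list list = { NULL, 0 };
	int result = 0;

	*ran_post_tasks = 0;
	*list_freed = 0;

	list.items = calloc(4, sizeof(*list.items));
	if (!list.items) {
		result = 1;
		goto cleanup;
	}
	list.nr = 4;

	/* ... the negotiation or the real fetch would happen here ... */

	/* No objects were fetched, so skip the post-fetch tasks. */
	if (negotiate_only)
		goto cleanup;

	/* Placeholder for submodule recursion, graph update and gc. */
	*ran_post_tasks = 1;

cleanup:
	remote_list_clear(&list);
	*list_freed = (list.items == NULL);
	return result;
}
```

With this shape the early exit stays next to the comment that explains
it, and the diff would not need to touch the existing
string_list_clear() line.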

>
>>  	prepare_repo_settings(the_repository);
>
> This is existing code, but I wonder why it can be done _SO_ late in
> the sequence.  We've already called the transport API for the
> negotiate-only communication at this point, but a call to this
> function is the only thing that gives fetch_negotiation_algorithm
> member in the_repository its default value, isn't it?

That's right, this looks like it could be a bug. Maybe Jonathan knows
more.

[1] https://lore.kernel.org/git/kl6lwnk4to5x.fsf@chooglen-macbookpro.roam.corp.google.com
