From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Benedek Kozma <cyberbeni@gmail.com>, Glen Choo <chooglen@google.com>
Subject: [PATCH v2] fetch: do not run a redundant fetch from submodule
Date: Mon, 16 May 2022 16:53:40 -0700 [thread overview]
Message-ID: <xmqqk0alyqyj.fsf_-_@gitster.g> (raw)
In-Reply-To: <xmqqczgd16wx.fsf_-_@gitster.g> (Junio C. Hamano's message of "Mon, 16 May 2022 14:53:02 -0700")
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option,
2010-11-12) introduced the "--recurse-submodule" option, the
approach taken was to perform fetches in submodules only once, after
all the main fetching (it may usually be a fetch from a single
remote, but it could be fetching from a group of remotes using
fetch_multiple()) succeeded. Later we added "--all" to fetch from
all defined remotes, which complicated things even more.
If your project has a submodule, and you try to run "git fetch
--recurse-submodule --all", you'd see a fetch for the top-level,
which invokes another fetch for the submodule, followed by another
fetch for the same submodule. All but the last fetch for the
submodule come from a "git fetch --recurse-submodules" subprocess
that is spawned via the fetch_multiple() interface for the remotes,
and the last fetch comes from the code at the end.
Because recursive fetching from submodules is done in each fetch for
the top-level in fetch_multiple(), the last fetch in the submodule
is redundant. It only matters when fetch_one() interacts with a
single remote at the top-level.
While we are at it, there is one optimization that exists in dealing
with a group of remote, but is missing when "--all" is used. In the
former, when the group turns out to be a group of one, instead of
spawning "git fetch" as a subprocess via the fetch_multiple()
interface, we use the normal fetch_one() code path. Do the same
when handing "--all", if it turns out that we have only one remote
defined.
Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
So here is a second attempt. It demonstrates a bit interesting
funny in range-diff where similar changes from the previous round
gets applied to a different target.
t5617 is much cleanly organized than t5526, and we may want to clean
up the latter after dust settles.
1: 006fe43da1 ! 1: da0a4e341b fetch: do not run a redundant fetch from submodule
@@ Commit message
when handing "--all", if it turns out that we have only one remote
defined.
+ Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
## builtin/fetch.c ##
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, &list);
+
-+ /* no point doing fetch_multiple() of one */
++ /* do not do fetch_multiple() of one */
+ if (list.nr == 1)
+ remote = remote_get(list.items[0].string);
} else if (argc == 0) {
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
}
- if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
++
++ /*
++ * This is only needed after fetch_one(), which does not fetch
++ * submodules by itself.
++ *
++ * When we fetch from multiple remotes, fetch_multiple() has
++ * already updated submodules to grab commits necessary for
++ * the fetched history from each remote, so there is no need
++ * to fetch submodules from here.
++ */
+ if (!result && remote && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct strvec options = STRVEC_INIT;
int max_children = max_jobs;
- ## t/t5617-clone-submodules-remote.sh ##
-@@ t/t5617-clone-submodules-remote.sh: test_expect_success 'clone with --single-branch' '
+ ## t/t5526-fetch-submodules.sh ##
+@@ t/t5526-fetch-submodules.sh: test_expect_success 'fetch --recurse-submodules updates name-conflicted, unpopul
)
'
+test_expect_success 'fetch --all with --recurse-submodules' '
-+ test_when_finished "rm -fr super_clone" &&
-+ git clone --recurse-submodules srv.bare super_clone &&
++ test_when_finished "rm -fr src_clone" &&
++ git clone --recurse-submodules src src_clone &&
+ (
-+ cd super_clone &&
++ cd src_clone &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
-+ grep "Fetching sub" fetch-log >fetch-subs &&
++ grep "^Fetching submodule sub$" fetch-log >fetch-subs &&
+ test_line_count = 1 fetch-subs
+'
+
- # do basic partial clone from "srv.bare"
- # confirm partial clone was registered in the local config for super and sub.
- test_expect_success 'clone with --filter' '
++test_expect_success 'fetch --all with --recurse-submodules with multiple' '
++ test_when_finished "rm -fr src_clone" &&
++ git clone --recurse-submodules src src_clone &&
++ (
++ cd src_clone &&
++ git remote add secondary ../src &&
++ git config submodule.recurse true &&
++ git config fetch.parallel 0 &&
++ git fetch --all 2>../fetch-log
++ ) &&
++ grep "Fetching submodule sub" fetch-log >fetch-subs &&
++ test_line_count = 2 fetch-subs
++'
++
+ test_done
builtin/fetch.c | 16 +++++++++++++++-
t/t5526-fetch-submodules.sh | 27 +++++++++++++++++++++++++++
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index e3791f09ed..8b15c40bb2 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2187,6 +2187,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, &list);
+
+ /* do not do fetch_multiple() of one */
+ if (list.nr == 1)
+ remote = remote_get(list.items[0].string);
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
@@ -2261,7 +2265,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
result = fetch_multiple(&list, max_children);
}
- if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
+
+ /*
+ * This is only needed after fetch_one(), which does not fetch
+ * submodules by itself.
+ *
+ * When we fetch from multiple remotes, fetch_multiple() has
+ * already updated submodules to grab commits necessary for
+ * the fetched history from each remote, so there is no need
+ * to fetch submodules from here.
+ */
+ if (!result && remote && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct strvec options = STRVEC_INIT;
int max_children = max_jobs;
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 43dada8544..a301b56db8 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -1125,4 +1125,31 @@ test_expect_success 'fetch --recurse-submodules updates name-conflicted, unpopul
)
'
+test_expect_success 'fetch --all with --recurse-submodules' '
+ test_when_finished "rm -fr src_clone" &&
+ git clone --recurse-submodules src src_clone &&
+ (
+ cd src_clone &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
+ grep "^Fetching submodule sub$" fetch-log >fetch-subs &&
+ test_line_count = 1 fetch-subs
+'
+
+test_expect_success 'fetch --all with --recurse-submodules with multiple' '
+ test_when_finished "rm -fr src_clone" &&
+ git clone --recurse-submodules src src_clone &&
+ (
+ cd src_clone &&
+ git remote add secondary ../src &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
+ grep "Fetching submodule sub" fetch-log >fetch-subs &&
+ test_line_count = 2 fetch-subs
+'
+
test_done
--
2.36.1-384-ga138d62aa5
next prev parent reply other threads:[~2022-05-16 23:53 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-29 14:46 Bugreport - submodules are fetched twice in some cases Benedek Kozma
2022-04-29 17:39 ` Junio C Hamano
2022-04-29 19:05 ` Glen Choo
2022-04-29 20:02 ` Junio C Hamano
2022-04-29 20:37 ` Glen Choo
2022-05-14 0:07 ` Glen Choo
2022-05-14 5:24 ` Junio C Hamano
2022-05-16 17:45 ` Glen Choo
2022-05-16 18:25 ` Junio C Hamano
2022-05-16 19:04 ` Junio C Hamano
2022-05-16 21:53 ` [PATCH] fetch: do not run a redundant fetch from submodule Junio C Hamano
2022-05-16 22:56 ` Glen Choo
2022-05-16 23:33 ` Junio C Hamano
2022-05-16 23:53 ` Junio C Hamano [this message]
2022-05-17 16:47 ` [PATCH v2] " Glen Choo
2022-05-18 15:53 ` Junio C Hamano
2022-05-14 0:15 ` Bugreport - submodules are fetched twice in some cases Glen Choo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqk0alyqyj.fsf_-_@gitster.g \
--to=gitster@pobox.com \
--cc=chooglen@google.com \
--cc=cyberbeni@gmail.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).