list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <>
Cc: Jonathan Tan <>,
	Taylor Blau <>,
	Derrick Stolee <>,
	Derrick Stolee <>
Subject: [PATCH] clone: --filter=tree:0 implies fetch.recurseSubmodules=no
Date: Fri, 20 Nov 2020 20:36:26 +0000	[thread overview]
Message-ID: <> (raw)

From: Derrick Stolee <>

The partial clone feature has several modes, but only a few are quick
for a server to process using reachability bitmaps:

* Blobless: --filter=blob:none downloads all commits and trees and
  fetches necessary blobs on-demand.

* Treeless: --filter=tree:0 downloads all commits and fetches necessary
  trees and blobs on demand.

This treeles mode is most similar to a shallow clone in the total size
(it only adds the commit objects for the full history). This makes
treeless clones an interesting replacement for shallow clones. A user
can run more commands in a treeless clone than in a shallow clone,
especially 'git log' (no pathspec).

In particular, servers can still serve 'git fetch' requests quickly by
calculating the difference between commit wants and haves using bitmaps.

I was testing this feature with this in mind, and I knew that some trees
would be downloaded multiple times when checking out a new branch, but I
did not expect to discover a significant issue with 'git fetch', at
least in repostiories with submodules.

I was testing these commands:

	$ git clone --filter=tree:0 --single-branch --branch=master \
	$ git -C git fetch origin "+refs/heads/*:refs/remotes/origin/*"

This fetch command started downloading several pack-files of trees
before completing the command. I never let it finish since I got so
impatient with the repeated downloads. During debugging, I found that
the stack triggering promisor_remote_get_direct() was going through
fetch_populated_submodules(). Notice that I did not recurse my
submodules in the original clone, so the sha1collisiondetection
submodule is not initialized. Even so, my 'git fetch' was scanning
commits for updates to submodules.

I decided that even if I did populate the submodules, the nature of
treeless clones makes me not want to care about the contents of commits
other than those that I am explicitly navigating to.

This loop of tree fetches can be avoided by adding
--no-recurse-submodules to the 'git fetch' command or setting

To make this as painless as possible for future users of treeless
clones, automatically set fetch.recurseSubmodules=no at clone time.

Signed-off-by: Derrick Stolee <>
    clone: --filter=tree:0 implies fetch.recurseSubmodules=no
    While testing different partial clone options, I stumbled across this
    one. My initial thought was that we were parsing commits and loading
    their root trees unnecessarily, but I see that doesn't happen after this
    Here are some recent discussions about using --filter=tree:0:
    Thanks, -Stolee

Fetch-It-Via: git fetch pr-797/derrickstolee/tree-0-v1

 list-objects-filter-options.c | 4 ++++
 t/      | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index defd3dfd10..249939dfa5 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -376,6 +376,10 @@ void partial_clone_register(
+	if (filter_options->choice == LOFC_TREE_DEPTH &&
+	    !filter_options->tree_exclude_depth)
+		git_config_set("fetch.recursesubmodules", "no");
 	/* Make sure the config info are reset */
diff --git a/t/ b/t/
index f4d49d8335..b2eaf78069 100755
--- a/t/
+++ b/t/
@@ -341,6 +341,12 @@ test_expect_success 'partial clone with sparse filter succeeds' '
+test_expect_success '--filter=tree:0 sets fetch.recurseSubmodules=no' '
+	rm -rf dst &&
+	git clone --filter=tree:0 "file://$(pwd)/src" dst &&
+	test_config -C dst fetch.recursesubmodules no
 test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
 	rm -rf dst.git &&
 	test_must_fail git clone --no-local --bare \

base-commit: faefdd61ec7c7f6f3c8c9907891465ac9a2a1475

             reply	other threads:[~2020-11-20 21:33 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20 20:36 Derrick Stolee via GitGitGadget [this message]
2020-11-21  0:04 ` Jeff King
2020-11-23 15:18   ` Derrick Stolee
2020-11-24  8:04     ` Jeff King
2020-11-21 16:19 ` Philippe Blain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this inbox:

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).