git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>, gitster@pobox.com
Subject: [PATCH] fetch: remove fetch_if_missing=0
Date: Fri,  1 Nov 2019 13:38:30 -0700	[thread overview]
Message-ID: <20191101203830.231676-1-jonathantanmy@google.com> (raw)

In fetch_pack() (and all functions it calls), pass
OBJECT_INFO_SKIP_FETCH_OBJECT whenever we query an object that could be
a tree or blob that we do not want to be lazy-fetched even if it is
absent. Thus, the only lazy-fetches occurring for trees and blobs are
when resolving deltas.

Thus, we can remove fetch_if_missing=0 from builtin/fetch.c. Remove
this, and also add a test ensuring that such objects are not
lazy-fetched. (We might be able to remove fetch_if_missing=0 from other
places too, but I have limited myself to builtin/fetch.c in this commit
because I have not written tests for the other commands yet.)

Note that commits and tags may still be lazy-fetched. I limited myself
to objects that could be trees or blobs here because Git does not
support creating such commit- and tag-excluding clones yet, and even if
such a clone were manually created, Git does not have good support for
fetching a single commit (when fetching a commit, it and all its
ancestors would be sent).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
I've verified that this also solves the bug explained in:
https://public-inbox.org/git/20191007181825.13463-1-jonathantanmy@google.com/

Quoting what I wrote from there:

> (Alternatively, we
> could add the equivalent of OBJECT_INFO_SKIP_FETCH_OBJECT to functions
> like parse_commit() that are used by files like negotiator/default.c, or
> split up commit parsing into object reading - which already has that
> flag - and commit parsing.)

This commit adds OBJECT_INFO_SKIP_FETCH_OBJECT to functions, but not the
parse_commit() mentioned - as stated above, this commit only handles
objects that could be trees or blobs.
---
 builtin/fetch.c          |  5 ++-
 fetch-pack.c             |  3 +-
 t/t5616-partial-clone.sh | 69 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0c345b5dfe..5ff7367dd7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1074,7 +1074,8 @@ static int check_exist_and_connected(struct ref *ref_map)
 	 * we need all direct targets to exist.
 	 */
 	for (r = rm; r; r = r->next) {
-		if (!has_object_file(&r->old_oid))
+		if (!has_object_file_with_flags(&r->old_oid,
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
 			return -1;
 	}
 
@@ -1755,8 +1756,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	packet_trace_identity("fetch");
 
-	fetch_if_missing = 0;
-
 	/* Record the command line for the reflog */
 	strbuf_addstr(&default_rla, "fetch");
 	for (i = 1; i < argc; i++)
diff --git a/fetch-pack.c b/fetch-pack.c
index 0130b44112..37178e2d34 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -673,7 +673,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 		struct object *o;
 
 		if (!has_object_file_with_flags(&ref->old_oid,
-						OBJECT_INFO_QUICK))
+						OBJECT_INFO_QUICK |
+							OBJECT_INFO_SKIP_FETCH_OBJECT))
 			continue;
 		o = parse_object(the_repository, &ref->old_oid);
 		if (!o)
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 79f7b65f8c..1081ed2de0 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -296,6 +296,75 @@ test_expect_success 'partial clone with unresolvable sparse filter fails cleanly
 	test_i18ngrep "unable to parse sparse filter data in" err
 '
 
+setup_triangle () {
+	rm -rf big-blob.txt server client promisor-remote &&
+
+	touch big-blob.txt &&
+	for i in $(seq 1 100)
+	do
+		echo line $i >>big-blob.txt
+	done &&
+
+	# Create a server with 2 commits: a commit with a big blob and a child
+	# commit with an incremental change. Also, create a partial clone
+	# client that only contains the first commit.
+	git init server &&
+	git -C server config --local uploadpack.allowfilter 1 &&
+	cp big-blob.txt server &&
+	git -C server add big-blob.txt &&
+	git -C server commit -m "initial" &&
+	git clone --bare --filter=tree:0 "file://$(pwd)/server" client &&
+	echo another line >>server/big-blob.txt &&
+	git -C server commit -am "append line to big blob" &&
+
+	# Create a promisor remote that only contains the blob from the first
+	# commit, and set it as the promisor remote of client. Thus, whenever
+	# the client lazy fetches, the lazy fetch will succeed only if it is
+	# for this blob.
+	git init promisor-remote &&
+	test_commit -C promisor-remote one && # so that ref advertisement is not empty
+	git -C promisor-remote config --local uploadpack.allowanysha1inwant 1 &&
+	git -C promisor-remote hash-object -w --stdin <big-blob.txt &&
+	git -C client remote set-url origin "file://$(pwd)/promisor-remote"
+}
+
+test_expect_success 'fetch lazy-fetches only to resolve deltas' '
+	setup_triangle &&
+
+	# Exercise to make sure it works. Git will not fetch anything from the
+	# promisor remote other than for the big blob (because it needs to
+	# resolve the delta).
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client \
+		fetch "file://$(pwd)/server" master &&
+
+	# Verify the assumption that the client needed to fetch the delta base
+	# to resolve the delta.
+	git hash-object big-blob.txt >hash &&
+	grep "want $(cat hash)" trace
+'
+
+test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
+	setup_triangle &&
+
+	git -C server config --local protocol.version 2 &&
+	git -C client config --local protocol.version 2 &&
+	git -C promisor-remote config --local protocol.version 2 &&
+
+	# Exercise to make sure it works. Git will not fetch anything from the
+	# promisor remote other than for the big blob (because it needs to
+	# resolve the delta).
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client \
+		fetch "file://$(pwd)/server" master &&
+
+	# Verify that protocol version 2 was used.
+	grep "fetch< version 2" trace &&
+
+	# Verify the assumption that the client needed to fetch the delta base
+	# to resolve the delta.
+	git hash-object big-blob.txt >hash &&
+	grep "want $(cat hash)" trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.24.0.rc1.363.gb1bccd3e3d-goog


             reply	other threads:[~2019-11-01 20:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-01 20:38 Jonathan Tan [this message]
2019-11-01 22:05 ` [PATCH] fetch: remove fetch_if_missing=0 Jonathan Nieder
2019-11-02  5:55   ` Junio C Hamano
2019-11-02  6:11     ` Eric Sunshine
2019-11-02  5:59   ` Junio C Hamano
2019-11-05 18:53   ` Jonathan Tan
2019-11-05 18:58     ` Jonathan Nieder
2019-11-05 18:56 ` [PATCH v2] " Jonathan Tan
2019-11-05 20:06   ` Eric Sunshine
2019-11-06  1:45   ` Junio C Hamano
2019-11-08  6:33   ` Junio C Hamano
2019-11-08  7:40     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191101203830.231676-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).