git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH] diff: only prefetch for certain output formats
@ 2020-01-28 21:35 Jonathan Tan
  2020-01-29  5:09 ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Tan @ 2020-01-28 21:35 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Running "git status" on a partial clone that was cloned with
"--no-checkout", like this:

  git clone --no-checkout --filter=blob:none <url> foo
  git -C foo status

results in an unnecessary lazy fetch. This is because of commit
7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08), which
optimized "diff" by prefetching, but inadvertently affected at least one
command ("status") that does not need prefetching. These 2 cases can be
distinguished by looking at output_format; the former uses
DIFF_FORMAT_PATCH and the latter uses DIFF_FORMAT_CALLBACK.

Therefore, restrict prefetching only to output_format values like the
one used by "diff".

(Note that other callers that use DIFF_FORMAT_CALLBACK will also lose
prefetching. I haven't investigated enough to see if this is a net
benefit or drawback, but I think this will need to be done on a
caller-by-caller basis and can be done in the future.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Points of discussion I can think of:

 - Is the whitelist of output_format constants the best?
 - Should we just have the callers pass a prefetch boolean arg instead
   of trying to guess it ourselves? I'm leaning towards no since I think
   we should avoid options unless they are necessary.

Perhaps there are others.
---
 diff.c                   | 10 +++++++++-
 t/t0410-partial-clone.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 8e2914c031..8263491783 100644
--- a/diff.c
+++ b/diff.c
@@ -6504,7 +6504,15 @@ static void add_if_missing(struct repository *r,
 
 void diffcore_std(struct diff_options *options)
 {
-	if (options->repo == the_repository && has_promisor_remote()) {
+	int output_formats_to_prefetch = DIFF_FORMAT_RAW |
+		DIFF_FORMAT_DIFFSTAT |
+		DIFF_FORMAT_NUMSTAT |
+		DIFF_FORMAT_SUMMARY |
+		DIFF_FORMAT_PATCH |
+		DIFF_FORMAT_SHORTSTAT |
+		DIFF_FORMAT_DIRSTAT;
+	if ((options->output_format & output_formats_to_prefetch) &&
+	    options->repo == the_repository && has_promisor_remote()) {
 		/*
 		 * Prefetch the diff pairs that are about to be flushed.
 		 */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index a3988bd4b8..d72631a3a0 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -567,6 +567,19 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
 	git -C repo cherry-pick side1
 '
 
+test_expect_success 'do not fetch when running status against --no-checkout partial clone' '
+	rm -rf server client trace &&
+	test_create_repo server &&
+	test_commit -C server one &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	git clone --no-checkout --filter=blob:none "file://$(pwd)/server" client &&
+
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client status &&
+	! test_path_exists trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.25.0.341.g760bfbb309-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-02-01 11:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-28 21:35 [RFC PATCH] diff: only prefetch for certain output formats Jonathan Tan
2020-01-29  5:09 ` Jeff King
2020-01-30  1:39   ` Jonathan Tan
2020-01-30  5:51     ` Jeff King
2020-01-30 23:20       ` Jonathan Tan
2020-01-31  0:14         ` Jeff King
2020-01-31 18:08           ` Junio C Hamano
2020-02-01 11:29             ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).