git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [PATCH] sha1-file: make pretend_object_file() not prefetch
Date: Tue, 21 Jul 2020 15:50:20 -0700	[thread overview]
Message-ID: <20200721225020.1352772-1-jonathantanmy@google.com> (raw)

When pretend_object_file() is invoked with an object that does not exist
(as is the typical case), there is no need to fetch anything from the
promisor remote, because the caller already knows what the object is
supposed to contain. Therefore, suppress the fetch. (The
OBJECT_INFO_QUICK flag is added for the same reason.)

This was noticed at $DAYJOB when "blame" was run on a file that had
uncommitted modifications.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This is another instance of the issue Junio raised here [1]:

> Fixing issues hit by real users reactively is a necessary and good
> thing, but this is not the first time we patch callers of
> has_object_file() for this kind of "we are merely trying to
> determine the boundary of what we have, so that we know what we need
> to add to this repository" queries, I am afraid.
>
> Perhaps it is a good idea to sweep all the hits from "git grep -e
> has_object_file \*.c" and audit the codebase to see if there are
> other problematic ones?

To make progress towards solving this, I'm thinking of making a new
function has_object() that takes a repository, an object ID, and flags,
and only supports one flag: HAS_OBJECT_RECHECK_PACKED (which has the
opposite meaning of OBJECT_INFO_QUICK). Checks that should not fetch
should use has_object(), and checks that should fetch (if they exist - I
think that most would want additional information about the object, so
they would use oid_object_info() or similar already) should use
oid_object_info() or a similar function. That way we can see how many
uses of has_object_file() we have checked, and at the same time make
behavior we usually want into the default behavior. Any opinions?

[1] https://lore.kernel.org/git/xmqqpn8wie21.fsf@gitster.c.googlers.com/
---
 sha1-file.c      |  3 ++-
 t/t8002-blame.sh | 11 +++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/sha1-file.c b/sha1-file.c
index ccd34dd9e8..45fdb8415c 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1600,7 +1600,8 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 	struct cached_object *co;
 
 	hash_object_file(the_hash_algo, buf, len, type_name(type), oid);
-	if (has_object_file(oid) || find_cached_object(oid))
+	if (has_object_file_with_flags(oid, OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT) ||
+	    find_cached_object(oid))
 		return 0;
 	ALLOC_GROW(cached_objects, cached_object_nr + 1, cached_object_alloc);
 	co = &cached_objects[cached_object_nr++];
diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
index eea048e52c..2ed6aaae35 100755
--- a/t/t8002-blame.sh
+++ b/t/t8002-blame.sh
@@ -122,4 +122,15 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
 	test_must_fail git blame --exclude-promisor-objects one
 '
 
+test_expect_success 'blame with uncommitted edits in partial clone does not crash' '
+	git init server &&
+	echo foo >server/file.txt &&
+	git -C server add file.txt &&
+	git -C server commit -m file &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	echo bar >>client/file.txt &&
+	git -C client blame file.txt
+'
+
 test_done
-- 
2.28.0.rc0.105.gf9edc3c819-goog


             reply	other threads:[~2020-07-21 22:50 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21 22:50 Jonathan Tan [this message]
2020-07-21 23:25 ` [PATCH] sha1-file: make pretend_object_file() not prefetch Junio C Hamano
2020-07-21 23:27 ` Junio C Hamano
2020-07-23 17:47   ` Jeff King
2020-07-23 18:06     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200721225020.1352772-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).