git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Richard Oliver <roliver@roku.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Taylor Blau <me@ttaylorr.com>,
	Derrick Stolee <derrickstolee@github.com>,
	git@vger.kernel.org, jonathantanmy@google.com
Subject: [PATCH] mktree: do not check type of remote objects
Date: Tue, 21 Jun 2022 14:59:39 +0100	[thread overview]
Message-ID: <748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com> (raw)
In-Reply-To: <xmqqy1xwtsds.fsf@gitster.g>

On 16/06/2022 18:44, Junio C Hamano wrote:
> This patch would be a good first cut as a starting point, but we
> probably can do better by doing oid_object_info_extended() call with
> OBJECT_INFO_SKIP_FETCH_OBJECT bit (and probably QUICK bit, too) set,
> with the current code structure.
> 
> And when we do so, the title would not match the purpose of the
> change.  The verification was disabled with "--missing" all along
> and that is not what we are changing.  What we will be fixing is the
> wasteful implementation.
> 
>     mktree: do not check types of remote objects
> 
>     With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
>     called the sha1_object_info() API to obtain the type information,
>     but allowed the call to silently fail when the object was missing
>     locally, so that we can sanity-check the types opportunistically
>     when the object did exist.
> 
>     The implementation is understandable because back then there was no
>     lazy/on-demand downloading of individual objects from the promisor
>     remotes that causes a long delay and materializes the object, hence
>     defeating the point of using "--missing".  The design is hurting us
>     now.
> 
>     We could bypass the opportunistic type/mode consistency check
>     altogether when "--missing" is given, but instead, use the
>     oid_object_info_extended() API and tell it that we are only
>     interested in objects that locally exist and are immediately
>     available by passing OBJECT_INFO_SKIP_FETCH_OBJECT bit to it.  That
>     way, we will still retain the cheap and opportunistic sanity check
>     for local objects.

I've prepared a patch below as per your suggestion.

As a side note, do you think we need to re-work some uses of the word
'missing' in the documentation? Some uses of the word, such as in
mktree, predate the concept of promisor remotes. The partial-clone.txt
documentation differentiates between missing "due to a partial clone
or fetch" and missing "due to repository corruption".  Would making
such a distinction elsewhere be useful?

Cheers,
Richard

-- >8 --
Subject: [PATCH] mktree: do not check type of remote objects

With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
called the sha1_object_info() API to obtain the type information, but
allowed the call to silently fail when the object was missing locally,
so that we can sanity-check the types opportunistically when the
object did exist.

The implementation is understandable because back then there was no
lazy/on-demand downloading of individual objects from the promisor
remotes that causes a long delay and materializes the object, hence
defeating the point of using "--missing".  The design is hurting us
now.

We could bypass the opportunistic type/mode consistency check
altogether when "--missing" is given, but instead, use the
oid_object_info_extended() API and tell it that we are only interested
in objects that locally exist and are immediately available by passing
OBJECT_INFO_SKIP_FETCH_OBJECT bit to it.  That way, we will still
retain the cheap and opportunistic sanity check for local objects.

Signed-off-by: Richard Oliver <roliver@roku.com>
---
 builtin/mktree.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 902edba6d2..cfadb52670 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -116,8 +116,15 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
 			path, ptr, type_name(mode_type));
 	}
 
-	/* Check the type of object identified by sha1 */
-	obj_type = oid_object_info(the_repository, &oid, NULL);
+	/* Check the type of object identified by oid without fetching objects */
+	struct object_info oi = OBJECT_INFO_INIT;
+	oi.typep = &obj_type;
+	if (oid_object_info_extended(the_repository, &oid, &oi,
+				     OBJECT_INFO_LOOKUP_REPLACE |
+				     OBJECT_INFO_QUICK |
+				     OBJECT_INFO_SKIP_FETCH_OBJECT) < 0)
+		obj_type = -1;
+
 	if (obj_type < 0) {
 		if (allow_missing) {
 			; /* no problem - missing objects are presumed to be of the right type */
-- 
2.36.1.467.g4f6db706e6.dirty


  reply	other threads:[~2022-06-21 14:00 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14 13:36 [PATCH] mktree: learn about promised objects Richard Oliver
2022-06-14 14:14 ` Derrick Stolee
2022-06-14 16:33   ` Richard Oliver
2022-06-14 17:27     ` Derrick Stolee
2022-06-15  0:35       ` Taylor Blau
2022-06-15  4:00         ` Jeff King
2022-06-15 17:40           ` Richard Oliver
2022-06-15 18:17             ` Derrick Stolee
2022-06-16  6:07               ` Jeff King
2022-06-16  6:54                 ` [PATCH] is_promisor_object(): walk promisor packs in pack-order Jeff King
2022-06-16 14:00                   ` Derrick Stolee
2022-06-17 19:50                   ` Jonathan Tan
2022-06-16 13:59                 ` [PATCH] mktree: learn about promised objects Derrick Stolee
2022-06-15 21:01             ` Junio C Hamano
2022-06-16  5:02               ` Jeff King
2022-06-16 15:46               ` [PATCH] mktree: Make '--missing' behave as documented Richard Oliver
2022-06-16 17:44                 ` Junio C Hamano
2022-06-21 13:59                   ` Richard Oliver [this message]
2022-06-21 16:51                     ` [PATCH] mktree: do not check type of remote objects Junio C Hamano
2022-06-21 17:48                     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com \
    --to=roliver@roku.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).