From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, Johannes.Schindelin@gmx.de, szeder.dev@gmail.com
Subject: Re: [PATCH v2 1/2] sha1-file: support OBJECT_INFO_FOR_PREFETCH
Date: Fri, 5 Apr 2019 18:00:06 -0400
Message-ID: <20190405220005.GA10312@sigill.intra.peff.net>
In-Reply-To: <068861632b85179d2a5a5ceb966e951a78b27141.1553895166.git.jonathantanmy@google.com>

On Fri, Mar 29, 2019 at 02:39:27PM -0700, Jonathan Tan wrote:

> Teach oid_object_info_extended() to support a new flag that inhibits
> fetching of missing objects. This is equivalent to setting
> fetch_if_missing to 0, calling oid_object_info_extended(), then setting
> fetch_if_missing to whatever it was before. Update unpack-trees.c to use
> this new flag instead of repeatedly setting fetch_if_missing.
> 
> This new flag complicates things slightly in that there are now 2 ways
> to do the same thing. But this eliminates the need to repeatedly set a
> global variable, and more importantly, allows prefetching to be done in
> parallel (in the future); hence, this patch.

Sorry I'm a little late to review this. I don't have any critical
comments, so if this gets ignored, I'll live with it.

> +/*
> + * Do not attempt to fetch the object if missing (even if fetch_if_missing is
> + * nonzero). This is meant for bulk prefetching of missing blobs in a partial
> + * clone. Implies OBJECT_INFO_QUICK.
> + */
> +#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)

Mostly I found the name and semantics of this flag to be a little
confusing. Really what we want is to tell oid_object_info() not to do any
on-demand fetching for us. That seems like a thing that we might
eventually want for other purposes (e.g., a diff operation that could
produce a real blob diff but would be happy outputting a less-detailed
tree diff).

If it were just OBJECT_INFO_NO_FETCH or similar, that tells more clearly
what it does, and would make sense in more contexts.

I suspect that QUICK would be the norm when used with it, though I
probably would have kept the two orthogonal for the sake of simplicity
and clarity.
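
To make the orthogonal variant concrete, here is a minimal standalone
sketch. It is not git's code: the bit values and the lookup_behavior()
helper are made up for illustration, and OBJECT_INFO_NO_FETCH is only the
name floated above. The point is that "no fetch" gets its own bit, and a
caller who also wants QUICK asks for it explicitly.

```c
/* Hypothetical bit values, for illustration only (not git's real ones). */
#define OBJECT_INFO_QUICK    (1u << 3)
#define OBJECT_INFO_NO_FETCH (1u << 5) /* its own bit; does not imply QUICK */

/*
 * Hypothetical stand-in for an object lookup. It only reports which
 * behaviors the caller's flags select, as a small bitmask:
 *   bit 0 set = on-demand fetching is allowed
 *   bit 1 set = the quick (cheaper) lookup path was requested
 */
static unsigned lookup_behavior(unsigned flags)
{
	unsigned behavior = 0;

	if (!(flags & OBJECT_INFO_NO_FETCH))
		behavior |= 1u;
	if (flags & OBJECT_INFO_QUICK)
		behavior |= 2u;
	return behavior;
}
```

With this layout, a prefetching caller would pass
OBJECT_INFO_NO_FETCH | OBJECT_INFO_QUICK, while a caller that only wants
to avoid the network could pass OBJECT_INFO_NO_FETCH alone.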

> diff --git a/unpack-trees.c b/unpack-trees.c
> index 22c41a3ba8..381b0cd65e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -404,20 +404,21 @@ static int check_updates(struct unpack_trees_options *o)
>  		 * below.
>  		 */
>  		struct oid_array to_fetch = OID_ARRAY_INIT;
> -		int fetch_if_missing_store = fetch_if_missing;
> -		fetch_if_missing = 0;
>  		for (i = 0; i < index->cache_nr; i++) {
>  			struct cache_entry *ce = index->cache[i];
> -			if ((ce->ce_flags & CE_UPDATE) &&
> -			    !S_ISGITLINK(ce->ce_mode)) {
> -				if (!has_object_file(&ce->oid))
> -					oid_array_append(&to_fetch, &ce->oid);
> -			}
> +
> +			if (!(ce->ce_flags & CE_UPDATE) ||
> +			    S_ISGITLINK(ce->ce_mode))
> +				continue;
> +			if (!oid_object_info_extended(the_repository, &ce->oid,
> +						      NULL,
> +						      OBJECT_INFO_FOR_PREFETCH))
> +				continue;
> +			oid_array_append(&to_fetch, &ce->oid);

Here we get rid of the global set/restore dance, which is nice. But
there's also a behavior change, as we've picked up QUICK. I think that's
probably the right thing to do, but I was a bit surprised not to see any
discussion in the commit message.
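
For anyone following along, the difference between the two styles can be
sketched in isolation. This is a toy model, not the code from the patch:
lookup(), NO_FETCH_FLAG, fetch_attempts, and the "present" array are all
hypothetical stand-ins, and the QUICK aspect is left out entirely.

```c
#include <stddef.h>

#define NO_FETCH_FLAG (1u << 0) /* hypothetical per-call flag */

static int fetch_if_missing = 1; /* stand-in for the global */
static int fetch_attempts;       /* counts simulated on-demand fetches */

/* Simulated lookup: returns 0 if the object is present, -1 if missing. */
static int lookup(int present, unsigned flags)
{
	if (present)
		return 0;
	if (fetch_if_missing && !(flags & NO_FETCH_FLAG))
		fetch_attempts++; /* would have hit the network here */
	return -1;
}

/* Old style: save the global, clear it, loop, then restore it. */
static size_t count_missing_global(const int *present, size_t nr)
{
	size_t i, missing = 0;
	int saved = fetch_if_missing;

	fetch_if_missing = 0;
	for (i = 0; i < nr; i++)
		if (lookup(present[i], 0) < 0)
			missing++;
	fetch_if_missing = saved;
	return missing;
}

/* New style: the global is untouched; each call carries the flag. */
static size_t count_missing_flag(const int *present, size_t nr)
{
	size_t i, missing = 0;

	for (i = 0; i < nr; i++)
		if (lookup(present[i], NO_FETCH_FLAG) < 0)
			missing++;
	return missing;
}
```

Both functions find the same missing entries without triggering a fetch,
but only the second avoids writing to shared state, which is what would
matter if the prefetch loop were ever run in parallel.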

-Peff
