From: Jeff King <peff@peff.net>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, jonathantanmy@google.com, jrnieder@gmail.com
Subject: Re: [PATCH v2] rev-list: exclude promisor objects at walk time
Date: Thu, 4 Apr 2019 20:00:01 -0400
Message-ID: <20190405000001.GA20793@sigill.intra.peff.net>
In-Reply-To: <20190404234726.GG60888@google.com>

On Thu, Apr 04, 2019 at 04:47:26PM -0700, Josh Steadmon wrote:
> > Did you (or anybody else) have any thoughts on the case where a given
> > object is referred to both by a promisor and a non-promisor (and we
> > don't have it)? That's the "shortcut" I think we're taking here: we
> > would no longer realize that it's available via the promisor when we
> > traverse to it from the non-promisor. I'm just not clear on whether that
> > can ever happen.
>
> I am not sure either. In process_blob() and process_tree() there are
> additional checks for whether missing blobs/trees are promisor objects
> using is_promisor_object()... but if we call that we undo the
> performance gains from this change.

Hmm. That might be a good outcome, though. If it never happens, we're
fast. If it does happen, then our worst case is that we fall back to the
current slower-but-more-thorough check. (And I think that happens with
your patch, without us having to do anything further).

> > One other possible small optimization: we don't look up the object
> > unless the caller asked to exclude promisors, which is good. But we
> > could also keep a single flag for "is there a promisor pack at all?".
> > When there isn't, we know there's no point in looking for the object.
> [...]
> I'm not necessarily opposed, but I'm leaning towards the "won't matter
> much" side.
>
> Where would such a flag live, in this case, and who would be responsible
> for initializing it? I guess it would only matter for rev-list, so we
> could initialize it in cmd_rev_list() if --exclude-promisor-objects is
> passed?

The check is really something like:

  int have_promisor_pack() {
          for (p = packed_git; p; p = p->next) {
                  if (p->pack_promisor)
                          return 1;
          }
          return 0;
  }

That could be lazily cached as a single bit, but it would need to be
reset whenever we call reprepare_packed_git().

Let's just punt on it for now. I'm not convinced it would actually yield
any benefit, unless we have a partial-clone repo that doesn't have any
promisor packs (but then, I suspect whatever un-partial'd it should
probably be resetting the partial flag in the config).
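
For reference, the lazily cached flag described above could be sketched
roughly as follows. This is only an illustration under stated assumptions,
not code from the patch or from git itself: the trimmed-down struct, the
global list head, and invalidate_promisor_pack_cache() are hypothetical
stand-ins, and the cache is a tri-state int rather than a literal single
bit so that "not computed yet" can be represented.

```c
#include <stddef.h>

/* Hypothetical stand-in for git's struct packed_git; only the fields used here. */
struct packed_git {
	struct packed_git *next;
	unsigned pack_promisor : 1;
};

/* Hypothetical global list head, mirroring the pseudocode in the message. */
static struct packed_git *packed_git;

/* -1 = not computed yet, 0 = no promisor pack, 1 = at least one promisor pack. */
static int promisor_pack_cache = -1;

static int have_promisor_pack(void)
{
	struct packed_git *p;

	/* Fast path: reuse the answer from an earlier scan. */
	if (promisor_pack_cache >= 0)
		return promisor_pack_cache;

	/* Slow path: walk the pack list once and remember the result. */
	promisor_pack_cache = 0;
	for (p = packed_git; p; p = p->next) {
		if (p->pack_promisor) {
			promisor_pack_cache = 1;
			break;
		}
	}
	return promisor_pack_cache;
}

/*
 * Hypothetical hook: would have to run whenever the pack list is
 * rescanned (e.g. alongside reprepare_packed_git()), otherwise the
 * cached answer can go stale.
 */
static void invalidate_promisor_pack_cache(void)
{
	promisor_pack_cache = -1;
}
```

The sketch also shows the cost mentioned above: any code path that changes
the pack list has to remember to invalidate the cache, or later callers
keep seeing the old answer.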
> > I didn't see any tweaks to the callers, which makes sense; we're already
> > passing --exclude-promisor-objects as necessary. Which means by itself,
> > this patch should be making things faster, right? Do you have timings to
> > show that off?
>
> Yeah, for a partial clone of a large-ish Android repo [1], we see the
> connectivity check go from >180s to ~7s.

Those are nice numbers. :) Worth mentioning in the commit message, I
think. How does it compare to your earlier patch? I'd hope they're about
the same.

-Peff