git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, jonathantanmy@google.com
Subject: Re: [PATCH v4] clone: do faster object check for partial clones
Date: Mon, 22 Apr 2019 17:31:14 -0400	[thread overview]
Message-ID: <20190422213113.GB4728@sigill.intra.peff.net> (raw)
In-Reply-To: <b4a285e2a199ea059c165aa344d037374797fa40.1555707373.git.steadmon@google.com>

On Fri, Apr 19, 2019 at 02:00:13PM -0700, Josh Steadmon wrote:

> For partial clones, doing a full connectivity check is wasteful; we skip
> promisor objects (which, for a partial clone, is all known objects), and
> enumerating them all to exclude them from the connectivity check can
> take a significant amount of time on large repos.
> 
> At most, we want to make sure that we get the objects referred to by any
> wanted refs. For partial clones, just check that these objects were
> transferred.
> 
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
> This is an update of the original V1 approach (skipping the fully
> connectivity check when doing a partial clone) with updated commit
> message and comments to address the review concerns.

This looks OK to me. Just trying to think of ways it could bite us, the
obvious line of thinking is where "--reference" is used. If we tell the
other side we already have object X, it will not be sent to us, and we
are relying on our local non-promisor alternate to have all of the
required objects. But I think this is OK:

  - for it to be mentioned in a ref, then the server must have been
    advertising it. Which means that it should similarly be on the hook
    for providing it to us via the promisor mechanism. That's a little
    hand-wavy, but then so is the entire promisor concept. We're
    inherently at the server's mercy, so if they're lying to us about
    what they're willing or able to provide, there's not much we can do
    anyway.

  - if we sent it as a "have" to the server, that means that our
    alternate was advertising it from a ref tip. Which means that unless
    it's corrupted, we're fine (which is no different than the
    connectivity promise we'd make for our own refs). I actually think
    that the connectivity check should "--not" any advertised alternate
    tips. I even have a patch to do that, but I need to polish it a
    little.

So I think this optimization will yield correct results. If we later
figure out a better way for rev-list to optimize this case itself, then
we can rip this out.

I think you had dug up some numbers to put in the commit message after
the last discussion? Likewise, is there a t/perf test which shows this
off (and will help us catch regressions)? If not, it might be worth
adding one (AFAICT a simple no-blobs partial-clone would be enough).

-Peff

      reply	other threads:[~2019-04-22 21:31 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-03 17:27 [PATCH] clone: do faster object check for partial clones Josh Steadmon
2019-04-03 18:58 ` Jonathan Tan
2019-04-03 19:41 ` Jeff King
2019-04-03 20:57   ` Jonathan Tan
2019-04-04  0:21     ` Josh Steadmon
2019-04-04  1:33     ` Jeff King
2019-04-04 22:53 ` [PATCH v2] rev-list: exclude promisor objects at walk time Josh Steadmon
2019-04-04 23:08   ` Jeff King
2019-04-04 23:47     ` Josh Steadmon
2019-04-05  0:00       ` Jeff King
2019-04-05  0:09         ` Josh Steadmon
2019-04-08 20:59           ` Josh Steadmon
2019-04-08 21:06 ` [PATCH v3] " Josh Steadmon
2019-04-08 22:23   ` Christian Couder
2019-04-08 23:12     ` Josh Steadmon
2019-04-09 15:14   ` Junio C Hamano
2019-04-09 15:15     ` Jeff King
2019-04-09 15:43       ` Junio C Hamano
2019-04-09 16:35         ` Josh Steadmon
2019-04-09 18:04   ` SZEDER Gábor
2019-04-09 23:42     ` Josh Steadmon
2019-04-11  4:06       ` Jeff King
2019-04-12 22:38         ` Josh Steadmon
2019-04-13  5:34           ` Jeff King
2019-04-19 20:26             ` Josh Steadmon
2019-04-19 21:00 ` [PATCH v4] clone: do faster object check for partial clones Josh Steadmon
2019-04-22 21:31   ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190422213113.GB4728@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).