git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] index-pack: prefetch missing REF_DELTA bases
Date: Thu, 16 May 2019 17:42:58 -0400	[thread overview]
Message-ID: <20190516214257.GD10787@sigill.intra.peff.net> (raw)
In-Reply-To: <20190516213056.221406-1-jonathantanmy@google.com>

On Thu, May 16, 2019 at 02:30:56PM -0700, Jonathan Tan wrote:

> > So I could go either way, though I do think it makes sense for on-demand
> > fetches for partial clones to avoid asking for thin packs as a general
> > principle.
> 
> This should not be a problem since fetch-pack can already know that
> we're doing an on-demand fetch (args->no_dependents), so we should be
> able to either plumb a "no-thin-pack" arg in the same way or rename
> args->no_dependents to also encompass the no-thin-pack option. But this
> can be done separately from this patch set, I think.

Yeah, I think it can be done separately. Though the two may intermingle
if we want to instruct index-pack that it should not try to pre-fetch if
we did not ask for a thin pack.

> > As a matter of fact, should partial clones _always_ avoid
> > asking for thin packs?  That would make this issue go away entirely.
> > 
> > Sometimes it would be more efficient (we do not have to get an extra
> > base object just to resolve the delta we needed) but sometimes worse (if
> > we did actually have the base, it's a win). Whether it's a win would
> > depend on the "hit" rate, and I suspect that is heavily dependent on
> > workload characteristics (what kind of filtering is in use, are we
> > topping up in a non-partial way, etc).
> 
> I think it's best if we still allow servers to serve thin packs. For
> example, if we're excluding only large blobs, clients would still want
> servers to be able to delta against blobs that they have.

Yes, this is getting into the hit-rate thing I mentioned. You're right
that for a reasonably typical case of "no blobs over 10MB" we'd have a
very high hit rate, and disabling thin packs would almost certainly be a
big loss.

I guess even when we have a "miss", the cost is usually not that high
either. If we get A as a delta against B, then in the non-thin-pack case
we transfer all of A. In the thin-pack case with pre-fetch we transfer
all of B, and then the delta. But the delta is often small enough
compared to the total content that it's not that big a deal either way.
There are pathological cases, of course, but that's already true. :)

So you're right, it's probably still a win to use thin packs when we
can.

> > Right, REF_DELTA is definitely correctly handled currently, and I don't
> > think that would break with your patch. It's just that your patch would
> > introduce a bunch of extra traffic as we request bases separately that
> > are already in the pack.
> 
> Ah...I see. For this problem, I think that it can be solved with the
> "if (objects[d->obj_no].real_type != OBJ_REF_DELTA)" check that the
> existing code uses before calling read_object(). I'll include this in
> the next reroll if any other issue comes up.

I'm confused about this. Aren't we pre-fetching before we've actually
resolved deltas? The base could be in the pack as a true base, and we
might have seen it already then. But it could itself be a delta, and we
wouldn't know we have it until we resolve it (this gets into the
lucky/unlucky ordering thing).

-Peff

  reply	other threads:[~2019-05-16 21:43 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-14 21:10 [PATCH 0/2] Partial clone fix: handling received REF_DELTA Jonathan Tan
2019-05-14 21:10 ` [PATCH 1/2] t5616: refactor packfile replacement Jonathan Tan
2019-05-15  8:36   ` Johannes Schindelin
2019-05-15 18:22     ` Jonathan Tan
2019-05-14 21:10 ` [PATCH 2/2] index-pack: prefetch missing REF_DELTA bases Jonathan Tan
2019-05-15  8:46   ` Johannes Schindelin
2019-05-15 18:28     ` Jonathan Tan
2019-05-17 18:33       ` Johannes Schindelin
2019-05-15 23:16   ` Jeff King
2019-05-16  1:43     ` Junio C Hamano
2019-05-16  4:04       ` Jeff King
2019-05-16 18:26     ` Jonathan Tan
2019-05-16 21:12       ` Jeff King
2019-05-16 21:30         ` Jonathan Tan
2019-05-16 21:42           ` Jeff King [this message]
2019-05-16 23:15             ` Jonathan Tan
2019-05-17  1:09               ` Jeff King
2019-05-17  1:22                 ` Jeff King
2019-05-17  4:39                   ` Jeff King
2019-05-17  4:42                     ` Jeff King
2019-05-17  7:20                     ` Duy Nguyen
2019-05-17  8:55                       ` Jeff King
2019-05-18 11:39                         ` Duy Nguyen
2019-05-20 23:04                           ` Nicolas Pitre
2019-05-21 21:20                             ` Jeff King
2019-06-03 22:23   ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190516214257.GD10787@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).