git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] fetch-pack: write fetched refs to .promisor
Date: Wed, 2 Oct 2019 12:03:52 -0400	[thread overview]
Message-ID: <20191002160351.GE6116@sigill.intra.peff.net> (raw)
In-Reply-To: <20190905183926.137490-1-jonathantanmy@google.com>

On Thu, Sep 05, 2019 at 11:39:26AM -0700, Jonathan Tan wrote:

> > I'm not really opposed to what you're doing here, but I did recently
> > think of another possible use for .promisor files. So it seems like a
> > good time to bring it up, since presumably we'd have to choose one or
> > the other.
> 
> Thanks for bringing it up - yes, we should discuss this.

Sorry for starting a discussion and then abandoning it. :)

> > I know one of the original design features of the promisor pack was that
> > the client would _not_ keep a list of all of the objects it didn't have.
> > But I wonder if it would make sense to keep a cache of these "cut
> > points" in the partial clone. That's potentially smaller than the
> > complete set of objects (especially for tree-based partial cloning), and
> > it seems clear we're willing to store it in memory anyway.
> 
> Well, before the current design was implemented, I had a design that had
> such a list of missing objects. :-) I couldn't find a writeup, but here
> is some preliminary code [1]. In that code, as far as I can tell, the
> server gives us the list directly during fetch and the client merges it
> with a repository-wide file called $GIT_DIR/objects/promisedblob, but we
> don't have to follow the design (we could lazily generate the file, have
> per-packfile promisedblob files, etc.).
> 
> [1] https://public-inbox.org/git/cover.1499800530.git.jonathantanmy@google.com/

This was also a feature of my very old "external odb" patches. In fact,
there the server told the client the type and size, which let us easily
know that many objects weren't worth fetching (e.g., a large blob would
be marked as binary and skipped for diffing, without the diff code even
having to care about "is it a promisor object that we don't have?").

For just omitting some blobs, I think that carrying extra information
(either just the set of blobs, or even some basic meta-information) is
valuable. But for "narrow" or "cone" clones, the current system of
implied promisors is pretty nice, because then the client effort
literally does (or could) scale with the size of their cone, independent
of the number of objects outside their cone.

> > And if we do that, would the .promisor file for a pack be a good place
> > to store it?
> 
> After looking at [1], it might be better in another place. If we want to
> preserve fast fetches, we still need another file to indicate that the
> pack is a promisor, so ".promisor" seems good for that. The presence or
> absence of the cutoff points is a separate issue and could go into a
> separate file, and it might be worth putting all cutoff points into a
> single per-repository file too.

OK, that makes sense. Thanks for giving it some thought.

-Peff

  reply	other threads:[~2019-10-02 16:03 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-26 21:47 [PATCH] fetch-pack: write fetched refs to .promisor Jonathan Tan
2019-08-27 20:27 ` Junio C Hamano
2019-08-27 21:50   ` Jonathan Tan
2019-09-05  7:01 ` Jeff King
2019-09-05 17:13   ` Junio C Hamano
2019-09-05 17:59     ` Jeff King
2019-09-05 18:39   ` Jonathan Tan
2019-10-02 16:03     ` Jeff King [this message]
2019-10-14 22:27 ` Josh Steadmon
2019-10-14 23:56   ` Jonathan Tan
2019-10-15  0:12 ` [PATCH v2] " Jonathan Tan
2019-10-15 19:40   ` Josh Steadmon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191002160351.GE6116@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).