RE: Proposal for "fetch-any-blob Git protocol" and server design

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Kevin David <Kevin.David@microsoft.com>
To: Jonathan Tan <jonathantanmy@google.com>,
	Ben Peart <peartben@gmail.com>,
	Stefan Beller <sbeller@google.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	Mark Thomas <markbt@efaref.net>,
	Jeff Hostetler <git@jeffhostetler.com>
Subject: RE: Proposal for "fetch-any-blob Git protocol" and server design
Date: Wed, 12 Apr 2017 22:02:31 +0000	[thread overview]
Message-ID: <BY2PR21MB0052067C2CDC6F727CCB6969EE030@BY2PR21MB0052.namprd21.prod.outlook.com> (raw)
In-Reply-To: <00bf01d2aed7$b13492a0$139db7e0$@gmail.com>

Hi Jonathan, 

I work on the network protocols for the GVFS project at Microsoft.
I shared a couple thoughts and questions below.

Thanks,

Kevin

-----Original Message-----
From: Stefan Beller [mailto:sbeller@google.com]
Sent: Tuesday, March 28, 2017 7:20 PM
To: Jonathan Tan <jonathantanmy@google.com>; Ben Peart <peartben@gmail.com>
Cc: git@vger.kernel.org; Mark Thomas <markbt@efaref.net>; Jeff Hostetler <git@jeffhostetler.com>
Subject: Re: Proposal for "fetch-any-blob Git protocol" and server design

+cc Ben Peart, who sent
"[RFC] Add support for downloading blobs on demand" to the list recently.
This proposal here seems like it has the same goal, so maybe your review could go a long way here?

Thanks,
Stefan

On Tue, Mar 14, 2017 at 3:57 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> As described in "Background" below, there have been at least 2 patch 
> sets to support "partial clones" and on-demand blob fetches, where the 
> server part that supports on-demand blob fetches was treated at least 
> in outline. Here is a proposal treating that server part in detail.
>
> == Background
>
> The desire for Git to support (i) missing blobs and (ii) fetching them 
> as needed from a remote repository has surfaced on the mailing list a 
> few times, most recently in the form of RFC patch sets [1] [2].
>
> A local repository that supports (i) will be created by a "partial 
> clone", that is, a clone with some special parameters (exact 
> parameters are still being discussed) that does not download all blobs 
> normally downloaded. Such a repository should support (ii), which is what this proposal describes.
>
> == Design
>
> A new endpoint "server" is created. The client will send a message in 
> the following format:
>
> ----
> fbp-request = PKT-LINE("fetch-blob-pack")
>               1*want
>               flush-pkt
> want = PKT-LINE("want" SP obj-id)
> ----
>
> The client may send one or more SHA-1s for which it wants blobs, then 
> a flush-pkt.

I know we're considering server behavior here, but how large do you generally
expect these blob-want requests to be? I ask because we took an initial approach
very similar to this, however, we had a hard time being clever about figuring out
what set of blobs to request for those clients that didn't want the entire set, and
ended up falling back to single-blob requests. 

Obviously, this could be due to thenature of our filesystem-virtualization-based client, 
but I also suspect that the teams attacking this problem are more often than not dealing
with very large blob objects, so the cost of a round-trip becomes lower relative to sending
the object content itself.

>
> The server will then reply:
>
> ----
> server-reply = flush-pkt | PKT-LINE("ERR" SP message)
> ----
>
> If there was no error, the server will then send them in a packfile, 
> formatted like described in "Packfile Data" in pack-protocol.txt with 
> "side-band-64k" enabled.
> Any server that supports "partial clone" will also support this, and 
> the client will automatically assume this. (How a client discovers 
> "partial clone" is not covered by this proposal.)

Along the same lines as above, this is where we started and it worked well for
low-volume requests. However, when we started ramping up the load,
`pack-objects` operating on a very large packed repository (~150 GiB) became
very computationally expensive, even with `--compression=1 --depth=0 --window=0`.

Being a bit more clever about packing objects (e.g. splitting blobs out from commits
and trees) improved this a bit, but we still hit a bottlenecks from what appeared to
be a large number of memory-mapping operations on a ~140GiB packfile of blobs.

Each `pack-objects` process would consume approximately one CPU core for the
duration of the request. It's possible that further splitting of these large blob packs
would have improved performance in some scenarios, but that would increase the
amount of pack-index lookups necessary to find a single object.

>
> The server will perform reachability checks on requested blobs through 
> the equivalent of "git rev-list --use-bitmap-index" (like "git 
> upload-pack" when using the allowreachablesha1inwant option), unless 
> configured to suppress reachability checks through a config option.
> The server administrator is highly recommended to regularly regenerate 
> the bitmap (or suppress reachability checks).
>
> === Endpoint support for forward compatibility
>
> This "server" endpoint requires that the first line be understood, but 
> will ignore any other lines starting with words that it does not 
> understand. This allows new "commands" to be added (distinguished by 
> their first lines) and existing commands to be "upgraded" with backwards compatibility.

This seems like a clever way to avoid the canonical `/info/refs?service=git-upload-pack` 
capability negotiation on every call. However, using error handling to fallback seems
slightly wonky to me. Hopefully users are incentivized to upgrade their clients.

> === Related improvements possible with new endpoint
>
> Previous protocol upgrade suggestions have had to face the difficulty 
> of allowing updated clients to discover the server support while not 
> slowing down (for example, through extra network round-trips) any 
> client, whether non-updated or updated. The introduction of "partial 
> clone" allows clients to rely on the guarantee that any server that supports "partial clone"
> supports "fetch-blob-pack", and we can extend the guarantee to other 
> protocol upgrades that such repos would want.
>
> One such upgrade is "ref-in-want" [3]. The full details can be 
> obtained from that email thread, but to summarize, the patch set 
> eliminates the need for the initial ref advertisement and allows 
> communication in ref name globs, making it much easier for multiple 
> load-balanced servers to serve large repos to clients - this is 
> something that would greatly benefit the Android project, for example, and possibly many others.
>
> Bundling support for "ref-in-want" with "fetch-blob-pack" simplifies 
> matters for the client in that a client needs to only handle one 
> "version" of server (a server that supports both). If "ref-in-want"
> were added later, instead of now, clients would need to be able to 
> handle two "versions" (one with only "fetch-blob-pack" and one with both "fetch-blob-pack" and "ref-in-want").
>
> As for its implementation, that email thread already contains a patch 
> set that makes it work with the existing "upload-pack" endpoint; I can 
> update that patch set to use the proposed "server" endpoint (with a 
> "fetch-commit-pack" message) if need be.
>
> == Client behavior
>
> This proposal is concerned with server behavior only, but it is useful 
> to envision how the client would use this to ensure that the server 
> behavior is useful.
>
> === Indication to use the proposed endpoint
>
> The client will probably already record that at least one of its 
> remotes (the one that it successfully performed a "partial clone"
> from) supports this new endpoint (if not, it can’t determine whether a 
> missing blob was caused by repo corruption or by the "partial clone").
> This knowledge can be used both to know that the server supports 
> "fetch-blob-pack" and "fetch-commit-pack" (for the latter, the client 
> can fall back to "fetch-pack"/"upload-pack" when fetching from other servers).

This makes a lot of sense to me. When we built our caching proxy, we had to be careful
when designing how we'd handle clients requesting objects missing from the proxy. 

For example, a client requests a single blob and the proxy doesn't have it - we can't simply
download that object from the "authoritative" remote and stick it in the `.git\objects\xx\yyy...` 
directory, because the repository would be made corrupt. 

Having a way to specify that the repo is a "partial clone" and allowing "holes" would help a lot,
I believe. I know there have been varying opinions on how these missing objects should be
"marked" and I'm not ready to propose anything there - just agreeing the problem is important.

> === Multiple remotes
>
> Fetches of missing blobs should (at least by default?) go to the 
> remote that sent the tree that points to them. This means that if 
> there are multiple remotes, the client needs to remember which remote 
> it learned about a given missing blob from.

Definitely handy for someone writing a proxy, but some level of error fallback might be good.

> == Alternatives considered
>
> The "fetch-blob-pack" and "fetch-commit-pack" messages could be split 
> into their own endpoints. It seemed more reasonable to combine them 
> together since they serve similar use cases (large repos), and (for
> example) reduces the number of binaries in PATH, but I do not feel strongly about this.

Not do derail us too far off blobs, but I wonder if we need a `fetch-commit-pack` endpoint,
or could get away with introducing a new capability (e.g. `no-blobs`) to `upload-pack` instead.
As a casual observer, this seems like it would be a much smaller change since the rest of the 
negotiation/reachability calculation would look the same, right? Or would this `fetch-commit-pack` 
not return trees either? 

I only ask because, in our observations, when git wants to read commits it's 
usually followed by a lot of "related" trees - again caveated with the fact that 
we're intercepting many things at the filesystem layer.

> The client could supply commit information about the blobs it wants 
> (or other information that could help the reachability analysis).
> However, these lines wouldn’t be used by the proposed server design. 
> And if we do discover that these lines are useful, the protocol could 
> be extended with new lines that contain this information (since old 
> servers will ignore all lines that they do not understand).
>
> We could extend "upload-pack" to allow blobs in "want" lines instead 
> of having a new endpoint. Due to a quirk in the Git implementation 
> (but possibly not other implementations like JGit), this is already 
> supported [4]. However, each invocation would require the server to 
> generate an unnecessary ref list, and would require both the server 
> and the client to undergo more network traffic.
>
> Also, the new "server" endpoint might be made to be discovered through 
> another mechanism (for example, a capability advertisement on another 
> endpoint). It is probably simpler to tie it to the "partial clone"
> feature, though, since they are so likely to be used together.

Just to keep the discussion interesting, I'll throw an alternative out there that's
worked well for us. As I understand it, the HTTP-based dumb transfer protocol
supports returning objects in loose object format, but only if they already exist
in loose format. 

Extending this to have the remote provide these objects via a "dumb" protocol
when they are packed as well - i.e. the server would "loosens" them upon request -
is basically what we do and it works quite well for low-latency clients. To further improve
performance at the cost of complexity, we've added caching at the memory and disk layer
for these loose objects in the same format we send to the client. 

There's a clear tradeoff here - the servers must have adequate disk and/or memory to store
these loose objects in optimal format. In addition, the higher the latency is to the remote,
the worse this solution will perform. Fortunately, in our case, network topology allows us to
put these caching proxies close enough to clients for it not to matter.

> [1] <20170304191901.9622-1-markbt@efaref.net>
> [2] <1488999039-37631-1-git-send-email-git@jeffhostetler.com>
> [3] <cover.1485381677.git.jonathantanmy@google.com>
> [4] <20170309003547.6930-1-jonathantanmy@google.com>

next prev parent reply	other threads:[~2017-04-12 22:02 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-14 22:57 Proposal for "fetch-any-blob Git protocol" and server design Jonathan Tan
2017-03-15 17:59 ` Junio C Hamano
2017-03-16 17:31   ` Jonathan Tan
2017-03-16 21:17     ` Junio C Hamano
2017-03-16 22:48       ` Jonathan Tan
2017-03-28 23:19 ` Stefan Beller
     [not found]   ` <00bf01d2aed7$b13492a0$139db7e0$@gmail.com>
2017-04-12 22:02     ` Kevin David [this message]
2017-04-13 20:12       ` Jonathan Tan
2017-04-21 16:41         ` Kevin David
2017-04-26 22:51           ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BY2PR21MB0052067C2CDC6F727CCB6969EE030@BY2PR21MB0052.namprd21.prod.outlook.com \
    --to=kevin.david@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=markbt@efaref.net \
    --cc=peartben@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).