git@vger.kernel.org mailing list mirror (one of many)
From: Jonathan Nieder <jrnieder@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Shawn Pearce <spearce@spearce.org>, git <git@vger.kernel.org>
Subject: Re: RFC: Resumable clone based on hybrid "smart" and "dumb" HTTP
Date: Wed, 10 Feb 2016 14:17:58 -0800
Message-ID: <20160210221758.GC10155@google.com>
In-Reply-To: <20160210214945.GA5853@sigill.intra.peff.net>

Jeff King wrote:
> On Wed, Feb 10, 2016 at 12:11:46PM -0800, Shawn Pearce wrote:

>> Several of us at $DAY_JOB talked about this more today and thought a
>> variation makes more sense:
>>
>> 1. Clients attempting clone ask for /info/refs?service=git-upload-pack
>> like they do today.
>>
>> 2. Servers that support resumable clone include a "resumable"
>> capability in the advertisement.
>
> Because the magic happens in the git protocol, that would mean this does
> not have to be limited to git-over-http. It could be "resumable=<url>"
> to point the client anywhere (the same server over a different protocol,
> another server, etc).

Thanks for bringing this up.  A worry with putting the URL in the
capabilities line is that it makes it easy to run into the 1000-byte
limit.  It's been a while since v1.8.3-rc0~148^2~6 (pkt-line: provide
a LARGE_PACKET_MAX static buffer, 2013-02-20) but we still can't
rely on clients having that applied.

(I also haven't checked whether current versions of git are able to
handle longer capability strings with that patch applied.)
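
To make the size concern concrete, here is a minimal sketch (not git's
own code) that frames the first advertisement line as a pkt-line and
checks it against the 1000-byte figure mentioned above.  The
"resumable=<url>" capability is only the proposal under discussion,
the helper names and the example URL are made up for illustration,
and whether a given old client counts the 4-byte length header toward
its limit is an assumption here.

```python
LEGACY_PKT_LIMIT = 1000   # figure from the message; exact accounting is assumed
LARGE_PACKET_MAX = 65520  # largest pkt-line, including its 4-byte length header


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then payload."""
    total = len(payload) + 4  # the length field counts itself
    if total > LARGE_PACKET_MAX:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % total + payload


def first_advert_line(oid_hex: str, refname: str, capabilities: list) -> bytes:
    """Build the first ref advertisement line, which carries the capabilities."""
    caps = " ".join(capabilities)
    return pkt_line(f"{oid_hex} {refname}\0{caps}\n".encode())


def fits_legacy_client(line: bytes) -> bool:
    """True if the framed line stays within the assumed legacy buffer size."""
    return len(line) <= LEGACY_PKT_LIMIT


if __name__ == "__main__":
    oid = "0" * 40
    # Hypothetical signed URL; long query strings are what push the line over.
    url = "https://cdn.example.com/repo.pack?sig=" + "a" * 1200
    for caps in (["multi_ack", "resumable"], ["multi_ack", "resumable=" + url]):
        line = first_advert_line(oid, "HEAD", caps)
        print(len(line), fits_legacy_client(line))
```

A bare "resumable" capability keeps the line short, while a long signed
URL alone can push it past the limit.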

Another nice thing about using a 302 is that you can set cookies
during the redirect, which might make authenticated access easier.
(That said, authenticated access through e.g. signed URLs can work
fine without that.)

[...]
> Clients do not have to _just_ fetch a packfile. They could get a bundle
> file that contains the roots along with the packfile. I know that one of
> your goals is not duplicating the storage of the packfile on the server,
> but it would not be hard for the server to store the packfile and the
> bundle header separately, and concatenate them on the fly.

Doesn't that prevent using a git-unaware file transfer service to
serve the files?

It also means the client can't use the downloaded file as-is --- they
need to separate the root list from the packfile (that's not a big
deal; just some added complication to switch files at the appropriate
moment during download if you want to avoid temporarily using twice
the space).
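
The "switch files at the appropriate moment" step could look roughly
like the sketch below.  It assumes a v2-style bundle layout (a
signature line, then prerequisite and ref lines, then one empty line,
then the pack data) and a download that arrives as arbitrary byte
chunks.  The function name and the two output streams are
hypothetical, not an existing git interface.

```python
import io


def split_bundle_stream(chunks, header_out, pack_out):
    """Write the bundle header to header_out and the pack data to pack_out.

    Only the header is buffered in memory; once the empty line that ends
    it has been seen, every later chunk goes straight to pack_out.
    """
    pending = b""
    in_pack = False
    for chunk in chunks:
        if in_pack:
            pack_out.write(chunk)
            continue
        pending += chunk
        end = pending.find(b"\n\n")  # last header line's newline + empty line
        if end == -1:
            continue  # terminator not seen yet (it may straddle two chunks)
        header_out.write(pending[:end + 1])  # keep the last line's newline
        pack_out.write(pending[end + 2:])    # skip the empty separator line
        pending = b""
        in_pack = True
    if not in_pack:
        raise ValueError("stream ended before the end of the bundle header")


if __name__ == "__main__":
    header = b"# v2 git bundle\n" + b"1" * 40 + b" refs/heads/master\n"
    pack = b"PACK\x00\x00\x00\x02..."  # placeholder bytes, not a real pack
    data = header + b"\n" + pack
    chunks = [data[i:i + 7] for i in range(0, len(data), 7)]
    h, p = io.BytesIO(), io.BytesIO()
    split_bundle_stream(chunks, h, p)
    print(h.getvalue() == header, p.getvalue() == pack)
```

Because the pack bytes are written directly to their final destination,
the client never holds two full copies of the download.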

That said, both these problems are avoided by serving the 'split
bundle' you described as-is instead of concatenating.

[...]
> And you'll notice, too, that all of the bundle-http magic kicks in
> during step 2 because the client sees they're grabbing a bundle. Which
> means that the <url> in step 1 doesn't _have_ to be a bundle. It can be
> "go fetch from kernel.org, then come back to me".

I think that use case brings in complications that make it not
necessarily worth it.  In this example, if kernel.org is serving pack
files, why shouldn't I point directly at the advertised pack CDN URL
instead of adding an extra hop that puts added load on kernel.org
servers?

Allowing an arbitrary "fetch from here first" capability is very
flexible.  I guess my fear comes from not knowing what the flexibility
buys beyond aesthetics.  (My motivation comes from the example of
alternates: it is pretty and very flexible and ended up as a support
and maintenance headache instead of being widely useful.  I think what
you are proposing is more harmless but I'd still want to have an
example of what it's used for before going in that direction.)

Thanks,
Jonathan
