git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: peff@peff.net, git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH v2 5/8] Documentation: add Packfile URIs design doc
Date: Wed, 24 Apr 2019 00:18:55 +0200
Message-ID: <871s1stlf4.fsf@evledraar.gmail.com>
In-Reply-To: <20190423203801.29156-1-jonathantanmy@google.com>


On Tue, Apr 23 2019, Jonathan Tan wrote:

>> The problem I see is that the client doesn't get to vet the list of
>> URIs; it only gets to specify a protocol match. But there are many other
>> reasons it might want to reject a URI: we don't like the protocol, the
>> domain name is on a blacklist (or not on a whitelist), the domain name
>> can't resolve, we can't make a TCP connection to the server, we can't
>> successfully fetch the pack.
>>
>> You'll note that those rise in complexity and time as you go down the
>> list. I'm not sure where on that spectrum we'd want our clients to stop
>> vetting (and it may even depend on config). But I think we ought to
>> design the protocol to put the decision in the hands of the client so
>> that it _can_ make those choices itself.
>>
>> I.e., I think the conversation ought to be more like:
>>
>>   Server: I support packfile-uris X, Y, Z.
>>
>>   Client: Great. I'll use URIs X and Z.
>>
>>   Server: OK, here's your pack, minus any objects I know are in X and Z.
>>           I'll send you the objects from Y as normal.
>>
>> And then the client is free to pick and choose.
>
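
The cheap end of the vetting spectrum described above (protocol and
domain checks, before any DNS lookup or connection attempt) can be
sketched as follows. This is only an illustration under stated
assumptions: vet_uris, its parameters, and the example URLs are
hypothetical names, not part of Git or of the proposed protocol.

```python
from urllib.parse import urlsplit


def vet_uris(offered, allowed_schemes=frozenset({"https"}), blocked_hosts=frozenset()):
    """Return the offered URIs that pass the client's cheap, offline checks.

    Hypothetical helper: only the scheme and host name are inspected.
    Resolving the name, connecting, and fetching the pack are the more
    expensive steps further down the list and are not attempted here.
    """
    accepted = []
    for uri in offered:
        parts = urlsplit(uri)
        host = (parts.hostname or "").lower()
        if parts.scheme not in allowed_schemes:
            continue  # we don't like the protocol
        if not host or host in blocked_hosts:
            continue  # missing host, or the domain is on the block list
        accepted.append(uri)
    return accepted


# Example URIs are made up for illustration.
offered = [
    "https://cdn.example.com/pack-x.pack",
    "http://cdn.example.com/pack-y.pack",
    "https://blocked.example.net/pack-z.pack",
]
chosen = vet_uris(offered, blocked_hosts={"blocked.example.net"})
```

Under the exchange quoted above, the client would then report only the
URIs in chosen, and the server would send the remaining objects inline.
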
> One drawback I see is that the server needs to compute objects to be
> sent twice - once to generate the URIs and once after the client has
> informed the server which URIs it wants.
>
> If we expect some packfile-uris to not be usable sometimes (for any of
> the reasons you listed), this would be nice. The protocol in my patches
> supports a rudimentary version of this (if any of the URIs don't work for
> any reason, just fetch again without advertising that we support
> packfile URIs) but the packfile received during the first fetch is
> wasted.

This is really orthogonal to this series, but wouldn't a better
resumption strategy here be to walk the pack we just downloaded, run the
equivalent of 'commit-graph write' on it to figure out likely "tip"
commits, and use those in "have" lines to negotiate with the server the
next time around?
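
A minimal sketch of the "likely tips" idea, assuming the commits in the
downloaded pack have already been parsed into a mapping from commit ID
to parent IDs. The function names and the toy object IDs are
hypothetical; this is not how 'commit-graph write' or fetch negotiation
is implemented in Git, only the set logic behind the suggestion:

```python
from typing import Dict, List


def likely_tips(parents_of: Dict[str, List[str]]) -> List[str]:
    """Return commits in the pack that no other commit in the pack lists as a parent."""
    referenced = {parent for parents in parents_of.values() for parent in parents}
    return sorted(oid for oid in parents_of if oid not in referenced)


def have_lines(tips: List[str]) -> List[str]:
    """Format the tips as "have" lines for the next negotiation round."""
    return [f"have {oid}" for oid in tips]


# Toy history: c1 <- c2 <- c3, plus a side branch c1 <- b2.
pack_commits = {
    "c3": ["c2"],
    "c2": ["c1"],
    "c1": [],
    "b2": ["c1"],
}
tips = likely_tips(pack_commits)
lines = have_lines(tips)
```

Here c3 and b2 are the only commits that nothing else in the pack points
to, so they are the ones worth advertising.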

I've sometimes wished we optionally had that sort of fetch algorithm. In
particular, setting "alternates" on a freshly init-ed repo currently
still does the full initial fetch, even though most or all of the
objects are already available through the alternates.

> So the tradeoff is: in the good case, your suggestion means that we make
> another fetch request, increasing the load on the server and taking more
> time. In the bad case, your suggestion means that we avoid sending a
> useless packfile upon the first fetch, and if the server is smart
> enough, even the second packfile will be smaller. It depends on how
> often we think the bad case occurs, if servers will typically send more
> than one packfile-uri, and how smart we think servers will typically be.
>
>> The initial server uri
>> list can come in the capabilities list, or it can be a separate request
>> once the client sees the server supports packfile-uris and wants to ask
>> about them.
>
> I don't think this is possible - the URI list is dependent on the wants
> and haves.
>
>> We may need some way for the server to group the uris so
>> that the client knows which ones are alternates of each other (and which
>> ones are needed to make a complete set).
>
> My initial design didn't have this feature (in fact, it seems to me that
> each thing should have one canonical URL, which means that there is no
> need for an alternate) - do you think we should be thinking about this
> at this stage?
