git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Christian Couder <christian.couder@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "Jonathan Tan" <jonathantanmy@google.com>,
	git <git@vger.kernel.org>, "Junio C Hamano" <gitster@pobox.com>,
	"Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [WIP 0/7] CDN offloading of fetch response
Date: Mon, 4 Mar 2019 09:54:07 +0100	[thread overview]
Message-ID: <CAP8UFD3SWNu=btPxV2vV=neYrofbgKPzz_WLvsJbv6bKjRoCpQ@mail.gmail.com> (raw)
In-Reply-To: <20190228232108.GA163714@google.com>

Hi,

On Fri, Mar 1, 2019 at 12:21 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Sorry for the slow followup.  Thanks for probing into the design ---
> this should be useful for getting the docs to be clear.
>
> Christian Couder wrote:
>
> > So it's likely that users will want a way to host on such sites
> > incomplete repos using CDN offloading to a CDN on another site. And
> > then if the CDN is not accessible for some reason, things will
> > completely break when users will clone.
>
> I think this would be a broken setup --- we can make it clear in the
> protocol and server docs that you should only point to a CDN for which
> you control the contents, to avoid breaking clients.

We can say whatever in the docs, but in real life if it's
simpler/cheaper for repo admins to use a CDN for example on Google and
a repo on GitHub, they are likely to do it anyway.

> That doesn't prevent adding additional features in the future e.g. for
> "server suggested alternates" --- it's just that I consider that a
> separate feature.
>
> Using CDN offloading requires cooperation of the hosting provider.
> It's a way to optimize how fetches work, not a way to have a partial
> repository on the server side.

We can say whatever we want about what it is for. Users are likely to
use it anyway in the way they think it will benefit them the most.

> > On Tue, Feb 26, 2019 at 12:45 AM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> >> This doesn't stop a hosting provider from using e.g. server options to
> >> allow the client more control over how their response is served, just
> >> like can be done for other features of how the transfer works (how
> >> often to send progress updates, whether to prioritize latency or
> >> throughput, etc).
> >
> > Could you give a more concrete example of what could be done?
>
> What I mean is passing server options using "git fetch --server-option".
> For example:
>
>         git fetch -o priority=BATCH origin master
>
> or
>
>         git fetch -o avoid-cdn=badcdn.example.com origin master
>
> The interpretation of server options is up to the server.

If you often have to tell things like "-o
avoid-cdn=badcdn.example.com", then how is it better than just
specifying "-o usecdn=goodcdn.example.com" or even better using the
remote mechanism to configure a remote for goodcdn.example.com and
then configuring this remote to be used along the origin remote (which
is what many promisor remotes is about)?

> >> What the client *can* do is turn off support for packfile URLs in a
> >> request completely.  This is required for backward compatibility and
> >> allows working around a host that has configured the feature
> >> incorrectly.
> >
> > If the full content of a repo is really large, the size of a full pack
> > file sent by an initial clone could be really big and many client
> > machines could not have enough memory to deal with that. And this
> > suppose that repo hosting providers would be ok to host very large
> > repos in the first place.
>
> Do we require the packfile to fit in memory?  If so, we should fix
> that (to use streaming instead).

Even if we stream the packfile to write it, at one point we have to use it.

And I could be wrong but I think that mmap doesn't work on Windows, so
I think we will just try to read the whole thing into memory. Even on
Linux I don't think it's a good idea to mmap a very large file and
then use some big parts of it which I think we will have to do when
checking out the large files from inside the packfile.

Yeah, we can improve that part of Git too. I think though that it
means yet another thing (and not an easy one) that needs to be
improved before CDN offloading can work well in the real world.

I think that the Git "development philosophy" since the beginning has
been more about adding things that work well in the real world first
even if they are small and a bit manual, and then improving on top of
those early things, rather than adding a big thing that doesn't quite
work well in the real world but is automated and then improving on
that.

  reply	other threads:[~2019-03-04  8:54 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-23 23:38 [WIP 0/7] CDN offloading of fetch response Jonathan Tan
2019-02-23 23:38 ` [WIP 1/7] http: use --stdin and --keep when downloading pack Jonathan Tan
2019-02-23 23:38 ` [WIP 2/7] http: improve documentation of http_pack_request Jonathan Tan
2019-02-23 23:38 ` [WIP 3/7] http-fetch: support fetching packfiles by URL Jonathan Tan
2019-02-23 23:38 ` [WIP 4/7] Documentation: order protocol v2 sections Jonathan Tan
2019-02-23 23:38 ` [WIP 5/7] Documentation: add Packfile URIs design doc Jonathan Tan
2019-02-23 23:39 ` [WIP 6/7] upload-pack: refactor reading of pack-objects out Jonathan Tan
2019-02-23 23:39 ` [WIP 7/7] upload-pack: send part of packfile response as uri Jonathan Tan
2019-02-24 15:54   ` Junio C Hamano
2019-02-25 21:04   ` Christian Couder
2019-02-26  1:53     ` Jonathan Nieder
2019-02-26  7:08       ` Christian Couder
2019-03-01  0:09   ` Josh Steadmon
2019-03-01  0:17     ` Jonathan Tan
2019-02-25 21:30 ` [WIP 0/7] CDN offloading of fetch response Christian Couder
2019-02-25 23:45   ` Jonathan Nieder
2019-02-26  8:30     ` Christian Couder
2019-02-26  9:12       ` Ævar Arnfjörð Bjarmason
2019-03-04  8:24         ` Christian Couder
2019-02-28 23:21       ` Jonathan Nieder
2019-03-04  8:54         ` Christian Couder [this message]
2019-03-08 21:55 ` [PATCH v2 0/8] " Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 1/8] http: use --stdin when getting dumb HTTP pack Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 2/8] http: improve documentation of http_pack_request Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 3/8] http-fetch: support fetching packfiles by URL Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 4/8] Documentation: order protocol v2 sections Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 5/8] Documentation: add Packfile URIs design doc Jonathan Tan
2019-04-23  5:31     ` Jeff King
2019-04-23 20:38       ` Jonathan Tan
2019-04-23 22:18         ` Ævar Arnfjörð Bjarmason
2019-04-23 22:22           ` Jonathan Nieder
2019-04-23 22:30             ` Ævar Arnfjörð Bjarmason
2019-04-23 22:51               ` Jonathan Nieder
2019-04-23 22:11       ` Jonathan Nieder
2019-04-23 22:25         ` Ævar Arnfjörð Bjarmason
2019-04-23 22:48           ` Jonathan Nieder
2019-04-24  7:48             ` Ævar Arnfjörð Bjarmason
2019-04-24  3:01       ` Junio C Hamano
2019-03-08 21:55   ` [PATCH v2 6/8] upload-pack: refactor reading of pack-objects out Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 7/8] fetch-pack: support more than one pack lockfile Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 8/8] upload-pack: send part of packfile response as uri Jonathan Tan
2019-03-19 20:48   ` [PATCH v2 0/8] CDN offloading of fetch response Josh Steadmon
2019-04-23  5:21   ` Jeff King
2019-04-23 19:23     ` Jonathan Tan
2019-04-24  9:09   ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP8UFD3SWNu=btPxV2vV=neYrofbgKPzz_WLvsJbv6bKjRoCpQ@mail.gmail.com' \
    --to=christian.couder@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).