git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [WIP 0/7] CDN offloading of fetch response
@ 2019-02-23 23:38 Jonathan Tan
  2019-02-23 23:38 ` [WIP 1/7] http: use --stdin and --keep when downloading pack Jonathan Tan
                   ` (8 more replies)
  0 siblings, 9 replies; 45+ messages in thread
From: Jonathan Tan @ 2019-02-23 23:38 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff, christian.couder, avarab

It's been a while, so here is an updated version of what I previously
sent [1]. The main difference is that fetch-pack now actually downloads
whatever the server tells it to. The second main difference is that
we no longer buffer progress messages and suspend keepalives - we
no longer need to, because the sideband-all patches have been merged.

I think I've address all the technical comments in [1], except one
comment of Junio's [2] that I still don't understand:

> And the "fix" I was alluding to was to update that "else" clause to
> make it a no-op that keeps os->used non-zero, which would not call
> send-client-data.
>
> When that fix happens, the part that early in the function this
> patch added "now we know we will call send-client-data, so let's say
> 'here comes packdata' unless we have already said that" is making
> the decision too early.  Depending on the value of os->used when we
> enter the code and the number of bytes xread() reads from the
> upstream, we might not call send-client-data yet (namely, when we
> have no buffered data and we happened to get only one byte).

With or without this fix, I don't think there is ever a time when we
say "here comes packdata" without calling send-client-data - we say
"here comes packdata" only when we see the string "PACK", which forms
part of the packfile, and thus we should have no problem sending any
client data. Having said that, maybe this is a moot point - Junio says
that this only happens when the fix is implemented, and the fix is not
implemented.

There are probably some more design discussions to be had:

 - Requirement that all pointed-to CDNs support byte ranges for
   resumability, and to guarantee that the packfiles will be there
   permanently. After some thought, it is a good idea for CDNs to
   support that, but I think that we should support CDNs that can only
   give temporal guarantees (e.g. if/when we start implementing
   resumption, we could read the Cache headers). I didn't add any
   mention of this issue in the documentation.

 - Client-side whitelist of protocol and hostnames. I've implemented
   whitelist of protocol, but not hostnames.

 - Any sort of follow-up fetch - for example, if the download from the
   CDN fails or if we allow the server to tell us of best-effort
   packfiles (but the client still must check and formulate the correct
   request to the server to fetch the rest). This protocol seems like a
   prerequisite to all those, and is independently useful, so maybe all
   of those can be future work.

Please take a look. Feel free to comment on anything, but I prefer
comments on the major things first (e.g. my usage of a separate process
(http-fetch) to fetch packfiles, since as far as I know, Git doesn't
link to libcurl; any of the design decisions I described above). I know
that there are some implementation details that could be improved (e.g.
parallelization of the CDN downloads, starting CDN downloads *after*
closing the first HTTP request, holding on to the .keep locks until
after the refs are set), but will work on those once the overall design
is more or less finalized.

Note that the first patch is exactly the same as one I've previously
sent [3].

[1] https://public-inbox.org/git/cover.1543879256.git.jonathantanmy@google.com/
[2] https://public-inbox.org/git/xmqqmupi89ub.fsf@gitster-ct.c.googlers.com/
[3] https://public-inbox.org/git/20190221001447.124088-1-jonathantanmy@google.com/

Jonathan Tan (7):
  http: use --stdin and --keep when downloading pack
  http: improve documentation of http_pack_request
  http-fetch: support fetching packfiles by URL
  Documentation: order protocol v2 sections
  Documentation: add Packfile URIs design doc
  upload-pack: refactor reading of pack-objects out
  upload-pack: send part of packfile response as uri

 Documentation/git-http-fetch.txt         |   7 +-
 Documentation/technical/packfile-uri.txt |  79 ++++++++++++
 Documentation/technical/protocol-v2.txt  |  22 ++--
 builtin/pack-objects.c                   |  63 +++++++++
 fetch-pack.c                             |  58 +++++++++
 http-fetch.c                             |  65 ++++++++--
 http-push.c                              |   7 +-
 http-walker.c                            |   5 +-
 http.c                                   |  83 +++++++-----
 http.h                                   |  26 +++-
 t/t5550-http-fetch-dumb.sh               |  18 +++
 t/t5702-protocol-v2.sh                   |  54 ++++++++
 upload-pack.c                            | 155 +++++++++++++++++------
 13 files changed, 542 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

-- 
2.19.0.271.gfe8321ec05.dirty


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2019-04-24  9:09 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-23 23:38 [WIP 0/7] CDN offloading of fetch response Jonathan Tan
2019-02-23 23:38 ` [WIP 1/7] http: use --stdin and --keep when downloading pack Jonathan Tan
2019-02-23 23:38 ` [WIP 2/7] http: improve documentation of http_pack_request Jonathan Tan
2019-02-23 23:38 ` [WIP 3/7] http-fetch: support fetching packfiles by URL Jonathan Tan
2019-02-23 23:38 ` [WIP 4/7] Documentation: order protocol v2 sections Jonathan Tan
2019-02-23 23:38 ` [WIP 5/7] Documentation: add Packfile URIs design doc Jonathan Tan
2019-02-23 23:39 ` [WIP 6/7] upload-pack: refactor reading of pack-objects out Jonathan Tan
2019-02-23 23:39 ` [WIP 7/7] upload-pack: send part of packfile response as uri Jonathan Tan
2019-02-24 15:54   ` Junio C Hamano
2019-02-25 21:04   ` Christian Couder
2019-02-26  1:53     ` Jonathan Nieder
2019-02-26  7:08       ` Christian Couder
2019-03-01  0:09   ` Josh Steadmon
2019-03-01  0:17     ` Jonathan Tan
2019-02-25 21:30 ` [WIP 0/7] CDN offloading of fetch response Christian Couder
2019-02-25 23:45   ` Jonathan Nieder
2019-02-26  8:30     ` Christian Couder
2019-02-26  9:12       ` Ævar Arnfjörð Bjarmason
2019-03-04  8:24         ` Christian Couder
2019-02-28 23:21       ` Jonathan Nieder
2019-03-04  8:54         ` Christian Couder
2019-03-08 21:55 ` [PATCH v2 0/8] " Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 1/8] http: use --stdin when getting dumb HTTP pack Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 2/8] http: improve documentation of http_pack_request Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 3/8] http-fetch: support fetching packfiles by URL Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 4/8] Documentation: order protocol v2 sections Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 5/8] Documentation: add Packfile URIs design doc Jonathan Tan
2019-04-23  5:31     ` Jeff King
2019-04-23 20:38       ` Jonathan Tan
2019-04-23 22:18         ` Ævar Arnfjörð Bjarmason
2019-04-23 22:22           ` Jonathan Nieder
2019-04-23 22:30             ` Ævar Arnfjörð Bjarmason
2019-04-23 22:51               ` Jonathan Nieder
2019-04-23 22:11       ` Jonathan Nieder
2019-04-23 22:25         ` Ævar Arnfjörð Bjarmason
2019-04-23 22:48           ` Jonathan Nieder
2019-04-24  7:48             ` Ævar Arnfjörð Bjarmason
2019-04-24  3:01       ` Junio C Hamano
2019-03-08 21:55   ` [PATCH v2 6/8] upload-pack: refactor reading of pack-objects out Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 7/8] fetch-pack: support more than one pack lockfile Jonathan Tan
2019-03-08 21:55   ` [PATCH v2 8/8] upload-pack: send part of packfile response as uri Jonathan Tan
2019-03-19 20:48   ` [PATCH v2 0/8] CDN offloading of fetch response Josh Steadmon
2019-04-23  5:21   ` Jeff King
2019-04-23 19:23     ` Jonathan Tan
2019-04-24  9:09   ` Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).