From: Jonathan Tan <jonathantanmy@google.com>
To: avarab@gmail.com, git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 5/8] Documentation: add Packfile URIs design doc
Date: Fri, 8 Mar 2019 13:55:17 -0800
Message-ID: <5ce56844d3fb740e29d2f3d4be2ade0b2ad5f7fd.1552073690.git.jonathantanmy@google.com>
In-Reply-To: <cover.1552073690.git.jonathantanmy@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
Documentation/technical/packfile-uri.txt | 78 ++++++++++++++++++++++++
Documentation/technical/protocol-v2.txt | 28 ++++++++-
2 files changed, 105 insertions(+), 1 deletion(-)
create mode 100644 Documentation/technical/packfile-uri.txt
diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
index 0000000000..6a5a6440d5
--- /dev/null
+++ b/Documentation/technical/packfile-uri.txt
@@ -0,0 +1,78 @@
+Packfile URIs
+=============
+
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+
+This feature is available only in protocol version 2.
+
+Protocol
+--------
+
+The server advertises `packfile-uris`.
+
+If the client then communicates which protocols (HTTPS, etc.) it supports with
+a `packfile-uris` argument, the server MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing URIs of any of the given protocols. The URIs point to
+packfiles that use only features that the client has declared that it supports
+(e.g. ofs-delta and thin-pack). See protocol-v2.txt for the documentation of
+this section.
+
+Clients should then understand that the returned packfile could be incomplete,
+and that they need to download all the given URIs before the fetch or clone is
+complete.
+
+Server design
+-------------
+
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, a blob
+with the given sha1 can be replaced by the given URI. This allows, for example,
+servers to delegate serving of large blobs to CDNs.
+
+Client design
+-------------
+
+While fetching, the client needs to remember the list of URIs and cannot
+declare that the fetch is complete until all URIs have been downloaded as
+packfiles.
+
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+
+The client can inhibit this feature (i.e. refrain from sending the
+`packfile-uris` argument) by passing `--no-packfile-uris` to `git fetch`.
+
+Future work
+-----------
+
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+
+ * On the server, a long-running process that takes in entire requests and
+   outputs a list of URIs and the corresponding inclusion and exclusion sets of
+   objects. This allows, e.g., signed URIs to be used and packfiles for common
+   requests to be cached.
+ * On the client, resumption of clone. If a clone is interrupted, information
+   could be recorded in the repository's config and a "clone-resume" command
+   can resume the clone in progress. (Resumption of subsequent fetches is more
+   difficult because that must deal with the user wanting to use the repository
+   even after the fetch was interrupted.)
+
+There are some possible features that will require a change in protocol:
+
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
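
For illustration only, and not part of the patch: the "Server design"
section above (one or more `uploadpack.blobPackfileUri=<sha1> <uri>`
entries, with a matching blob replaced by its URI) could be sketched in
Python roughly as follows. The function names, the example.com URI, and
the rule of comparing the URI scheme against the client's protocol list
are assumptions made for this sketch; it does not describe how
upload-pack implements the feature.

```python
from urllib.parse import urlsplit

HEX = set("0123456789abcdef")


def parse_blob_uri_entries(entries):
    """Build a blob-id -> URI table from '<sha1> <uri>' config values.

    `entries` stands in for the values of the (experimental)
    uploadpack.blobPackfileUri setting; how they are read from the
    config is outside this sketch.
    """
    table = {}
    for entry in entries:
        sha1, uri = entry.split(" ", 1)
        if len(sha1) != 40 or not set(sha1) <= HEX:
            raise ValueError(f"not a 40-hex object id: {sha1!r}")
        table[sha1] = uri.strip()
    return table


def plan_response(object_ids, blob_uris, client_protocols):
    """Split the objects to send into (inline object ids, URIs).

    A blob is replaced by its URI only if the client listed the URI's
    scheme in its `packfile-uris` argument (an assumption of this
    sketch); everything else stays in the inline packfile.
    """
    inline, uris = [], []
    for oid in object_ids:
        uri = blob_uris.get(oid)
        if uri is not None and urlsplit(uri).scheme in client_protocols:
            uris.append(uri)
        else:
            inline.append(oid)
    return inline, uris
```

With one configured blob and a client that sent `packfile-uris https`,
`plan_response` would leave that blob out of the inline list and return
its URI; a client that sent no `packfile-uris` argument (an empty
protocol set here) would get every object inline, which matches the
"trivially compatible" server described above.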
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 36239ec7e9..7b63c26ecd 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -323,13 +323,26 @@ included in the client's request:
indicating its sideband (1, 2, or 3), and the server may send "0005\2"
(a PKT-LINE of sideband 2 with no payload) as a keepalive packet.
+If the 'packfile-uris' feature is advertised, the following argument
+can be included in the client's request, and the server's response
+may then include the 'packfile-uris' section as explained
+below.
+
+ packfile-uris <comma-separated list of protocols>
+ Indicates to the server that the client is willing to receive
+ URIs of any of the given protocols in place of objects in the
+ sent packfile. Before performing the connectivity check, the
+ client should download from all given URIs. Currently, the
+ protocols supported are "http" and "https".
+
The response of `fetch` is broken into a number of sections separated by
delimiter packets (0001), with each section beginning with its section
header. Most sections are sent only when the packfile is sent.
output = acknowledgements flush-pkt |
[acknowledgments delim-pkt] [shallow-info delim-pkt]
- [wanted-refs delim-pkt] packfile flush-pkt
+ [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+ packfile flush-pkt
acknowledgments = PKT-LINE("acknowledgments" LF)
(nak | *ack)
@@ -347,6 +360,9 @@ header. Most sections are sent only when the packfile is sent.
*PKT-LINE(wanted-ref LF)
wanted-ref = obj-id SP refname
+ packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+ packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
+
packfile = PKT-LINE("packfile" LF)
*PKT-LINE(%x01-03 *%x00-ff)
@@ -418,6 +434,16 @@ header. Most sections are sent only when the packfile is sent.
* The server MUST NOT send any refs which were not requested
using 'want-ref' lines.
+ packfile-uris section
+ * This section is only included if the client sent
+ 'packfile-uris' and the server has at least one such URI to
+ send.
+
+ * Always begins with the section header "packfile-uris".
+
+ * For each URI the server sends, it sends a hash of the pack's
+ contents (as output by git index-pack) followed by the URI.
+
packfile section
* This section is only included if the client has sent 'want'
lines in its request and either requested that no more
--
2.19.0.271.gfe8321ec05.dirty
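
For illustration only, and not part of the patch: a client could read
the 'packfile-uris' section added to protocol-v2.txt above roughly as
in the following Python sketch. It follows the grammar in the patch (a
"packfile-uris" header pkt-line, then zero or more lines of a 40-hex
pack hash, a space, and the URI, ended by a delim-pkt). The helper
names, the sentinel objects, and the choice to decode the URI as UTF-8
are assumptions of the sketch, not Git's C API.

```python
import re

FLUSH = object()  # stands for flush-pkt "0000"
DELIM = object()  # stands for delim-pkt "0001"

# packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF), with the LF removed
_URI_LINE = re.compile(rb"([0-9a-f]{40}) ([\x20-\xff]*)")


def pkt_line(payload: bytes) -> bytes:
    """Encode one pkt-line: 4 hex digits of total length, then payload."""
    return b"%04x" % (len(payload) + 4) + payload


def read_pkts(data: bytes):
    """Yield payloads (bytes) or the FLUSH/DELIM sentinels from a buffer."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            yield FLUSH
            pos += 4
        elif length == 1:
            yield DELIM
            pos += 4
        elif length < 4 or pos + length > len(data):
            raise ValueError(f"bad pkt-line length {length} at offset {pos}")
        else:
            yield data[pos + 4:pos + length]
            pos += length


def parse_packfile_uris(pkts):
    """Consume one packfile-uris section; return [(pack_hash, uri), ...].

    `pkts` must be positioned at the section header. The section's
    closing delim-pkt is consumed, so the caller can continue with the
    'packfile' section from the same iterator.
    """
    if next(pkts) != b"packfile-uris\n":
        raise ValueError("expected 'packfile-uris' section header")
    uris = []
    for pkt in pkts:
        if pkt is DELIM:
            return uris
        if pkt is FLUSH:
            raise ValueError("flush-pkt inside packfile-uris section")
        line = pkt[:-1] if pkt.endswith(b"\n") else pkt
        match = _URI_LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"malformed packfile-uri line: {pkt!r}")
        uris.append((match.group(1).decode("ascii"),
                     match.group(2).decode("utf-8")))
    raise ValueError("section not terminated by delim-pkt")
```

A client would keep the returned list and, as the design doc requires,
treat the fetch as incomplete until every listed URI has been
downloaded as a packfile, in addition to the inline 'packfile' section.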