From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [WIP RFC 2/5] Documentation: add Packfile URIs design doc
Date: Mon,  3 Dec 2018 15:37:35 -0800
Message-ID: <0461b362569362c6d0e73951469c547a03a1b59d.1543879256.git.jonathantanmy@google.com>
In-Reply-To: <cover.1543879256.git.jonathantanmy@google.com>

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
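Illustration only, not part of the patch: a minimal Python sketch of how the
`packfile-uris` section defined in the protocol-v2.txt hunk below could be
framed and parsed as pkt-lines. The function names and the example.com URI
are hypothetical, URIs are assumed to be UTF-8 text, and none of this
reflects how upload-pack or fetch-pack is actually implemented.

```python
DELIM_PKT = b"0001"  # delim-pkt: separates sections in a protocol v2 response


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then data."""
    return b"%04x" % (len(payload) + 4) + payload


def encode_packfile_uris(uris):
    """Build a packfile-uris section followed by the delim-pkt (hypothetical helper)."""
    out = pkt_line(b"packfile-uris\n")
    for uri in uris:
        out += pkt_line(b"uri " + uri.encode("utf-8") + b"\n")
    return out + DELIM_PKT


def iter_pkts(data: bytes):
    """Yield each pkt-line payload; yield None for special packets (flush, delim)."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4].decode("ascii"), 16)
        if length < 4:  # 0000 (flush-pkt) or 0001 (delim-pkt) carry no payload
            yield None
            pos += 4
        else:
            yield data[pos + 4:pos + length]
            pos += length


def parse_packfile_uris(data: bytes):
    """Return the URIs listed in one packfile-uris section."""
    pkts = iter_pkts(data)
    if next(pkts, None) != b"packfile-uris\n":
        raise ValueError("missing packfile-uris header")
    uris = []
    for payload in pkts:
        if payload is None:  # the delim-pkt closes the section
            return uris
        if not (payload.startswith(b"uri ") and payload.endswith(b"\n")):
            raise ValueError("malformed packfile-uri line")
        uris.append(payload[4:-1].decode("utf-8"))
    raise ValueError("section not terminated by delim-pkt")
```

A client along these lines would keep the returned list and treat the fetch
as incomplete until every listed URI has been downloaded as a packfile.
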
 Documentation/technical/packfile-uri.txt | 83 ++++++++++++++++++++++++
 Documentation/technical/protocol-v2.txt  |  6 +-
 2 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
index 0000000000..6535801486
--- /dev/null
+++ b/Documentation/technical/packfile-uri.txt
@@ -0,0 +1,83 @@
+Packfile URIs
+=============
+
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+
+This feature is available only in protocol version 2.
+
+Protocol
+--------
+
+The server advertises `packfile-uris`.
+
+If the client replies with the following arguments:
+
+ * packfile-uris
+ * thin-pack
+ * ofs-delta
+
+when the server sends the packfile, it MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing HTTP(S) URIs. See protocol-v2.txt for the documentation of
+this section.
+
+Clients should then understand that the returned packfile could be incomplete,
+and that they need to download all the given URIs before the fetch or clone is
+complete. Each URI should point to a Git packfile (which may be a thin pack and
+which may contain offset deltas).
+
+Server design
+-------------
+
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, a blob
+with the given sha1 can be replaced by the given URI. This allows, for example,
+servers to delegate serving of large blobs to CDNs.
+
+Client design
+-------------
+
+While fetching, the client needs to remember the list of URIs and cannot
+declare that the fetch is complete until all URIs have been downloaded as
+packfiles.
+
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+
+The client can inhibit this feature (i.e. refrain from sending the
+`packfile-uris` parameter) by passing --no-packfile-urls to `git fetch`.
+
+Future work
+-----------
+
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+
+ * On the server, a long-running process that takes in entire requests and
+   outputs a list of URIs and the corresponding inclusion and exclusion sets of
+   objects. This allows, e.g., signed URIs to be used and packfiles for common
+   requests to be cached.
+ * On the client, resumption of clone. If a clone is interrupted, information
+   could be recorded in the repository's config and a "clone-resume" command
+   can resume the clone in progress. (Resumption of subsequent fetches is more
+   difficult because that must deal with the user wanting to use the repository
+   even after the fetch was interrupted.)
+
+There are some possible features that will require a change in protocol:
+
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
+
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 345c00e08c..2cb1c41742 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -313,7 +313,8 @@ header. Most sections are sent only when the packfile is sent.
 
     output = acknowledgements flush-pkt |
 	     [acknowledgments delim-pkt] [shallow-info delim-pkt]
-	     [wanted-refs delim-pkt] packfile flush-pkt
+	     [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+	     packfile flush-pkt
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
 		      (nak | *ack)
@@ -331,6 +332,9 @@ header. Most sections are sent only when the packfile is sent.
 		  *PKT-LINE(wanted-ref LF)
     wanted-ref = obj-id SP refname
 
+    packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+    packfile-uri = PKT-LINE("uri" SP *%x20-ff LF)
+
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
 
-- 
2.19.0.271.gfe8321ec05.dirty


