From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Derrick Stolee" <derrickstolee@github.com>,
	"Jonathan Tan" <jonathantanmy@google.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Albert Cui" <albertqcui@gmail.com>,
	"Robin H . Johnson" <robbat2@gentoo.org>,
	"Teng Long" <dyroneteng@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [RFC PATCH v2 35/36] bundle-uri docs: add design notes
Date: Mon, 18 Apr 2022 19:23:52 +0200
Message-ID: <RFC-patch-v2-35.36-cfda9323aaa-20220418T165545Z-avarab@gmail.com>
In-Reply-To: <RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@gmail.com>

Add a design doc for the bundle-uri protocol extension to go along
with the packfile-uri extension added in cd8402e0fd8 (Documentation:
add Packfile URIs design doc, 2020-06-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/technical/bundle-uri.txt  | 119 ++++++++++++++++++++++++
 Documentation/technical/protocol-v2.txt |   5 +
 2 files changed, 124 insertions(+)
 create mode 100644 Documentation/technical/bundle-uri.txt

diff --git a/Documentation/technical/bundle-uri.txt b/Documentation/technical/bundle-uri.txt
new file mode 100644
index 00000000000..5ae9a15eafe
--- /dev/null
+++ b/Documentation/technical/bundle-uri.txt
@@ -0,0 +1,119 @@
+Bundle URI Design Notes
+=======================
+
+Protocol
+--------
+
+See `bundle-uri` in the link:protocol-v2.html[protocol-v2]
+documentation for a discussion of the bundle-uri command, and the
+expectations of clients and servers.
+
+This document is a more general discussion of how the `bundle-uri`
+command fits in with the rest of the git ecosystem, its design goals
+and non-goals, comparison to alternatives, etc.
+
+Comparison with Packfile URIs
+-----------------------------
+
+There is a similar "Packfile URIs" facility, see the
+link:packfile-uri.html[packfile-uri] documentation for details.
+
+The Packfile URIs facility requires a much closer cooperation between
+CDN and server than the bundle URI facility.
+
+I.e. the server MUST know what objects exist in the packfile URI it's
+pointing to, as well as its pack checksum. Failure to do so will not
+only result in a client error (the packfile hash won't match), but
+even if the client got past that it would likely end up with a corrupt
+repository with tips pointing to unreachable objects.
+
+By comparison the bundle URIs are meant to be a "dumb" solution
+friendly to e.g. having a weekly cronjob take a snapshot of a git
+repository, that snapshot being uploaded to a network of FTP mirrors
+(which may be inconsistent or out of date).
+
+The server does not need to know what state the side-channel download
+is at, because the client will first validate it, and then optionally
+negotiate with the server using what it discovers there.
+
+Using the local `transfer.injectBundleURI` configuration variable (see
+linkgit:git-config[1]) the `bundle-uri` mechanism doesn't even need
+the server to support it.
+
+Security
+--------
+
+The omission of something equivalent to the packfile <OID> in the
+Packfile URIs protocol is intentional, as having it would require
+closer server and CDN cooperation than some server operators are
+comfortable with.
+
+Furthermore, it is not needed for security. The server doesn't need to
+trust its CDN. If the CDN were to attempt to send harmful content
+to the client, the result would not validate against the ref tips
+the server provides via ls-refs.
+
+The lack of such a hash does leave room for a malicious CDN operator
+to be annoying, however. E.g. they could inject irrelevant
+objects into the bundles, which would enlarge the downloaded
+repository until a "gc" would eventually throw them away.
+
+In practice the lack of a hash is considered to be a non-issue. Anyone
+concerned about such security problems between their server and their
+CDN is going to be pointing to an "https" URL under their control. For
+a client the "threat" is the same as without bundle-uri, i.e. a server
+is free to be annoying today and send you garbage in the PACK that you
+won't need.
+
+Security issues peculiar to bundle-uri
+--------------------------------------
+
+Both packfile-uri and bundle-uri use the `fetch.uriProtocols`
+configuration variable (see linkgit:git-config[1]) to configure which
+protocols they support.
+
+By default this is set to "http,https" for both, but bundle-uri
+supports adding "file" to that list. The server can thus point to
+"file://" URIs it expects the client to have access to.
+
+This is primarily intended for use with the `transfer.injectBundleURI`
+mechanism, but can also be useful e.g. in a centralized environment
+where a server knows a "file:///mnt/bundles/big-repo.bdl" to be
+mounted on the local machine (e.g. a racked server), and points to it
+in its "bundle-uri" response.
+
+The client can then add "file" to the `fetch.uriProtocols` list to
+obey such responses. That does mean that a malicious server can point
+to any arbitrary file on the local machine. The threat of this is
+considered minimal, since anyone adding `file` to `fetch.uriProtocols`
+likely knows what they're doing and controls both ends, and the worst
+such a server can do is make curl(1) pipe garbage into "index-pack"
+(which will likely promptly die on the non-PACK-file).
+
+Security comparison with packfile-uri
+-------------------------------------
+
+The initial implementation of packfile-uri needed special adjusting to
+run "git fsck" on incoming .gitmodules files. This was to deal with a
+general security issue in git, see CVE-2018-17456.
+
+The current packfile-uri mechanism requires special handling around
+"fsck" to do such cross-PACK fsck's. This is because it first indexes
+the "incremental" PACK, and then any PACK(s) provided via
+packfile-uri, before finally doing a full connectivity check.
+
+This is in effect doing the fsck one might do via "clone" and "fetch" in
+reverse, or the equivalent of starting with the incremental "fetch",
+followed by the "clone".
+
+Since the packfile-uri mechanism can result in the .gitmodules blob
+referenced by such a "fetch" being in the pack for the "clone", the
+fetch-pack process needs to keep state between the indexing of
+multiple packs, to remember to fsck the blob (via the "clone") later
+after seeing it in a tree (from the "fetch").
+
+There are no known security issues with the way packfile-uri does
+this, but since bundle-uri effectively emulates what a client which
+doesn't support either "bundle-uri" or "packfile-uri" would do on
+clone/fetch, any future security issues peculiar to the packfile-uri
+approach are unlikely to be shared by it.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 3ea96add398..3a51492049f 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -775,3 +775,8 @@ A client receiving such a a response MAY assume that they can skip
 retrieving the header from a bundle at the indicated URI, and thus
 save themselves and the server(s) the request(s) needed to inspect the
 headers of that bundle or bundles.
+
+bundle-uri SEE ALSO
+^^^^^^^^^^^^^^^^^^^
+
+See the link:bundle-uri.html[Bundle URI Design Notes] for more.
-- 
2.36.0.rc2.902.g60576bbc845

