From: Josh Steadmon <steadmon@google.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
newren@gmail.com, avarab@gmail.com, dyroneteng@gmail.com,
Johannes.Schindelin@gmx.de, "SZEDER Gábor" <szeder.dev@gmail.com>,
"Matthew John Cheetham" <mjcheetham@outlook.com>,
"Derrick Stolee" <derrickstolee@github.com>
Subject: Re: [PATCH v3 0/2] bundle URIs: design doc and initial git fetch --bundle-uri implementation
Date: Mon, 25 Jul 2022 13:05:15 -0700 [thread overview]
Message-ID: <Yt73eyk0PKBYoyKn@google.com> (raw)
In-Reply-To: <pull.1248.v3.git.1658757188.gitgitgadget@gmail.com>
On 2022.07.25 13:53, Derrick Stolee via GitGitGadget wrote:
> This is the first of series towards building the bundle URI feature as
> discussed in previous RFCs, specifically pulled directly out of [5]:
>
> [1]
> https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/
> [2]
> https://lore.kernel.org/git/cover-0.3-00000000000-20211025T211159Z-avarab@gmail.com/
> [3]
> https://lore.kernel.org/git/pull.1160.git.1645641063.gitgitgadget@gmail.com
> [4]
> https://lore.kernel.org/git/RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@gmail.com/
> [5]
> https://lore.kernel.org/git/pull.1234.git.1653072042.gitgitgadget@gmail.com
>
> THIS ONLY INCLUDES THE DESIGN DOCUMENT. See "Updates in v3". There are two
> patches:
>
> 1. The main design document that details the bundle URI standard and how
> the client interacts with the bundle data.
> 2. An addendum to the design document that details one strategy for
> organizing bundles from the perspective of a bundle provider.
>
> As outlined in [5], the next steps after this are:
>
> 1. Add 'git clone --bundle-uri=' to run a 'git bundle fetch ' step before
> doing a fetch negotiation with the origin remote. [6]
> 2. Allow parsing a bundle list as a config file at the given URI. The
> key-value format is unified with the protocol v2 verb (coming in (3)).
> [7]
> 3. Implement the protocol v2 verb, re-using the bundle list logic from (2).
> Use this to auto-discover bundle URIs during 'git clone' (behind a
> config option). [8]
> 4. Implement the 'creationToken' heuristic, allowing incremental 'git
> fetch' commands to download a bundle list from a configured URI, and
> only download bundles that are new based on the creation token values.
> [9]
>
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
>
> [6] https://github.com/derrickstolee/git/pull/18 [7]
> https://github.com/derrickstolee/git/pull/20 [8]
> https://github.com/derrickstolee/git/pull/21 [9]
> https://github.com/derrickstolee/git/pull/22
>
> As mentioned in the design document, this is not all that is possible. For
> instance, Ævar's suggestion to download only the bundle headers can be used
> as a second heuristic (and as an augmentation of the timestamp heuristic).
>
>
> Updates in v3
> =============
>
> * This version only includes the design document. Thanks to all the
> reviewers for the significant attention that improves the doc a lot.
> * The second patch has an addition to the design document that details a
> potential way to organize bundles from the provider's perspective.
> * Based on some off-list feedback, I was going to switch git fetch
> --bundle-uri into git bundle fetch, but that has a major conflict with
> [10] which was just submitted.
> * I will move the git bundle fetch implementation into [6] which also has
> the git clone --bundle-uri implementation. [10]
> https://lore.kernel.org/git/20220725123857.2773963-1-szeder.dev@gmail.com/
>
>
> Updates in v2
> =============
>
> * The design document has been updated based on Junio's feedback.
> * The "bundle.list." keys are now just "bundle.".
> * The "timestamp" heuristic is now "creationToken".
> * More clarity on how Git parses data from the bundle URI.
> * Dropped some unnecessary bundle list keys (*.list, *.requires).
>
> Thanks, -Stolee
>
> Derrick Stolee (2):
> docs: document bundle URI standard
> bundle-uri: add example bundle organization
>
> Documentation/Makefile | 1 +
> Documentation/technical/bundle-uri.txt | 573 +++++++++++++++++++++++++
> 2 files changed, 574 insertions(+)
> create mode 100644 Documentation/technical/bundle-uri.txt
>
>
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1248%2Fderrickstolee%2Fbundle-redo%2Ffetch-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1248/derrickstolee/bundle-redo/fetch-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/1248
>
> Range-diff vs v2:
>
> 1: d444042dc4d ! 1: e0f003e1b5f docs: document bundle URI standard
> @@ Commit message
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>
> + ## Documentation/Makefile ##
> +@@ Documentation/Makefile: TECH_DOCS += SubmittingPatches
> + TECH_DOCS += ToolsForGit
> + TECH_DOCS += technical/bitmap-format
> + TECH_DOCS += technical/bundle-format
> ++TECH_DOCS += technical/bundle-uri
> + TECH_DOCS += technical/cruft-packs
> + TECH_DOCS += technical/hash-function-transition
> + TECH_DOCS += technical/http-protocol
> +
> ## Documentation/technical/bundle-uri.txt (new) ##
> @@
> +Bundle URIs
> +===========
> +
> ++Git bundles are files that store a pack-file along with some extra metadata,
> ++including a set of refs and a (possibly empty) set of necessary commits. See
> ++linkgit:git-bundle[1] and link:bundle-format.txt[the bundle format] for more
> ++information.
> ++
> +Bundle URIs are locations where Git can download one or more bundles in
> +order to bootstrap the object database in advance of fetching the remaining
> +objects from a remote.
> @@ Documentation/technical/bundle-uri.txt (new)
> + If this string-valued key exists, then the bundle list is designed to
> + work well with incremental `git fetch` commands. The heuristic signals
> + that there are additional keys available for each bundle that help
> -+ determine which subset of bundles the client should download.
> ++ determine which subset of bundles the client should download. The only
> ++ heuristic currently planned is `creationToken`.
> +
> +The remaining keys include an `<id>` segment which is a server-designated
> -+name for each available bundle.
> ++name for each available bundle. The `<id>` must contain only alphanumeric
> ++and `-` characters.
> +
> +bundle.<id>.uri::
> + (Required) This string value is the URI for downloading bundle `<id>`.
> @@ Documentation/technical/bundle-uri.txt (new)
> +
> +Here is an example bundle list using the Git config format:
> +
> -+```
> -+[bundle]
> -+ version = 1
> -+ mode = all
> -+ heuristic = creationToken
> ++ [bundle]
> ++ version = 1
> ++ mode = all
> ++ heuristic = creationToken
> +
> -+[bundle "2022-02-09-1644442601-daily"]
> -+ uri = https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle
> -+ timestamp = 1644442601
> ++ [bundle "2022-02-09-1644442601-daily"]
> ++ uri = https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle
> ++ creationToken = 1644442601
> +
> -+[bundle "2022-02-02-1643842562"]
> -+ uri = https://bundles.example.com/git/git/2022-02-02-1643842562.bundle
> -+ timestamp = 1643842562
> ++ [bundle "2022-02-02-1643842562"]
> ++ uri = https://bundles.example.com/git/git/2022-02-02-1643842562.bundle
> ++ creationToken = 1643842562
> +
> -+[bundle "2022-02-09-1644442631-daily-blobless"]
> -+ uri = 2022-02-09-1644442631-daily-blobless.bundle
> -+ timestamp = 1644442631
> -+ filter = blob:none
> ++ [bundle "2022-02-09-1644442631-daily-blobless"]
> ++ uri = 2022-02-09-1644442631-daily-blobless.bundle
> ++ creationToken = 1644442631
> ++ filter = blob:none
> +
> -+[bundle "2022-02-02-1643842568-blobless"]
> -+ uri = /git/git/2022-02-02-1643842568-blobless.bundle
> -+ timestamp = 1643842568
> -+ filter = blob:none
> -+```
> ++ [bundle "2022-02-02-1643842568-blobless"]
> ++ uri = /git/git/2022-02-02-1643842568-blobless.bundle
> ++ creationToken = 1643842568
> ++ filter = blob:none
> +
> +This example uses `bundle.mode=all` as well as the
> +`bundle.<id>.creationToken` heuristic. It also uses the `bundle.<id>.filter`
> @@ Documentation/technical/bundle-uri.txt (new)
> +* The client fails to connect with a server at the given URI or a connection
> + is lost without any chance to recover.
> +
> -+* The client receives a response other than `200 OK` (such as `404 Not Found`,
> -+ `401 Not Authorized`, or `500 Internal Server Error`). The client should
> -+ use the `credential.helper` to attempt authentication after the first
> -+ `401 Not Authorized` response, but a second such response is a failure.
> ++* The client receives a 400-level response (such as `404 Not Found` or
> ++ `401 Not Authorized`). The client should use the credential helper to
> ++ find and provide a credential for the URI, but match the semantics of
> ++ Git's other HTTP protocols in terms of handling specific 400-level
> ++ errors.
> +
> -+* The client receives data that is not parsable as a bundle or bundle list.
> ++* The server reports any other failure reponse.
> +
> -+* The bundle list describes a directed cycle in the
> -+ `bundle.<id>.requires` links.
> ++* The client receives data that is not parsable as a bundle or bundle list.
> +
> +* A bundle includes a filter that does not match expectations.
> +
> +* The client cannot unbundle the bundles because the prerequisite commit OIDs
> -+ are not in the object database and there are no more
> -+ `bundle.<id>.requires` links to follow.
> ++ are not in the object database and there are no more bundles to download.
> +
> +There are also situations that could be seen as wasteful, but are not
> +error conditions:
> @@ Documentation/technical/bundle-uri.txt (new)
> + the client is using hourly prefetches with background maintenance, but
> + the server is computing bundles weekly. For this reason, the client
> + should not use bundle URIs for fetch unless the server has explicitly
> -+ recommended it through the `bundle.flags = forFetch` value.
> ++ recommended it through a `bundle.heuristic` value.
> +
> +Implementation Plan
> +-------------------
> @@ Documentation/technical/bundle-uri.txt (new)
> + that the config format parsing feeds a list of key-value pairs into the
> + bundle list logic.
> +
> -+3. Create the `bundle-uri` protocol v2 verb so Git servers can advertise
> ++3. Create the `bundle-uri` protocol v2 command so Git servers can advertise
> + bundle URIs using the key-value pairs. Plug into the existing key-value
> + input to the bundle list logic. Allow `git clone` to discover these
> + bundle URIs and bootstrap the client repository from the bundle data.
> 2: 0a2cf60437f < -: ----------- remote-curl: add 'get' capability
> 3: abec47564fd < -: ----------- bundle-uri: create basic file-copy logic
> 4: f6255ec5188 < -: ----------- fetch: add --bundle-uri option
> 5: bfbd11b48bf < -: ----------- bundle-uri: add support for http(s):// and file://
> 6: a217e9a0640 < -: ----------- fetch: add 'refs/bundle/' to log.excludeDecoration
> -: ----------- > 2: a933471c3af bundle-uri: add example bundle organization
>
> --
> gitgitgadget
Looks good to me, thanks for the series!
Reviewed-by: Josh Steadmon <steadmon@google.com>
next prev parent reply other threads:[~2022-07-25 20:05 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-06 19:55 [PATCH 0/6] bundle URIs: design doc and initial git fetch --bundle-uri implementation Derrick Stolee via GitGitGadget
2022-06-06 19:55 ` [PATCH 1/6] docs: document bundle URI standard Derrick Stolee via GitGitGadget
2022-06-06 22:18 ` Junio C Hamano
2022-06-08 19:20 ` Derrick Stolee
2022-06-08 19:27 ` Junio C Hamano
2022-06-08 20:44 ` Junio C Hamano
2022-06-08 20:39 ` Junio C Hamano
2022-06-08 20:52 ` Derrick Stolee
2022-06-07 0:33 ` Junio C Hamano
2022-06-08 19:46 ` Derrick Stolee
2022-06-08 21:01 ` Junio C Hamano
2022-06-09 16:00 ` Derrick Stolee
2022-06-09 17:56 ` Junio C Hamano
2022-06-09 18:27 ` Ævar Arnfjörð Bjarmason
2022-06-09 19:39 ` Derrick Stolee
2022-06-09 20:13 ` Junio C Hamano
2022-06-21 19:34 ` Derrick Stolee
2022-06-21 20:16 ` Junio C Hamano
2022-06-21 21:10 ` Derrick Stolee
2022-06-21 21:33 ` Junio C Hamano
2022-06-06 19:55 ` [PATCH 2/6] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-07-21 22:59 ` Junio C Hamano
2022-06-06 19:55 ` [PATCH 3/6] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-06-06 19:55 ` [PATCH 4/6] fetch: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-06-06 19:55 ` [PATCH 5/6] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-06-06 19:55 ` [PATCH 6/6] fetch: add 'refs/bundle/' to log.excludeDecoration Derrick Stolee via GitGitGadget
2022-06-29 20:40 ` [PATCH v2 0/6] bundle URIs: design doc and initial git fetch --bundle-uri implementation Derrick Stolee via GitGitGadget
2022-06-29 20:40 ` [PATCH v2 1/6] docs: document bundle URI standard Derrick Stolee via GitGitGadget
2022-07-18 9:20 ` SZEDER Gábor
2022-07-21 12:09 ` Matthew John Cheetham
2022-07-22 13:52 ` Derrick Stolee
2022-07-22 16:03 ` Derrick Stolee
2022-07-21 21:39 ` Josh Steadmon
2022-07-22 13:15 ` Derrick Stolee
2022-07-22 15:01 ` Derrick Stolee
2022-06-29 20:40 ` [PATCH v2 2/6] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-07-21 21:41 ` Josh Steadmon
2022-06-29 20:40 ` [PATCH v2 3/6] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-07-21 21:45 ` Josh Steadmon
2022-07-22 13:18 ` Derrick Stolee
2022-06-29 20:40 ` [PATCH v2 4/6] fetch: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-06-29 20:40 ` [PATCH v2 5/6] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-06-29 20:40 ` [PATCH v2 6/6] fetch: add 'refs/bundle/' to log.excludeDecoration Derrick Stolee via GitGitGadget
2022-07-21 21:47 ` Josh Steadmon
2022-07-22 13:20 ` Derrick Stolee
2022-07-21 21:48 ` [PATCH v2 0/6] bundle URIs: design doc and initial git fetch --bundle-uri implementation Josh Steadmon
2022-07-21 21:56 ` Junio C Hamano
2022-07-25 13:53 ` [PATCH v3 0/2] " Derrick Stolee via GitGitGadget
2022-07-25 13:53 ` [PATCH v3 1/2] docs: document bundle URI standard Derrick Stolee via GitGitGadget
2022-07-28 1:23 ` tenglong.tl
2022-08-01 13:42 ` Derrick Stolee
2022-07-25 13:53 ` [PATCH v3 2/2] bundle-uri: add example bundle organization Derrick Stolee via GitGitGadget
2022-08-04 16:09 ` Matthew John Cheetham
2022-08-04 17:39 ` Derrick Stolee
2022-08-04 20:29 ` Ævar Arnfjörð Bjarmason
2022-08-05 18:29 ` Derrick Stolee
2022-07-25 20:05 ` Josh Steadmon [this message]
2022-08-09 13:12 ` [PATCH v4 0/2] bundle URIs: design doc Derrick Stolee via GitGitGadget
2022-08-09 13:12 ` [PATCH v4 1/2] docs: document bundle URI standard Derrick Stolee via GitGitGadget
2022-10-04 19:48 ` Philip Oakley
2022-08-09 13:12 ` [PATCH v4 2/2] bundle-uri: add example bundle organization Derrick Stolee via GitGitGadget
2022-08-09 13:49 ` [PATCH v4 0/2] bundle URIs: design doc Phillip Wood
2022-08-09 15:50 ` Derrick Stolee
2022-08-11 15:42 ` Phillip Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yt73eyk0PKBYoyKn@google.com \
--to=steadmon@google.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=avarab@gmail.com \
--cc=derrickstolee@github.com \
--cc=dyroneteng@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=mjcheetham@outlook.com \
--cc=newren@gmail.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).