git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/8] Bundle URIs V: creationToken heuristic for incremental fetches
@ 2023-01-06 20:36 Derrick Stolee via GitGitGadget
From: Derrick Stolee via GitGitGadget @ 2023-01-06 20:36 UTC
  To: git; +Cc: gitster, me, vdye, avarab, steadmon, chooglen, Derrick Stolee

This fifth part to the bundle URIs feature follows part IV (advertising via
protocol v2) which recently merged to 'master', so this series is based on
'master'.

This part introduces the concept of a heuristic that a bundle list can
advertise. The purpose of the heuristic is to hint to the Git client that
the bundles can be downloaded and unbundled in a certain order. In
particular, that order can assist with using the same bundle URI to download
new bundles from an updated bundle list. This allows bundle URIs to assist
with incremental fetches, not just initial clones.

The only planned heuristic is the "creationToken" heuristic where the bundle
list adds a 64-bit unsigned integer "creationToken" value to each bundle in
the list. Those values provide an ordering on the bundles implying that the
bundles can be unbundled in increasing creationToken order and at each point
the required commits for the ith bundle were provided by bundles with lower
creationTokens.
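
To make the ordering guarantee concrete, here is a minimal Python sketch. It
is not Git's implementation: the Bundle type, its prerequisites and tips
fields, and the unbundle_in_order() helper are hypothetical names, and commits
are modelled as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Hypothetical model of one bundle-list entry (not Git's struct)."""
    bundle_id: str
    creation_token: int
    prerequisites: set = field(default_factory=set)  # commits that must already exist
    tips: set = field(default_factory=set)           # commits the bundle provides


def unbundle_in_order(bundles, have=()):
    """Apply bundles in increasing creationToken order.

    Returns the set of commits present afterwards. Raises ValueError if the
    list breaks the promise that bundles with lower creationTokens supply
    the prerequisites of later ones.
    """
    present = set(have)
    for b in sorted(bundles, key=lambda b: b.creation_token):
        missing = b.prerequisites - present
        if missing:
            raise ValueError(f"{b.bundle_id} is missing {sorted(missing)}")
        present |= b.tips
    return present
```

With a well-formed list, no bundle ever has to be retried: sorting by
creationToken is enough to find an order in which every bundle applies.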

At clone time, the only difference implied by the creationToken order is
that the Git client does not need to guess at the order to apply the
bundles, but instead can use the creationToken order to apply them without
failure and retry. However, this presents an interesting benefit during
fetches: the Git client can check the bundle list and download bundles in
decreasing creationToken order until the required commits for these bundles
are present within the repository's object store. This prevents downloading
more bundle information than required.
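
The fetch-time behavior can be sketched as follows. This is an illustration
under stated assumptions, not the code in bundle-uri.c: each bundle is a
hypothetical (token, prerequisites, tips) tuple, "downloading" is modelled as
simply reading that tuple, and fetch_with_creation_tokens() is an invented
name.

```python
def fetch_with_creation_tokens(bundle_list, present):
    """Download bundles in decreasing creationToken order.

    bundle_list: iterable of (token, prerequisites, tips) tuples.
    present: set of commits already in the object store; updated in place.
    Returns the tokens of the bundles that were downloaded, in download order.
    """
    downloaded = []
    stack = []  # downloaded bundles that still wait to be unbundled
    newest_first = sorted(bundle_list, key=lambda b: b[0], reverse=True)
    for token, prereqs, tips in newest_first:
        downloaded.append(token)
        stack.append((token, prereqs, tips))
        if set(prereqs) <= present:
            break  # this bundle applies already, so older ones are not needed

    # Unwind the stack: apply the oldest downloaded bundle first.
    while stack:
        token, prereqs, tips = stack.pop()
        if not set(prereqs) <= present:
            raise ValueError(f"bundle with creationToken {token} cannot be applied")
        present |= set(tips)
    return downloaded
```

For a repository that already has the commits from the two oldest of four
chained bundles, only the two newest bundles are downloaded.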

The creationToken value is also a promise that the Git client will not need
to download a bundle if its creationToken is less than or equal to the
creationToken of a previously-downloaded bundle. This further improves the
performance during a fetch in that the client does not need to download any
bundles at all if it recognizes that the maximum creationToken is the same
as (or smaller than) a previously-downloaded creationToken.
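
As a small illustration of that promise, the sketch below filters a bundle
list against a previously recorded maximum. The function name
bundles_to_consider() and the dict-based representation are assumptions made
for the example; stored_max stands in for whatever value the client recorded
after its last download.

```python
def bundles_to_consider(tokens_by_id, stored_max):
    """Return the ids of bundles worth downloading, newest first.

    tokens_by_id: mapping of bundle id -> creationToken from the bundle list.
    stored_max: largest creationToken of any previously-downloaded bundle.
    """
    newer = {bid: tok for bid, tok in tokens_by_id.items() if tok > stored_max}
    return sorted(newer, key=newer.get, reverse=True)
```

An empty result means the maximum advertised creationToken is not larger than
the stored one, so the client can skip bundle downloads entirely.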

The creationToken concept is documented in the existing design document at
Documentation/technical/bundle-uri.txt, including suggested ways for bundle
providers to organize their bundle lists to take advantage of the heuristic.

This series formalizes the creationToken heuristic and the Git client logic
for understanding it. Further, for bundle lists provided by the git clone
--bundle-uri option, the Git client will recognize the heuristic as being
helpful for incremental fetches and store config values so that future git
fetch commands check the bundle list before communicating with any Git
remotes.

Note that this option does not integrate fetches with bundle lists
advertised via protocol v2. I spent some time working on this, but found the
implementation to be distinct enough that it merited its own attention in a
separate series. In particular, the configuration indicating that a fetch
should check the bundle-uri protocol v2 command seemed best located within
a Git remote rather than in a repository-global key such as the one used
for a static URI. Further, the timing of querying the bundle-uri
command during a git fetch command is significantly different and more
complicated than how it is used in git clone.


What Remains?
=============

Originally, I had planned on making this bundle URI work a 5-part series,
and this is part 5. Shouldn't we be done now?

There are two main things that should be done after this series, in any
order:

 * Teach git fetch to check a bundle list advertised by a remote over the
   bundle-uri protocol v2 command.
 * Add the bundle.<id>.filter option to allow advertising bundles and
   partial bundles side-by-side.

There is also room for expanding tests for more error conditions, or for
other tweaks that are not currently part of the design document. I do think
that after this series, it will be easier to work on different parts of the
feature in parallel.


Patch Outline
=============

 * Patch 1 creates a test setup demonstrating a creationToken heuristic. At
   this point, the Git client ignores the heuristic and uses its ad-hoc
   strategy for ordering the bundles.
 * Patches 2 and 3 teach Git to parse the bundle.heuristic and
   bundle.<id>.creationToken keys in a bundle list.
 * Patch 4 teaches Git to download bundles using the creationToken order.
   This order uses a stack approach: start from the maximum creationToken
   and keep downloading the bundle with the next-lower creationToken until
   every downloaded bundle can be unbundled. This is the algorithm required for
   incremental fetches, while initial clones could download in the opposite
   order. Since clones will download all bundles anyway, having a second
   code path just for clones seemed unnecessary.
 * Patch 5 teaches git clone --bundle-uri to set fetch.bundleURI when the
   advertised bundle list includes a heuristic that Git understands.
 * Patch 6 updates the design document to remove the reference to a
   bundle.flag option that was going to indicate that the list was designed
   for fetches; the bundle.heuristic option already does that.
 * Patch 7 teaches git fetch to check fetch.bundleURI and download bundles
   from that static URI before connecting to remotes via the Git protocol.
 * Patch 8 introduces a new fetch.bundleCreationToken config value to store
   the maximum creationToken of downloaded bundles. This prevents
   downloading the latest bundle on every git fetch command, reducing waste.
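
For readers unfamiliar with the keys named in patches 2 and 3, the following
Python sketch reads them from a bundle list. It assumes the list is written
in Git's config file syntax; the SAMPLE contents, the example.com URIs, and
the parse_bundle_list() helper are hypothetical, and the parser handles only
the simple subset of the syntax shown here.

```python
import re

# Hypothetical bundle list used only for illustration.
SAMPLE = """\
[bundle]
    version = 1
    mode = all
    heuristic = creationToken

[bundle "daily"]
    uri = https://example.com/daily.bundle
    creationToken = 2

[bundle "base"]
    uri = https://example.com/base.bundle
    creationToken = 1
"""

SECTION = re.compile(r'^\[(\w+)(?:\s+"([^"]*)")?\]$')


def parse_bundle_list(text):
    """Return (heuristic, {bundle id: creationToken}) from a bundle list."""
    heuristic = None
    tokens = {}
    section = subsection = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        m = SECTION.match(line)
        if m:
            section, subsection = m.group(1).lower(), m.group(2)
            continue
        key, _, value = line.partition("=")
        key, value = key.strip().lower(), value.strip()
        if section != "bundle":
            continue
        if subsection is None and key == "heuristic":
            heuristic = value
        elif subsection is not None and key == "creationtoken":
            token = int(value)
            if not 0 <= token < 2**64:
                raise ValueError(f"creationToken out of range for {subsection}")
            tokens[subsection] = token
    return heuristic, tokens
```

Calling parse_bundle_list(SAMPLE) yields the heuristic name together with one
creationToken per bundle id, which is the information the ordering logic in
patch 4 needs.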

Thanks,
-Stolee

Derrick Stolee (8):
  t5558: add tests for creationToken heuristic
  bundle-uri: parse bundle.heuristic=creationToken
  bundle-uri: parse bundle.<id>.creationToken values
  bundle-uri: download in creationToken order
  clone: set fetch.bundleURI if appropriate
  bundle-uri: drop bundle.flag from design doc
  fetch: fetch from an external bundle URI
  bundle-uri: store fetch.bundleCreationToken

 Documentation/config/bundle.txt        |   7 +
 Documentation/config/fetch.txt         |  16 ++
 Documentation/technical/bundle-uri.txt |   8 +-
 builtin/clone.c                        |   6 +-
 builtin/fetch.c                        |   8 +
 bundle-uri.c                           | 208 ++++++++++++++++++++++++-
 bundle-uri.h                           |  28 +++-
 t/t5558-clone-bundle-uri.sh            | 204 +++++++++++++++++++++++-
 t/t5601-clone.sh                       |  50 ++++++
 t/t5750-bundle-uri-parse.sh            |  37 +++++
 10 files changed, 561 insertions(+), 11 deletions(-)


base-commit: 4dbebc36b0893f5094668ddea077d0e235560b16
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1454%2Fderrickstolee%2Fbundle-redo%2FcreationToken-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1454/derrickstolee/bundle-redo/creationToken-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1454
-- 
gitgitgadget

Thread overview: 79+ messages
2023-01-06 20:36 [PATCH 0/8] Bundle URIs V: creationToken heuristic for incremental fetches Derrick Stolee via GitGitGadget
2023-01-06 20:36 ` [PATCH 1/8] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-17 18:17   ` Victoria Dye
2023-01-17 21:00     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 2/8] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-09  2:38   ` Junio C Hamano
2023-01-09 14:20     ` Derrick Stolee
2023-01-17 19:13   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 3/8] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-09  3:08   ` Junio C Hamano
2023-01-09 14:41     ` Derrick Stolee
2023-01-17 19:24   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 4/8] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-09  3:22   ` Junio C Hamano
2023-01-09 14:58     ` Derrick Stolee
2023-01-19 18:32   ` Victoria Dye
2023-01-20 14:56     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 5/8] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-19 19:42   ` Victoria Dye
2023-01-20 15:42     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 6/8] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-19 19:44   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 7/8] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-19 20:34   ` Victoria Dye
2023-01-20 15:47     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 8/8] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-19 22:24   ` Victoria Dye
2023-01-20 15:53     ` Derrick Stolee
2023-01-23 15:21 ` [PATCH v2 00/10] Bundle URIs V: creationToken heuristic for incremental fetches Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 01/10] bundle: optionally skip reachability walk Derrick Stolee via GitGitGadget
2023-01-23 18:03     ` Junio C Hamano
2023-01-23 18:24       ` Derrick Stolee
2023-01-23 20:13         ` Junio C Hamano
2023-01-23 22:30           ` Junio C Hamano
2023-01-24 12:27             ` Derrick Stolee
2023-01-24 14:14               ` [PATCH v2.5 01/11] bundle: test unbundling with incomplete history Derrick Stolee
2023-01-24 17:16                 ` Junio C Hamano
2023-01-24 14:16               ` [PATCH v2.5 02/11] bundle: verify using connected() Derrick Stolee
2023-01-24 17:33                 ` Junio C Hamano
2023-01-24 18:46                   ` Derrick Stolee
2023-01-24 20:41                     ` Junio C Hamano
2023-01-24 15:22               ` [PATCH v2 01/10] bundle: optionally skip reachability walk Junio C Hamano
2023-01-23 21:08         ` Junio C Hamano
2023-01-23 15:21   ` [PATCH v2 02/10] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-27 19:15     ` Victoria Dye
2023-01-23 15:21   ` [PATCH v2 03/10] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 04/10] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 05/10] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-27 19:17     ` Victoria Dye
2023-01-27 19:32       ` Junio C Hamano
2023-01-30 18:43         ` Derrick Stolee
2023-01-30 19:02           ` Junio C Hamano
2023-01-30 19:12             ` Derrick Stolee
2023-01-23 15:21   ` [PATCH v2 06/10] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 07/10] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 08/10] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-27 19:18     ` Victoria Dye
2023-01-23 15:21   ` [PATCH v2 09/10] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 10/10] bundle-uri: test missing bundles with heuristic Derrick Stolee via GitGitGadget
2023-01-27 19:21     ` Victoria Dye
2023-01-30 18:47       ` Derrick Stolee
2023-01-27 19:28   ` [PATCH v2 00/10] Bundle URIs V: creationToken heuristic for incremental fetches Victoria Dye
2023-01-31 13:29   ` [PATCH v3 00/11] " Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 01/11] bundle: test unbundling with incomplete history Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 02/11] bundle: verify using check_connected() Derrick Stolee via GitGitGadget
2023-01-31 17:35       ` Junio C Hamano
2023-01-31 19:31         ` Derrick Stolee
2023-01-31 19:36           ` Junio C Hamano
2023-01-31 13:29     ` [PATCH v3 03/11] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 04/11] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 05/11] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-31 21:22       ` Junio C Hamano
2023-01-31 13:29     ` [PATCH v3 06/11] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 07/11] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 08/11] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 09/11] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 10/11] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 11/11] bundle-uri: test missing bundles with heuristic Derrick Stolee via GitGitGadget
2023-01-31 22:01     ` [PATCH v3 00/11] Bundle URIs V: creationToken heuristic for incremental fetches Junio C Hamano
