git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Victoria Dye <vdye@github.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, avarab@gmail.com,
	steadmon@google.com, chooglen@google.com,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 7/8] fetch: fetch from an external bundle URI
Date: Thu, 19 Jan 2023 12:34:13 -0800	[thread overview]
Message-ID: <58269fe6-ebe1-7b12-4cd9-2110a94543e5@github.com> (raw)
In-Reply-To: <1627fc158b1e301a1663e24f9f21268b4f1caa55.1673037405.git.gitgitgadget@gmail.com>

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> When a user specifies a URI via 'git clone --bundle-uri', that URI may
> be a bundle list that advertises a 'bundle.heuristic' value. In that
> case, the Git client stores a 'fetch.bundleURI' config value storing
> that URI.
> 
> Teach 'git fetch' to check for this config value and download bundles
> from that URI before fetching from the Git remote(s). Likely, the bundle
> provider has configured a heuristic (such as "creationToken") that will
> allow the Git client to download only a portion of the bundles before
> continuing the fetch.
> 
> Since this URI is completely independent of the remote server, we want
> to be sure that we connect to the bundle URI before creating a
> connection to the Git remote. We do not want to hold a stateful
> connection for too long if we can avoid it.
> 
> To test that this works correctly, extend the previous tests that set
> 'fetch.bundleURI' to do follow-up fetches. The bundle list is updated
> incrementally at each phase to demonstrate that the heuristic avoids
> downloading older bundles. This includes the middle fetch downloading
> the objects in bundle-3.bundle from the Git remote, and therefore not
> needing that bundle in the third fetch.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/fetch.c             |  8 +++++
>  t/t5558-clone-bundle-uri.sh | 59 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 7378cafeec9..fbb1d470c38 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -29,6 +29,7 @@
>  #include "commit-graph.h"
>  #include "shallow.h"
>  #include "worktree.h"
> +#include "bundle-uri.h"
>  
>  #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)
>  
> @@ -2109,6 +2110,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
>  int cmd_fetch(int argc, const char **argv, const char *prefix)
>  {
>  	int i;
> +	const char *bundle_uri;
>  	struct string_list list = STRING_LIST_INIT_DUP;
>  	struct remote *remote = NULL;
>  	int result = 0;
> @@ -2194,6 +2196,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  	if (dry_run)
>  		write_fetch_head = 0;
>  
> +	if (!git_config_get_string_tmp("fetch.bundleuri", &bundle_uri) &&
> +	    !starts_with(bundle_uri, "remote:")) {

Maybe a silly question, by why would the bundle URI start with 'remote:'
(and why do we silently skip fetching from the URI in that case)?

> +		if (fetch_bundle_uri(the_repository, bundle_uri, NULL))
> +			warning(_("failed to fetch bundles from '%s'"), bundle_uri);
> +	}
> +
>  	if (all) {
>  		if (argc == 1)
>  			die(_("fetch --all does not take a repository argument"));
> diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
> index 8ff560425ee..3f4d61a915c 100755
> --- a/t/t5558-clone-bundle-uri.sh
> +++ b/t/t5558-clone-bundle-uri.sh
> @@ -465,6 +465,65 @@ test_expect_success 'http clone with bundle.heuristic creates fetch.bundleURI' '
>  	cat >expect <<-\EOF &&
>  	refs/bundles/base
>  	EOF
> +	test_cmp expect refs &&

At this point in the test, only 'base' is in the cloned repo (equivalent to
the contents of 'bundle-1').

> +
> +	cat >>"$HTTPD_DOCUMENT_ROOT_PATH/bundle-list" <<-EOF &&
> +	[bundle "bundle-2"]
> +		uri = bundle-2.bundle
> +		creationToken = 2
> +	EOF
> +
> +	# Fetch the objects for bundle-2 _and_ bundle-3.
> +	GIT_TRACE2_EVENT="$(pwd)/trace1.txt" \
> +		git -C fetch-http-4 fetch origin --no-tags \
> +		refs/heads/left:refs/heads/left \
> +		refs/heads/right:refs/heads/right &&
> +
> +	# This fetch should copy two files: the list and bundle-2.
> +	test_bundle_downloaded bundle-list trace1.txt &&
> +	test_bundle_downloaded bundle-2.bundle trace1.txt &&
> +	! test_bundle_downloaded bundle-1.bundle trace1.txt &&

Now, with 'bundle-2' in the list, fetch 'left' and 'right'. 'bundle-1' is
not fetched because we already have its contents (in 'base'), 'bundle-2' is
downloaded to get 'left', and 'right' is fetched directly from the repo.

> +
> +	# received left from bundle-2
> +	git -C fetch-http-4 for-each-ref --format="%(refname)" "refs/bundles/*" >refs &&
> +	cat >expect <<-\EOF &&
> +	refs/bundles/base
> +	refs/bundles/left
> +	EOF
> +	test_cmp expect refs &&
> +
> +	cat >>"$HTTPD_DOCUMENT_ROOT_PATH/bundle-list" <<-EOF &&
> +	[bundle "bundle-3"]
> +		uri = bundle-3.bundle
> +		creationToken = 3
> +
> +	[bundle "bundle-4"]
> +		uri = bundle-4.bundle
> +		creationToken = 4
> +	EOF
> +
> +	# This fetch should skip bundle-3.bundle, since its objets are

s/objets/objects

> +	# already local (we have the requisite commits for bundle-4.bundle).
> +	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
> +		git -C fetch-http-4 fetch origin --no-tags \
> +		refs/heads/merge:refs/heads/merge &&
> +
> +	# This fetch should copy three files: the list, bundle-3, and bundle-4.
> +	test_bundle_downloaded bundle-list trace2.txt &&
> +	test_bundle_downloaded bundle-4.bundle trace2.txt &&
> +	! test_bundle_downloaded bundle-1.bundle trace2.txt &&
> +	! test_bundle_downloaded bundle-2.bundle trace2.txt &&
> +	! test_bundle_downloaded bundle-3.bundle trace2.txt &&

'bundle-3' and 'bundle-4' are then added to the list and we fetch 'merge'.
The repository already has 'base', 'left', and 'right' so we don't need to
download 'bundle-1', 'bundle-2', and 'bundle-3' respectively; 'bundle-4' is
downloaded to get 'merge'.

> +
> +	# received merge ref from bundle-4, but right is missing
> +	# because we did not download bundle-3.
> +	git -C fetch-http-4 for-each-ref --format="%(refname)" "refs/bundles/*" >refs &&
> +
> +	cat >expect <<-\EOF &&
> +	refs/bundles/base
> +	refs/bundles/left
> +	refs/bundles/merge

And this confirms that 'base', 'left', and 'merge' all came from bundles
(and 'right' did not), and everything is working as expected. This is test
is both easy to understand (the comments clearly explain each step without
being overly verbose) and thorough. Looks good!

> +	EOF
>  	test_cmp expect refs
>  '
>  


  reply	other threads:[~2023-01-19 20:34 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-06 20:36 [PATCH 0/8] Bundle URIs V: creationToken heuristic for incremental fetches Derrick Stolee via GitGitGadget
2023-01-06 20:36 ` [PATCH 1/8] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-17 18:17   ` Victoria Dye
2023-01-17 21:00     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 2/8] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-09  2:38   ` Junio C Hamano
2023-01-09 14:20     ` Derrick Stolee
2023-01-17 19:13   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 3/8] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-09  3:08   ` Junio C Hamano
2023-01-09 14:41     ` Derrick Stolee
2023-01-17 19:24   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 4/8] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-09  3:22   ` Junio C Hamano
2023-01-09 14:58     ` Derrick Stolee
2023-01-19 18:32   ` Victoria Dye
2023-01-20 14:56     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 5/8] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-19 19:42   ` Victoria Dye
2023-01-20 15:42     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 6/8] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-19 19:44   ` Victoria Dye
2023-01-06 20:36 ` [PATCH 7/8] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-19 20:34   ` Victoria Dye [this message]
2023-01-20 15:47     ` Derrick Stolee
2023-01-06 20:36 ` [PATCH 8/8] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-19 22:24   ` Victoria Dye
2023-01-20 15:53     ` Derrick Stolee
2023-01-23 15:21 ` [PATCH v2 00/10] Bundle URIs V: creationToken heuristic for incremental fetches Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 01/10] bundle: optionally skip reachability walk Derrick Stolee via GitGitGadget
2023-01-23 18:03     ` Junio C Hamano
2023-01-23 18:24       ` Derrick Stolee
2023-01-23 20:13         ` Junio C Hamano
2023-01-23 22:30           ` Junio C Hamano
2023-01-24 12:27             ` Derrick Stolee
2023-01-24 14:14               ` [PATCH v2.5 01/11] bundle: test unbundling with incomplete history Derrick Stolee
2023-01-24 17:16                 ` Junio C Hamano
2023-01-24 14:16               ` [PATCH v2.5 02/11] bundle: verify using connected() Derrick Stolee
2023-01-24 17:33                 ` Junio C Hamano
2023-01-24 18:46                   ` Derrick Stolee
2023-01-24 20:41                     ` Junio C Hamano
2023-01-24 15:22               ` [PATCH v2 01/10] bundle: optionally skip reachability walk Junio C Hamano
2023-01-23 21:08         ` Junio C Hamano
2023-01-23 15:21   ` [PATCH v2 02/10] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-27 19:15     ` Victoria Dye
2023-01-23 15:21   ` [PATCH v2 03/10] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 04/10] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 05/10] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-27 19:17     ` Victoria Dye
2023-01-27 19:32       ` Junio C Hamano
2023-01-30 18:43         ` Derrick Stolee
2023-01-30 19:02           ` Junio C Hamano
2023-01-30 19:12             ` Derrick Stolee
2023-01-23 15:21   ` [PATCH v2 06/10] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 07/10] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 08/10] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-27 19:18     ` Victoria Dye
2023-01-23 15:21   ` [PATCH v2 09/10] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-23 15:21   ` [PATCH v2 10/10] bundle-uri: test missing bundles with heuristic Derrick Stolee via GitGitGadget
2023-01-27 19:21     ` Victoria Dye
2023-01-30 18:47       ` Derrick Stolee
2023-01-27 19:28   ` [PATCH v2 00/10] Bundle URIs V: creationToken heuristic for incremental fetches Victoria Dye
2023-01-31 13:29   ` [PATCH v3 00/11] " Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 01/11] bundle: test unbundling with incomplete history Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 02/11] bundle: verify using check_connected() Derrick Stolee via GitGitGadget
2023-01-31 17:35       ` Junio C Hamano
2023-01-31 19:31         ` Derrick Stolee
2023-01-31 19:36           ` Junio C Hamano
2023-01-31 13:29     ` [PATCH v3 03/11] t5558: add tests for creationToken heuristic Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 04/11] bundle-uri: parse bundle.heuristic=creationToken Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 05/11] bundle-uri: parse bundle.<id>.creationToken values Derrick Stolee via GitGitGadget
2023-01-31 21:22       ` Junio C Hamano
2023-01-31 13:29     ` [PATCH v3 06/11] bundle-uri: download in creationToken order Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 07/11] clone: set fetch.bundleURI if appropriate Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 08/11] bundle-uri: drop bundle.flag from design doc Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 09/11] fetch: fetch from an external bundle URI Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 10/11] bundle-uri: store fetch.bundleCreationToken Derrick Stolee via GitGitGadget
2023-01-31 13:29     ` [PATCH v3 11/11] bundle-uri: test missing bundles with heuristic Derrick Stolee via GitGitGadget
2023-01-31 22:01     ` [PATCH v3 00/11] Bundle URIs V: creationToken heuristic for incremental fetches Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58269fe6-ebe1-7b12-4cd9-2110a94543e5@github.com \
    --to=vdye@github.com \
    --cc=avarab@gmail.com \
    --cc=chooglen@google.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).