git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/9] Bundle URIs IV: advertise over protocol v2
@ 2022-11-01  1:07 Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
                   ` (9 more replies)
  0 siblings, 10 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

This is based on the recent master batch that included ds/bundle-uri-....

Now that git clone --bundle-uri can download a bundle list from a plaintext
file in config format, we can use the same set of key-value pairs to
advertise a bundle list over protocol v2. At the end of this series:

 1. A server can advertise bundles when uploadPack.advertiseBundleURIs is
    enabled. The bundle list comes from the server's local config,
    specifically the bundle.* namespace.
 2. A client can notice a server's bundle-uri advertisement and request the
    bundle list if transfer.bundleURI is enabled. The bundles are downloaded
    as if the list was advertised from the --bundle-uri option.
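
As a rough illustration of the two settings above (a sketch, not taken from
the patches): the repository name "server.git", the bundle id "base", and the
example.com URL are placeholders, and the bundle.* keys are assumed to be the
same ones used by the config-format bundle lists.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Server side: a bare repository that advertises a one-entry bundle list.
# "server.git", the id "base", and the URL are placeholders.
git init --quiet --bare server.git
git -C server.git config uploadpack.advertiseBundleURIs true
git -C server.git config bundle.version 1
git -C server.git config bundle.mode all
git -C server.git config bundle.base.uri https://example.com/base.bundle

# Show the key-value pairs the advertisement would be drawn from.
git -C server.git config --get-regexp '^bundle\.'

# Client side (not run here, it needs a reachable server):
#   git -c protocol.version=2 -c transfer.bundleURI=true clone <url>
```

The client only asks for the list when transfer.bundleURI is enabled, so a
server can turn on the advertisement without affecting clients that have not
opted in.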

Many patches in this series were adapted from Ævar's v2 RFC [1]. He is
retained as author and I added myself as co-author only if the modifications
were significant.

[1]
https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

 * Patches 1-5 are mostly taken from [1], again with mostly minor updates.
   The one major difference is the packet line format being a single
   key=value format instead of a sequence of pairs. This also means that
   Patch 4 is entirely new since it feeds these pairs directly from the
   server's config.

 * Patches 6-9 finish off the ability for the client to notice the
   capability, request the values, and download bundles before continuing
   with the rest of the download.

One thing that is not handled here but could be handled in a future change
is to disconnect from the origin Git server while downloading the bundle
URIs, then reconnecting afterwards. This does not make any difference for
HTTPS, but SSH may benefit from the reduced connection time. The git clone
--bundle-uri option did not suffer from this because the bundles are
downloaded before the server connection begins.

After this series, there is one more series before the original scope of the
plan is complete: using creation tokens as a heuristic. See [2] for the RFC
version of those patches.

[2] https://github.com/derrickstolee/git/pull/22

Thanks,

 * Stolee

Derrick Stolee (5):
  bundle-uri: serve bundle.* keys from config
  strbuf: reintroduce strbuf_parent_directory()
  bundle-uri: allow relative URLs in bundle lists
  bundle-uri: download bundles from an advertised list
  clone: unbundle the advertised bundles

Ævar Arnfjörð Bjarmason (4):
  protocol v2: add server-side "bundle-uri" skeleton
  bundle-uri client: add minimal NOOP client
  bundle-uri client: add helper for testing server
  bundle-uri client: add boolean transfer.bundleURI setting

 Documentation/config/transfer.txt      |   6 +
 Documentation/gitprotocol-v2.txt       | 193 +++++++++++++++++++++
 builtin/clone.c                        |  23 +++
 bundle-uri.c                           |  91 +++++++++-
 bundle-uri.h                           |  27 +++
 connect.c                              |  47 +++++
 remote.h                               |   5 +
 serve.c                                |   6 +
 strbuf.c                               |   9 +
 strbuf.h                               |   7 +
 t/helper/test-bundle-uri.c             |  48 ++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh  | 229 +++++++++++++++++++++++++
 t/t5601-clone.sh                       |  59 +++++++
 t/t5701-git-serve.sh                   |  40 ++++-
 t/t5730-protocol-v2-bundle-uri-file.sh |  36 ++++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 ++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 ++
 t/t5750-bundle-uri-parse.sh            |  54 ++++++
 transport-helper.c                     |  13 ++
 transport-internal.h                   |   7 +
 transport.c                            |  87 ++++++++++
 transport.h                            |  23 +++
 22 files changed, 1042 insertions(+), 2 deletions(-)
 create mode 100644 t/lib-t5730-protocol-v2-bundle-uri.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1400%2Fderrickstolee%2Fbundle-redo%2Fadvertise-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1400/derrickstolee/bundle-redo/advertise-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1400
-- 
gitgitgadget


* [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
@ 2022-11-01  1:07 ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-08 17:08   ` SZEDER Gábor
  2022-11-11  1:59   ` Victoria Dye
  2022-11-01  1:07 ` [PATCH 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.advertiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 verb with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users to
automatically discover bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.
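
To make the "key=value" transfer format concrete, here is a small shell
sketch of how such a response could be framed as packet lines: four hex
digits giving the total length (including those four digits), then the
payload, with "0000" as the flush packet. The helper names pkt_line and
pkt_flush and the example values are hypothetical, and the sketch leaves out
the question of whether each payload carries a trailing newline.

```shell
# Frame one payload as a pkt-line: 4 hex digits of total length
# (payload length + 4), followed by the payload itself.
# ${#1} counts characters, so this sketch assumes ASCII payloads.
pkt_line () {
	printf '%04x%s' "$(( ${#1} + 4 ))" "$1"
}

# A flush packet ends the response.
pkt_flush () {
	printf '0000'
}

# Hypothetical bundle list with a single bundle.
response=$(
	pkt_line 'bundle.version=1'
	pkt_line 'bundle.mode=all'
	pkt_line 'bundle.base.uri=https://example.com/base.bundle'
	pkt_flush
)
printf '%s\n' "$response"
```

The empty advertisement returned by this patch corresponds to the flush
packet alone.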

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulleted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 Documentation/gitprotocol-v2.txt | 193 +++++++++++++++++++++++++++++++
 bundle-uri.c                     |  36 ++++++
 bundle-uri.h                     |   7 ++
 serve.c                          |   6 +
 t/t5701-git-serve.sh             |  40 ++++++-
 5 files changed, 281 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 59bf41cefb9..57642b4a415 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -578,6 +578,199 @@ and associated requested information, each separated by a single space.
 
 	obj-info = obj-id SP obj-size
 
+bundle-uri
+~~~~~~~~~~
+
+If the 'bundle-uri' capability is advertised, the server supports the
+`bundle-uri` command.
+
+The capability is currently advertised with no value (i.e. not
+"bundle-uri=somevalue"); a value may be added in the future for
+supporting command-wide extensions. Clients MUST ignore any unknown
+capability values and proceed with the `bundle-uri` dialog they
+support.
+
+The 'bundle-uri' command is intended to be issued before `fetch` to
+get URIs to bundle files (see linkgit:git-bundle[1]) to "seed" and
+inform the subsequent `fetch` command.
+
+The client CAN issue `bundle-uri` before or after any other valid
+command. To be useful to clients it's expected that it'll be issued
+after an `ls-refs` and before `fetch`, but CAN be issued at any time
+in the dialog.
+
+DISCUSSION of bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The intent of the feature is to optimize for server resource consumption
+in the common case by changing the common case of fetching a very
+large PACK during linkgit:git-clone[1] into a smaller incremental
+fetch.
+
+It also allows servers to achieve better caching in combination with
+an `uploadpack.packObjectsHook` (see linkgit:git-config[1]).
+
+This works by making new clones or fetches a more predictable and common
+negotiation against the tips of recently produced *.bundle file(s).
+Servers might even pre-generate the results of such negotiations for
+the `uploadpack.packObjectsHook` as new pushes come in.
+
+One way that servers could take advantage of these bundles is that the
+server would anticipate that fresh clones will download a known bundle,
+followed by catching up to the current state of the repository using ref
+tips found in that bundle (or bundles).
+
+PROTOCOL for bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A `bundle-uri` request takes no arguments, and as noted above does not
+currently advertise a capability value. Both may be added in the
+future.
+
+When the client issues a `command=bundle-uri` the response is a list of
+key-value pairs provided as packet lines with value `<key>=<value>`. The
+meaning of these key-value pairs is provided by the config keys in the
+`bundle.*` namespace (see linkgit:git-config[1]).
+
+Clients are still expected to fully parse the line according to the
+above format; lines that do not conform to the format SHOULD be
+discarded. The user MAY be warned in such a case.
+
+bundle-uri CLIENT AND SERVER EXPECTATIONS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+URI CONTENTS::
+The advertised URIs MUST be in one of two possible formats.
++
+The first possible format is a bundle file that `git bundle verify`
+would accept. I.e. they MUST contain one or more reference tips for
+use by the client, MUST indicate prerequisites (if any) with standard
+"-" prefixes, and MUST indicate their "object-format", if
+applicable. Create "*.bundle" files with `git bundle create`.
++
+The second possible format is a plaintext file that `git config --list`
+would accept (with the `--file` option). The key-value pairs in this list
+are in the `bundle.*` namespace (see linkgit:git-config[1]).
+
+bundle-uri CLIENT ERROR RECOVERY::
+A client MUST above all gracefully degrade on errors, whether that
+error is because of bad/missing data in the bundle URI(s), because
+that client is too dumb to e.g. understand and fully parse out bundle
+headers and their prerequisite relationships, or something else.
++
+Server operators should feel confident in turning on "bundle-uri" and
+not worry if e.g. their CDN goes down that clones or fetches will run
+into hard failures. Even if the server bundle(s) are
+incomplete, or bad in some way, the client should still end up with a
+functioning repository, just as if it had chosen not to use this
+protocol extension.
++
+All subsequent discussion on client and server interaction MUST keep
+this in mind.
+
+bundle-uri SERVER TO CLIENT::
+The ordering of the returned bundle uris is not significant. Clients
+MUST parse their headers to discover their contained OIDs and
+prerequisites. A client MUST consider the content of the bundle(s)
+themselves and their header as the ultimate source of truth.
++
+A server MAY even return bundle(s) that don't have any direct
+relationship to the repository being cloned (either through accident,
+or intentional "clever" configuration), and expect a client to sort
+out what data they'd like from the bundle(s), if any.
+
+bundle-uri CLIENT TO SERVER::
+The client SHOULD provide reference tips found in the bundle header(s)
+as 'have' lines in any subsequent `fetch` request. A client MAY also
+ignore the bundle(s) entirely if doing so is deemed worse for some
+reason, e.g. if the bundles can't be downloaded, it doesn't like the
+tips it finds etc.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
+If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
+of the bundle(s) the client finds that the ref tips it wants can be
+retrieved in their entirety from advertised bundle(s), it MAY disconnect. The
+results of such a 'clone' or 'fetch' should be indistinguishable from
+the state attained without using bundle-uri.
+
+EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
+A client MAY perform an early disconnect while still downloading the
+bundle(s) (having streamed and parsed their headers). In such a case
+the client MUST gracefully recover from any errors related to
+finishing the download and validation of the bundle(s).
++
+I.e. a client might need to re-connect and issue a 'fetch' command,
+and possibly fall back to not making use of 'bundle-uri' at all.
++
+This "MAY" behavior is specified as such (and not a "SHOULD") on the
+assumption that a server advertising bundle uris is more likely than
+not to be serving up a relatively large repository, and to be pointing
+to URIs that have a good chance of being in working order. A client
+MAY e.g. look at the payload size of the bundles as a heuristic to see
+if an early disconnect is worth it, should falling back on a full
+"fetch" dialog be necessary.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
+A client SHOULD commence a negotiation of a PACK from the server via
+the "fetch" command using the OID tips found in advertised bundles,
+even if it's still in the process of downloading those bundle(s).
++
+This allows for aggressive early disconnects from any interactive
+server dialog. The client blindly trusts that the advertised OID tips
+are relevant, and issues them as 'have' lines; it then requests any
+tips it would like (usually from the "ls-refs" advertisement) via
+'want' lines. The server will then compute a (hopefully small) PACK
+with the expected difference between the tips from the bundle(s) and
+the data requested.
++
+The only connection the client then needs to keep active is to the
+concurrently downloading static bundle(s). When those and the
+incremental PACK are retrieved they should be inflated and
+validated. Any errors at this point should be gracefully recovered
+from, see above.
+
+bundle-uri PROTOCOL FEATURES
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+As noted above the `<key>=<value>` definitions are documented by the
+`bundle.*` config namespace.
+
+In particular, the `bundle.version` key specifies an integer value. The
+only accepted value at the moment is `1`, but if the client sees an
+unexpected value here then the client MUST ignore the bundle list.
+
+As long as `bundle.version` is understood, all other unknown keys MAY be
+ignored by the client. The server will guarantee compatibility with older
+clients, though newer clients may be better able to use the extra keys to
+minimize downloads.
+
+Any backwards-incompatible addition of per-URI key-value will be
+guarded by a new `bundle.version` value or values in 'bundle-uri'
+capability advertisement itself, and/or by new future `bundle-uri`
+request arguments.
+
+Some example key-value pairs that are not currently implemented but could
+be implemented in the future include:
+
+ * Add a "hash=<val>" or "size=<bytes>" to advertise the expected hash or
+   size of the bundle file.
+
+ * Advertise that one or more bundle files are the same (to e.g. have
+   clients round-robin or otherwise choose one of N possible files).
+
+ * An "oid=<OID>" shortcut and "prerequisite=<OID>" shortcut, for
+   expressing the common case of a bundle with one tip and no
+   prerequisites, or one tip and one prerequisite.
++
+This would allow for optimizing the common case of servers who'd like
+to provide one "big bundle" containing only their "main" branch,
+and/or incremental updates thereof.
++
+A client receiving such a response MAY assume that they can skip
+retrieving the header from a bundle at the indicated URI, and thus
+save themselves and the server(s) the request(s) needed to inspect the
+headers of that bundle or bundles.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/bundle-uri.c b/bundle-uri.c
index 79a914f961b..32022595964 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -563,6 +563,42 @@ cleanup:
 	return result;
 }
 
+/**
+ * API for serve.c.
+ */
+
+int bundle_uri_advertise(struct repository *r, struct strbuf *value)
+{
+	static int advertise_bundle_uri = -1;
+
+	if (advertise_bundle_uri != -1)
+		goto cached;
+
+	advertise_bundle_uri = 0;
+	git_config_get_maybe_bool("uploadpack.advertisebundleuris", &advertise_bundle_uri);
+
+cached:
+	return advertise_bundle_uri;
+}
+
+int bundle_uri_command(struct repository *r,
+		       struct packet_reader *request)
+{
+	struct packet_writer writer;
+	packet_writer_init(&writer, 1);
+
+	while (packet_reader_read(request) == PACKET_READ_NORMAL)
+		die(_("bundle-uri: unexpected argument: '%s'"), request->line);
+	if (request->status != PACKET_READ_FLUSH)
+		die(_("bundle-uri: expected flush after arguments"));
+
+	/* TODO: Implement the communication */
+
+	packet_writer_flush(&writer);
+
+	return 0;
+}
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 4dbc269823c..357111ecce8 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -4,6 +4,7 @@
 #include "hashmap.h"
 #include "strbuf.h"
 
+struct packet_reader;
 struct repository;
 struct string_list;
 
@@ -92,6 +93,12 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * API for serve.c.
+ */
+int bundle_uri_advertise(struct repository *r, struct strbuf *value);
+int bundle_uri_command(struct repository *r, struct packet_reader *request);
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/serve.c b/serve.c
index 733347f602a..cbf4a143cfe 100644
--- a/serve.c
+++ b/serve.c
@@ -7,6 +7,7 @@
 #include "protocol-caps.h"
 #include "serve.h"
 #include "upload-pack.h"
+#include "bundle-uri.h"
 
 static int advertise_sid = -1;
 static int client_hash_algo = GIT_HASH_SHA1;
@@ -135,6 +136,11 @@ static struct protocol_capability capabilities[] = {
 		.advertise = always_advertise,
 		.command = cap_object_info,
 	},
+	{
+		.name = "bundle-uri",
+		.advertise = bundle_uri_advertise,
+		.command = bundle_uri_command,
+	},
 };
 
 void protocol_v2_advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 1896f671cb3..f21e5e9d33d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -13,7 +13,7 @@ test_expect_success 'test capability advertisement' '
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
-	cat >expect <<-EOF &&
+	cat >expect.base <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs=unborn
@@ -21,8 +21,11 @@ test_expect_success 'test capability advertisement' '
 	server-option
 	object-format=$(test_oid algo)
 	object-info
+	EOF
+	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -342,4 +345,39 @@ test_expect_success 'basics of object-info' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with uploadpack.advertiseBundleURIs' '
+	test_config uploadpack.advertiseBundleURIs true &&
+
+	cat >expect.extra <<-EOF &&
+	bundle-uri
+	EOF
+	cat expect.base \
+	    expect.extra \
+	    expect.trailer >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'basics of bundle-uri: dies if not enabled' '
+	test-tool pkt-line pack >in <<-EOF &&
+	command=bundle-uri
+	0000
+	EOF
+
+	cat >err.expect <<-\EOF &&
+	fatal: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	cat >expect <<-\EOF &&
+	ERR serve: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	test_must_fail test-tool serve-v2 --stateless-rpc <in >out 2>err.actual &&
+	test_cmp err.expect err.actual &&
+	test_must_be_empty out
+'
+
 test_done
-- 
gitgitgadget



* [PATCH 2/9] bundle-uri client: add minimal NOOP client
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-01  1:07 ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-01  1:07 ` [PATCH 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Set up all the needed client parts of the "bundle-uri" protocol
extension, without actually doing anything with the bundle URIs.

I.e. if the server says it supports "bundle-uri" we'll issue a
command=bundle-uri after command=ls-refs when we're cloning. We'll
parse the returned output using the code already tested in
t5750-bundle-uri-parse.sh.
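
For readers who want a feel for that parsing step, the following shell sketch
splits each response line at the first "=" and rejects lines with no "=", an
empty key, or an empty value. It is an illustration only, not the code in
bundle-uri.c; the function name parse_bundle_line and the sample lines are
hypothetical.

```shell
# Split "<key>=<value>" at the first "="; fail on malformed input.
parse_bundle_line () {
	case "$1" in
	*=*) ;;
	*) return 1 ;;
	esac
	key=${1%%=*}	# everything before the first "="
	value=${1#*=}	# everything after the first "="
	test -n "$key" && test -n "$value" || return 1
	printf '%s -> %s\n' "$key" "$value"
}

# Hypothetical response lines, one of them malformed.
for line in \
	'bundle.version=1' \
	'bundle.mode=all' \
	'garbage' \
	'bundle.base.uri=https://example.com/base.bundle'
do
	parse_bundle_line "$line" ||
	echo "skipping malformed line: $line" >&2
done
```

Splitting at the first "=" rather than the last keeps values that themselves
contain "=", such as URLs with query strings, intact.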

What we aren't doing is actually acting on that data, i.e. downloading
the bundle(s) before we get to doing the command=fetch, and adjusting
our negotiation dialog appropriately. I'll do that in subsequent
commits.

There's a question of what level of encapsulation we should use here;
I've opted to use connect.h in clone.c, but we could also e.g. make
transport_get_remote_refs() invoke this, i.e. make it implicitly get
the bundle-uri list for later steps.

This approach means that we don't "support" this in "git fetch" for
now. I'm starting with the case of initial clones, although as noted
in preceding commits to the protocol documentation nothing about this
approach precludes getting bundles on incremental fetches.

For the t5732-protocol-v2-bundle-uri-http.sh it's not easy to set
environment variables for git-upload-pack (it's started by Apache), so
let's skip the test under T5730_HTTP, and add unused T5730_{FILE,GIT}
prerequisites for consistency and future use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c                        |   7 ++
 bundle-uri.c                           |   4 +
 connect.c                              |  47 ++++++++
 remote.h                               |   5 +
 t/lib-t5730-protocol-v2-bundle-uri.sh  | 148 +++++++++++++++++++++++++
 t/t5730-protocol-v2-bundle-uri-file.sh |  36 ++++++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 +++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 +++
 transport-helper.c                     |  13 +++
 transport-internal.h                   |   7 ++
 transport.c                            |  55 +++++++++
 transport.h                            |  19 ++++
 12 files changed, 375 insertions(+)
 create mode 100644 t/lib-t5730-protocol-v2-bundle-uri.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh

diff --git a/builtin/clone.c b/builtin/clone.c
index 547d6464b3c..edf98295af2 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -27,6 +27,7 @@
 #include "iterator.h"
 #include "sigchain.h"
 #include "branch.h"
+#include "connect.h"
 #include "remote.h"
 #include "run-command.h"
 #include "connected.h"
@@ -1266,6 +1267,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
+	/*
+	 * Populate transport->got_remote_bundle_uri and
+	 * transport->bundle_uri. We might get nothing.
+	 */
+	transport_get_remote_bundle_uri(transport);
+
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 
diff --git a/bundle-uri.c b/bundle-uri.c
index 32022595964..2201b604b11 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -571,6 +571,10 @@ int bundle_uri_advertise(struct repository *r, struct strbuf *value)
 {
 	static int advertise_bundle_uri = -1;
 
+	if (value &&
+	    git_env_bool("GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE", 0))
+		strbuf_addstr(value, "test-unknown-capability-value");
+
 	if (advertise_bundle_uri != -1)
 		goto cached;
 
diff --git a/connect.c b/connect.c
index 5ea53deda23..d39effb7492 100644
--- a/connect.c
+++ b/connect.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "protocol.h"
 #include "alias.h"
+#include "bundle-uri.h"
 
 static char *server_capabilities_v1;
 static struct strvec server_capabilities_v2 = STRVEC_INIT;
@@ -491,6 +492,52 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	}
 }
 
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc)
+{
+	int line_nr = 1;
+
+	/* Assert bundle-uri support */
+	server_supports_v2("bundle-uri", 1);
+
+	/* (Re-)send capabilities */
+	send_capabilities(fd_out, reader);
+
+	/* Send command */
+	packet_write_fmt(fd_out, "command=bundle-uri\n");
+	packet_delim(fd_out);
+
+	/* Send options */
+	if (git_env_bool("GIT_TEST_PROTOCOL_BAD_BUNDLE_URI", 0))
+		packet_write_fmt(fd_out, "test-bad-client\n");
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *line = reader->line;
+		line_nr++;
+
+		if (!bundle_uri_parse_line(bundles, line))
+			continue;
+
+		return error(_("error on bundle-uri response line %d: %s"),
+			     line_nr, line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		return error(_("expected flush after bundle-uri listing"));
+
+	/*
+	 * Might die(), but obscure enough that that's OK, e.g. in
+	 * serve.c we'll call BUG() on its equivalent (the
+	 * PACKET_READ_RESPONSE_END check).
+	 */
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after ref listing"));
+
+	return 0;
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     struct transport_ls_refs_options *transport_options,
diff --git a/remote.h b/remote.h
index 1c4621b414b..1ebbe42792e 100644
--- a/remote.h
+++ b/remote.h
@@ -234,6 +234,11 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
+/* Used for protocol v2 in order to retrieve the bundle list from a remote */
+struct bundle_list;
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
 /*
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
new file mode 100644
index 00000000000..27294e9c976
--- /dev/null
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -0,0 +1,148 @@
+# Included from t573*-protocol-v2-bundle-uri-*.sh
+
+T5730_PARENT=
+T5730_URI=
+T5730_BUNDLE_URI=
+case "$T5730_PROTOCOL" in
+file)
+	T5730_PARENT=file_parent
+	T5730_URI="file://$PWD/file_parent"
+	T5730_BUNDLE_URI="$T5730_URI/fake.bdl"
+	test_set_prereq T5730_FILE
+	;;
+git)
+	. "$TEST_DIRECTORY"/lib-git-daemon.sh
+	start_git_daemon --export-all --enable=receive-pack
+	T5730_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
+	T5730_URI="$GIT_DAEMON_URL/parent"
+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq T5730_GIT
+	;;
+http)
+	. "$TEST_DIRECTORY"/lib-httpd.sh
+	start_httpd
+	T5730_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
+	T5730_URI="$HTTPD_URL/smart/http_parent"
+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq T5730_HTTP
+	;;
+*)
+	BUG "Need to pass valid T5730_PROTOCOL (was $T5730_PROTOCOL)"
+	;;
+esac
+
+test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
+	git init "$T5730_PARENT" &&
+	test_commit -C "$T5730_PARENT" one &&
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
+'
+
+# Poor man's URI escaping. Good enough for the test suite whose trash
+# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
+# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
+test_uri_escape() {
+	sed 's/ /%20/g'
+}
+
+case "$T5730_PROTOCOL" in
+http)
+	test_expect_success "setup config for $T5730_PROTOCOL:// tests" '
+		git -C "$T5730_PARENT" config http.receivepack true
+	'
+	;;
+*)
+	;;
+esac
+T5730_BUNDLE_URI_ESCAPED=$(echo "$T5730_BUNDLE_URI" | test_uri_escape)
+
+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundle-uri" '
+	test_when_finished "rm -f log" &&
+	test_when_finished "git -C \"$T5730_PARENT\" config uploadpack.advertiseBundleURIs true" &&
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	! grep bundle-uri log
+'
+
+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bundle-uri" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep bundle-uri log
+'
+
+test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
+		"$T5730_BUNDLE_URI_ESCAPED" &&
+
+	cat >err.expect <<-\EOF &&
+	Cloning into '"'"'child'"'"'...
+	EOF
+	case "$T5730_PROTOCOL" in
+	file)
+		cat >fatal-bundle-uri.expect <<-\EOF
+		fatal: bundle-uri: unexpected argument: '"'"'test-bad-client'"'"'
+		EOF
+		;;
+	*)
+		cat >fatal.expect <<-\EOF
+		fatal: read error: Connection reset by peer
+		EOF
+		;;
+	esac &&
+
+	test_when_finished "rm -rf child" &&
+	test_must_fail ok=sigpipe env \
+		GIT_TRACE_PACKET="$PWD/log" \
+		GIT_TEST_PROTOCOL_BAD_BUNDLE_URI=true \
+		git -c protocol.version=2 \
+		clone "$T5730_URI" child \
+		>out 2>err &&
+	test_must_be_empty out &&
+
+	grep -v -e ^fatal: -e ^error: err >err.actual &&
+	test_cmp err.expect err.actual &&
+
+	case "$T5730_PROTOCOL" in
+	file)
+		# Due to general race conditions with client/server replies we
+		# may or may not get "fatal: the remote end hung up
+		# unexpectedly" here
+		grep "^fatal: bundle-uri:" err >fatal-bundle-uri.actual &&
+		test_cmp fatal-bundle-uri.expect fatal-bundle-uri.actual
+		;;
+	*)
+		grep "^fatal:" err >fatal.actual &&
+		# Due to the same race conditions this might be
+		# "fatal: read error: Connection reset by peer", "fatal: the remote end
+		# hung up unexpectedly" etc.
+		cat fatal.actual &&
+		test_file_not_empty fatal.actual
+		;;
+	esac &&
+
+	grep "clone> test-bad-client$" log >sent-bad-request &&
+	test_file_not_empty sent-bad-request
+'
diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
new file mode 100755
index 00000000000..89203d3a23c
--- /dev/null
+++ b/t/t5730-protocol-v2-bundle-uri-file.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'file://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+T5730_PROTOCOL=file
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_expect_success "unknown capability value with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE=true \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	grep "> bundle-uri=test-unknown-capability-value" log
+'
+
+test_done
diff --git a/t/t5731-protocol-v2-bundle-uri-git.sh b/t/t5731-protocol-v2-bundle-uri-git.sh
new file mode 100755
index 00000000000..282847b311f
--- /dev/null
+++ b/t/t5731-protocol-v2-bundle-uri-git.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'git://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+T5730_PROTOCOL=git
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_done
diff --git a/t/t5732-protocol-v2-bundle-uri-http.sh b/t/t5732-protocol-v2-bundle-uri-http.sh
new file mode 100755
index 00000000000..fcc1cf3faef
--- /dev/null
+++ b/t/t5732-protocol-v2-bundle-uri-http.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'http://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'http://' transport
+#
+T5730_PROTOCOL=http
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_done
diff --git a/transport-helper.c b/transport-helper.c
index e95267a4ab5..3ea7c2bb5ad 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
 	return ret;
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	get_helper(transport);
+
+	if (process_connect(transport, 0)) {
+		do_take_over(transport);
+		return transport->vtable->get_bundle_uri(transport);
+	}
+
+	return -1;
+}
+
 static struct transport_vtable vtable = {
 	.set_option	= set_helper_option,
 	.get_refs_list	= get_refs_list,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs,
 	.push_refs	= push_refs,
 	.connect	= connect_helper,
diff --git a/transport-internal.h b/transport-internal.h
index c4ca0b733ac..90ea749e5cf 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -26,6 +26,13 @@ struct transport_vtable {
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
 				     struct transport_ls_refs_options *transport_options);
 
+	/**
+	 * Populates the remote side's bundle-uri under protocol v2,
+	 * if the "bundle-uri" capability was advertised. Returns 0 if
+	 * OK, negative values on error.
+	 */
+	int (*get_bundle_uri)(struct transport *transport);
+
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
diff --git a/transport.c b/transport.c
index e7b97194c10..a020adc1f56 100644
--- a/transport.c
+++ b/transport.c
@@ -22,6 +22,7 @@
 #include "protocol.h"
 #include "object-store.h"
 #include "color.h"
+#include "bundle-uri.h"
 
 static int transport_use_color = -1;
 static char transport_colors[][COLOR_MAXLEN] = {
@@ -359,6 +360,25 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	return handshake(transport, for_push, options, 1);
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	struct git_transport_data *data = transport->data;
+	struct packet_reader reader;
+	int stateless_rpc = transport->stateless_rpc;
+
+	if (!transport->bundles) {
+		CALLOC_ARRAY(transport->bundles, 1);
+		init_bundle_list(transport->bundles);
+	}
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	return get_remote_bundle_uri(data->fd[1], &reader,
+				     transport->bundles, stateless_rpc);
+}
+
 static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
@@ -902,6 +922,7 @@ static int disconnect_git(struct transport *transport)
 
 static struct transport_vtable taken_over_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.disconnect	= disconnect_git
@@ -1054,6 +1075,7 @@ static struct transport_vtable bundle_vtable = {
 
 static struct transport_vtable builtin_smart_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.connect	= connect_git,
@@ -1068,6 +1090,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	ret->progress = isatty(2);
 	string_list_init_dup(&ret->pack_lockfiles);
 
+	CALLOC_ARRAY(ret->bundles, 1);
+	init_bundle_list(ret->bundles);
+
 	if (!remote)
 		BUG("No remote provided to transport_get()");
 
@@ -1482,6 +1507,34 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
+int transport_get_remote_bundle_uri(struct transport *transport)
+{
+	const struct transport_vtable *vtable = transport->vtable;
+
+	/* Check config only once. */
+	if (transport->got_remote_bundle_uri++)
+		return 0;
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
+	/*
+	 * This is intentionally below the transport.injectBundleURI,
+	 * we want to be able to inject into protocol v0, or into the
+	 * dialog of a server who doesn't support this.
+	 */
+	if (!vtable->get_bundle_uri)
+		return error(_("bundle-uri operation not supported by protocol"));
+
+	if (vtable->get_bundle_uri(transport) < 0)
+		return error(_("could not retrieve server-advertised bundle-uri list"));
+	return 0;
+}
+
 void transport_unlock_pack(struct transport *transport, unsigned int flags)
 {
 	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
@@ -1512,6 +1565,8 @@ int transport_disconnect(struct transport *transport)
 		ret = transport->vtable->disconnect(transport);
 	if (transport->got_remote_refs)
 		free_refs((void *)transport->remote_refs);
+	clear_bundle_list(transport->bundles);
+	free(transport->bundles);
 	free(transport);
 	return ret;
 }
diff --git a/transport.h b/transport.h
index b5bf7b3e704..85150f504fb 100644
--- a/transport.h
+++ b/transport.h
@@ -62,6 +62,7 @@ enum transport_family {
 	TRANSPORT_FAMILY_IPV6
 };
 
+struct bundle_list;
 struct transport {
 	const struct transport_vtable *vtable;
 
@@ -76,6 +77,18 @@ struct transport {
 	 */
 	unsigned got_remote_refs : 1;
 
+	/**
+	 * Indicates whether we already called get_bundle_uri(); set by
+	 * transport.c::transport_get_remote_bundle_uri().
+	 */
+	unsigned got_remote_bundle_uri : 1;
+
+	/*
+	 * The results of "command=bundle-uri", if both sides support
+	 * the "bundle-uri" capability.
+	 */
+	struct bundle_list *bundles;
+
 	/*
 	 * Transports that call take-over destroys the data specific to
 	 * the transport type while doing so, and cannot be reused.
@@ -281,6 +294,12 @@ void transport_ls_refs_options_release(struct transport_ls_refs_options *opts);
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    struct transport_ls_refs_options *transport_options);
 
+/**
+ * Retrieve bundle URI(s) from a remote. Populates "struct
+ * transport"'s "bundle_uri" and "got_remote_bundle_uri".
+ */
+int transport_get_remote_bundle_uri(struct transport *transport);
+
 /*
  * Fetch the hash algorithm used by a remote.
  *
-- 
gitgitgadget



* [PATCH 3/9] bundle-uri client: add helper for testing server
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-01  1:07 ` [PATCH 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-01  1:07 ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-01  1:07 ` [PATCH 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(), but
that is not so here. Introduce a "got_advertisement" state alongside
"got_remote_heads" to track this. It seems to me that
"got_remote_heads" is badly named in the first place, and the whole
logic of eagerly getting ls-refs on handshake() or not could be
refactored somewhat, but let's not do that now, and instead just add
another self-documenting state variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
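Reader's note (not part of the commit): the state variables in this area
are one-bit bitfields, and transport_get_remote_bundle_uri() guards itself
with a post-increment on such a field. The standalone sketch below is only
an illustration under stated assumptions: "struct conn_state",
run_once_increment() and run_once_assign() are hypothetical names that do
not exist in the series. It shows how an unsigned one-bit field wraps from
1 back to 0 when incremented again, while an explicit assignment keeps the
"already done" state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct carrying one-bit state flags. */
struct conn_state {
	unsigned done_inc : 1;    /* guarded with post-increment */
	unsigned done_assign : 1; /* guarded with explicit assignment */
	int work_inc;             /* how often the guarded work ran */
	int work_assign;
};

/* Guard written as "if (flag++) return": a set flag wraps back to 0. */
static void run_once_increment(struct conn_state *s)
{
	assert(s != NULL);
	if (s->done_inc++)
		return;
	s->work_inc++;
}

/* Guard written with an explicit assignment: the flag stays set. */
static void run_once_assign(struct conn_state *s)
{
	assert(s != NULL);
	if (s->done_assign)
		return;
	s->done_assign = 1;
	s->work_assign++;
}
```

Calling each helper three times on a zero-initialized struct runs the
increment-guarded work twice (first and third call) and the
assignment-guarded work once.
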
 builtin/clone.c                       |  2 +-
 t/helper/test-bundle-uri.c            | 46 +++++++++++++++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh | 63 ++++++++++++++++++++++-----
 transport.c                           | 43 ++++++++++++++----
 transport.h                           |  6 ++-
 5 files changed, 139 insertions(+), 21 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index edf98295af2..22b1e506452 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1271,7 +1271,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 * Populate transport->got_remote_bundle_uri and
 	 * transport->bundle_uri. We might get nothing.
 	 */
-	transport_get_remote_bundle_uri(transport);
+	transport_get_remote_bundle_uri(transport, 1);
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 25afd393428..ffb975b7b4f 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -3,6 +3,10 @@
 #include "bundle-uri.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "transport.h"
+#include "ref-filter.h"
+#include "remote.h"
+#include "refs.h"
 
 enum input_mode {
 	KEY_VALUE_PAIRS,
@@ -68,6 +72,46 @@ usage:
 	usage_with_options(usage, options);
 }
 
+static int cmd_ls_remote(int argc, const char **argv)
+{
+	const char *uploadpack = NULL;
+	struct string_list server_options = STRING_LIST_INIT_DUP;
+	const char *dest;
+	struct remote *remote;
+	struct transport *transport;
+	int status = 0;
+
+	dest = argc > 1 ? argv[1] : NULL;
+
+	remote = remote_get(dest);
+	if (!remote) {
+		if (dest)
+			die(_("bad repository '%s'"), dest);
+		die(_("no remote configured to get bundle URIs from"));
+	}
+	if (!remote->url_nr)
+		die(_("remote '%s' has no configured URL"), dest);
+
+	transport = transport_get(remote, NULL);
+	if (uploadpack)
+		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
+	if (server_options.nr)
+		transport->server_options = &server_options;
+
+	if (transport_get_remote_bundle_uri(transport, 0) < 0) {
+		error(_("could not get the bundle-uri list"));
+		status = 1;
+		goto cleanup;
+	}
+
+	print_bundle_list(stdout, transport->bundles);
+
+cleanup:
+	if (transport_disconnect(transport))
+		return 1;
+	return status;
+}
+
 int cmd__bundle_uri(int argc, const char **argv)
 {
 	const char *usage[] = {
@@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
 	if (!strcmp(argv[1], "parse-config"))
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
+	if (!strcmp(argv[1], "ls-remote"))
+		return cmd_ls_remote(argc - 1, argv + 1);
 	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
 
 usage:
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index 27294e9c976..c327544641b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -34,7 +34,9 @@ esac
 test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
 	git init "$T5730_PARENT" &&
 	test_commit -C "$T5730_PARENT" one &&
-	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true &&
+	git -C "$T5730_PARENT" config bundle.version 1 &&
+	git -C "$T5730_PARENT" config bundle.mode all
 '
 
 # Poor man's URI escaping. Good enough for the test suite whose trash
@@ -61,9 +63,8 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundl
 	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
-	git \
-		-c protocol.version=2 \
-		ls-remote --symref "$T5730_URI" \
+	test-tool bundle-uri \
+		ls-remote "$T5730_URI" \
 		>actual 2>err &&
 
 	# Server responded using protocol v2
@@ -76,12 +77,11 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
 	test_when_finished "rm -f log" &&
 
 	test_config -C "$T5730_PARENT" \
-		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
-	git \
-		-c protocol.version=2 \
-		ls-remote --symref "$T5730_URI" \
+	test-tool bundle-uri \
+		ls-remote "$T5730_URI" \
 		>actual 2>err &&
 
 	# Server responded using protocol v2
@@ -94,8 +94,8 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
 test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
 	test_when_finished "rm -f log" &&
 
-	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
-		"$T5730_BUNDLE_URI_ESCAPED" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
 
 	cat >err.expect <<-\EOF &&
 	Cloning into '"'"'child'"'"'...
@@ -146,3 +146,46 @@ test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protoc
 	grep "clone> test-bad-client$" log >sent-bad-request &&
 	test_file_not_empty sent-bad-request
 '
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and extra data" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	# Extra data should be ignored
+	test_config -C "$T5730_PARENT" bundle.only.extra bogus &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
diff --git a/transport.c b/transport.c
index a020adc1f56..86460f5be28 100644
--- a/transport.c
+++ b/transport.c
@@ -198,6 +198,7 @@ struct git_transport_data {
 	struct git_transport_options options;
 	struct child_process *conn;
 	int fd[2];
+	unsigned got_advertisement : 1;
 	unsigned got_remote_heads : 1;
 	enum protocol_version version;
 	struct oid_array extra_have;
@@ -346,6 +347,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 		BUG("unknown protocol version");
 	}
 	data->got_remote_heads = 1;
+	data->got_advertisement = 1;
 	transport->hash_algo = reader.hash_algo;
 
 	if (reader.line_peeked)
@@ -371,6 +373,33 @@ static int get_bundle_uri(struct transport *transport)
 		init_bundle_list(transport->bundles);
 	}
 
+	if (!data->got_advertisement) {
+		struct ref *refs;
+		struct git_transport_data *data = transport->data;
+		enum protocol_version version;
+
+		refs = handshake(transport, 0, NULL, 0);
+		version = data->version;
+
+		switch (version) {
+		case protocol_v2:
+			assert(!refs);
+			break;
+		case protocol_v0:
+		case protocol_v1:
+		case protocol_unknown_version:
+			assert(refs);
+			break;
+		}
+	}
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
 	packet_reader_init(&reader, data->fd[0], NULL, 0,
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
@@ -1507,7 +1536,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
-int transport_get_remote_bundle_uri(struct transport *transport)
+int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 {
 	const struct transport_vtable *vtable = transport->vtable;
 
@@ -1515,20 +1544,16 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 	if (transport->got_remote_bundle_uri++)
 		return 0;
 
-	/*
-	 * "Support" protocol v0 and v2 without bundle-uri support by
-	 * silently degrading to a NOOP.
-	 */
-	if (!server_supports_v2("bundle-uri", 0))
-		return 0;
-
 	/*
 	 * This is intentionally below the transport.injectBundleURI,
 	 * we want to be able to inject into protocol v0, or into the
 	 * dialog of a server who doesn't support this.
 	 */
-	if (!vtable->get_bundle_uri)
+	if (!vtable->get_bundle_uri) {
+		if (quiet)
+			return -1;
 		return error(_("bundle-uri operation not supported by protocol"));
+	}
 
 	if (vtable->get_bundle_uri(transport) < 0)
 		return error(_("could not retrieve server-advertised bundle-uri list"));
diff --git a/transport.h b/transport.h
index 85150f504fb..dd0115b83bf 100644
--- a/transport.h
+++ b/transport.h
@@ -297,8 +297,12 @@ const struct ref *transport_get_remote_refs(struct transport *transport,
 /**
  * Retrieve bundle URI(s) from a remote. Populates "struct
  * transport"'s "bundle_uri" and "got_remote_bundle_uri".
+ *
+ * With `quiet=1` it will not complain if the server doesn't support
+ * the protocol, but only if we discover the server uses it, and
+ * encounter issues then.
  */
-int transport_get_remote_bundle_uri(struct transport *transport);
+int transport_get_remote_bundle_uri(struct transport *transport, int quiet);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
gitgitgadget



* [PATCH 4/9] bundle-uri: serve bundle.* keys from config
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-01  1:07 ` Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
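Reader's note (not part of the commit): a minimal sketch of what one
advertised pair looks like on the wire, assuming the usual pkt-line
framing of four hex digits for the total length (header included)
followed by the payload. The helper bundle_key_to_pkt_line() is a
hypothetical name for illustration. The patch itself writes to the
writer's file descriptor with packet_write_fmt(), whereas this sketch
only formats into a caller-supplied buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "key=value" as a pkt-line into out. Returns the number of
 * bytes in the pkt-line, or -1 if the key is not in the "bundle."
 * namespace or the line does not fit.
 */
static int bundle_key_to_pkt_line(const char *key, const char *value,
				  char *out, size_t outlen)
{
	size_t payload = strlen(key) + 1 + strlen(value); /* key '=' value */
	size_t total = payload + 4;                       /* plus length header */

	if (strncmp(key, "bundle.", 7))
		return -1; /* only bundle.* keys are advertised */
	if (total > 65520 || total + 1 > outlen)
		return -1; /* too long for a pkt-line or for the buffer */

	snprintf(out, outlen, "%04x%s=%s", (unsigned)total, key, value);
	return (int)total;
}
```

With key "bundle.version" and value "1" the payload is 16 bytes, so the
line is 20 bytes long and starts with "0014". A key such as
"uploadpack.advertiseBundleURIs" is rejected by the prefix check.
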
 bundle-uri.c                          | 16 +++++++++++-
 t/lib-t5730-protocol-v2-bundle-uri.sh | 35 +++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 2201b604b11..3469f1aaa98 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -585,6 +585,16 @@ cached:
 	return advertise_bundle_uri;
 }
 
+static int config_to_packet_line(const char *key, const char *value, void *data)
+{
+	struct packet_writer *writer = data;
+
+	if (!strncmp(key, "bundle.", 7))
+		packet_write_fmt(writer->dest_fd, "%s=%s", key, value);
+
+	return 0;
+}
+
 int bundle_uri_command(struct repository *r,
 		       struct packet_reader *request)
 {
@@ -596,7 +606,11 @@ int bundle_uri_command(struct repository *r,
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("bundle-uri: expected flush after arguments"));
 
-	/* TODO: Implement the communication */
+	/*
+	 * Send all "bundle.*" config values to the client as key=value
+	 * packet lines.
+	 */
+	git_config(config_to_packet_line, &writer);
 
 	packet_writer_flush(&writer);
 
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index c327544641b..000fcc5e20b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -158,6 +158,8 @@ test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $T5730_BUNDLE_URI_ESCAPED
 	EOF
 	GIT_TRACE_PACKET="$PWD/log" \
 	test-tool bundle-uri \
@@ -181,6 +183,39 @@ test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and ext
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $T5730_BUNDLE_URI_ESCAPED
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 with list" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle1.uri "$T5730_BUNDLE_URI_ESCAPED-1.bdl" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle2.uri "$T5730_BUNDLE_URI_ESCAPED-2.bdl" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle3.uri "$T5730_BUNDLE_URI_ESCAPED-3.bdl" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "bundle1"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-1.bdl
+	[bundle "bundle2"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-2.bdl
+	[bundle "bundle3"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-3.bdl
 	EOF
 	GIT_TRACE_PACKET="$PWD/log" \
 	test-tool bundle-uri \
-- 
gitgitgadget



* [PATCH 5/9] bundle-uri client: add boolean transfer.bundleURI setting
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
@ 2022-11-01  1:07 ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-01  1:07 ` [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory() Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

To enable this setting by default in the correct tests, add a
GIT_TEST_BUNDLE_URI environment variable.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
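Reader's note (not part of the commit): the gate added to
transport_get_remote_bundle_uri() can be read as a small truth table.
The sketch below restates it as a pure function. bundle_uri_enabled()
and its parameter names are hypothetical, and "lookup_failed" assumes
the convention that the config lookup returns nonzero when the key is
unset.

```c
#include <assert.h> /* for the usage checks described below */

/*
 * env_enabled:   GIT_TEST_BUNDLE_URI read as a boolean
 * lookup_failed: nonzero when transfer.bundleURI is not set at all
 * config_value:  the configured boolean, meaningful only when found
 */
static int bundle_uri_enabled(int env_enabled, int lookup_failed,
			      int config_value)
{
	if (env_enabled)
		return 1; /* the test override wins */
	if (lookup_failed || !config_value)
		return 0; /* unset or false: ignore the advertisement */
	return 1;
}
```

An unset key and an explicit 'false' both disable the feature, and the
environment variable enables it regardless of config. For example,
bundle_uri_enabled(0, 1, 0) is 0 while bundle_uri_enabled(1, 1, 0) is 1.
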
 Documentation/config/transfer.txt     |  6 ++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh |  3 +++
 transport.c                           | 10 +++++++---
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index 264812cca4d..c3ac767d1e4 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -115,3 +115,9 @@ transfer.unpackLimit::
 transfer.advertiseSID::
 	Boolean. When true, client and server processes will advertise their
 	unique session IDs to their remote counterpart. Defaults to false.
+
+transfer.bundleURI::
+	When `true`, local `git clone` commands will request bundle
+	information from the remote server (if advertised) and download
+	bundles before continuing the clone through the Git protocol.
+	Defaults to `false`.
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index 000fcc5e20b..872bc39ad1b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -1,5 +1,8 @@
 # Included from t573*-protocol-v2-bundle-uri-*.sh
 
+GIT_TEST_BUNDLE_URI=1
+export GIT_TEST_BUNDLE_URI
+
 T5730_PARENT=
 T5730_URI=
 T5730_BUNDLE_URI=
diff --git a/transport.c b/transport.c
index 86460f5be28..b33180226ae 100644
--- a/transport.c
+++ b/transport.c
@@ -1538,6 +1538,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 
 int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 {
+	int value = 0;
 	const struct transport_vtable *vtable = transport->vtable;
 
 	/* Check config only once. */
@@ -1545,10 +1546,13 @@ int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 		return 0;
 
 	/*
-	 * This is intentionally below the transport.injectBundleURI,
-	 * we want to be able to inject into protocol v0, or into the
-	 * dialog of a server who doesn't support this.
+	 * Don't use bundle-uri at all, if configured not to. Only proceed
+	 * if GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
 	 */
+	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
+	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
+		return 0;
+
 	if (!vtable->get_bundle_uri) {
 		if (quiet)
 			return -1;
-- 
gitgitgadget



* [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory()
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-01  1:07 ` Derrick Stolee via GitGitGadget
  2022-11-03  9:28   ` Phillip Wood
  2022-11-03  9:49   ` Ævar Arnfjörð Bjarmason
  2022-11-01  1:07 ` [PATCH 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.

Re-add the method, this time in strbuf.c. The method requirements are
slightly modified to allow a trailing slash, in which case nothing is
done. The return value is the number of bytes removed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
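Reader's note (not part of the commit): a standalone sketch of the same
trimming idea on a plain NUL-terminated buffer, assuming POSIX-style '/'
separators only. parent_directory() is a hypothetical helper, and it
uses strrchr() in place of git's offset_1st_component() and
find_last_dir_sep(), so it is an approximation rather than the function
added below.

```c
#include <assert.h> /* for the usage checks described below */
#include <string.h>

/*
 * Truncate path at its last '/' while keeping a leading root slash.
 * Returns the number of bytes removed.
 */
static size_t parent_directory(char *path)
{
	size_t len = strlen(path);
	size_t offset = path[0] == '/' ? 1 : 0; /* protect the root slash */
	char *sep = strrchr(path + offset, '/');
	size_t new_len = sep ? (size_t)(sep - path) : offset;

	path[new_len] = '\0';
	return len - new_len;
}
```

In this sketch "/a/b/c" becomes "/a/b" (2 bytes removed) and "/a"
becomes "/" (1 byte removed). A path with a trailing separator such as
"a/b/" only loses the separator itself, leaving "a/b".
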
 strbuf.c | 9 +++++++++
 strbuf.h | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 0890b1405c5..b5cb324c431 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1200,3 +1200,12 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 	free(path2);
 	return res;
 }
+
+size_t strbuf_parent_directory(struct strbuf *buf)
+{
+	size_t len = buf->len;
+	size_t offset = offset_1st_component(buf->buf);
+	char *path_sep = find_last_dir_sep(buf->buf + offset);
+	strbuf_setlen(buf, path_sep ? path_sep - buf->buf : offset);
+	return len - buf->len;
+}
diff --git a/strbuf.h b/strbuf.h
index 76965a17d44..8a964a08c31 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -664,6 +664,13 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
 int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			      const char *const *env);
 
+/*
+ * Remove the deepest subdirectory in the provided path string. If path
+ * contains a trailing separator, then the path is considered a directory
+ * and nothing is modified.
+ */
+size_t strbuf_parent_directory(struct strbuf *buf);
+
 void strbuf_add_lines(struct strbuf *sb,
 		      const char *prefix,
 		      const char *buf,
-- 
gitgitgadget



* [PATCH 7/9] bundle-uri: allow relative URLs in bundle lists
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory() Derrick Stolee via GitGitGadget
@ 2022-11-01  1:07 ` Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Bundle providers may want to distribute their bundle data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles.
This allows easier distribution of bundle data.
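
As a rough illustration of the intended resolution (a minimal sketch,
not Git's relative_url(); the helper name resolve_bundle_uri() and the
fixed 1024-byte scratch buffer are hypothetical), a 'uri' value with a
scheme is used as-is, and otherwise each leading "../" drops one
trailing component of the base before the two are joined:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: resolve 'rel' against
 * 'base' and write the result into 'out'. Returns 0 on success and
 * -1 if the result does not fit.
 */
static int resolve_bundle_uri(const char *base, const char *rel,
			      char *out, size_t outlen)
{
	char dir[1024];
	size_t len;
	int n;

	/* A value with a scheme is treated as absolute and used as-is. */
	if (strstr(rel, "://")) {
		n = snprintf(out, outlen, "%s", rel);
		return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
	}

	len = strlen(base);
	if (len >= sizeof(dir))
		return -1;
	memcpy(dir, base, len + 1);

	/* Ignore trailing slashes on the base. */
	while (len && dir[len - 1] == '/')
		dir[--len] = '\0';

	/* Each leading "../" removes one trailing component of the base. */
	while (!strncmp(rel, "../", 3)) {
		char *sep = strrchr(dir, '/');
		if (sep)
			*sep = '\0';
		else
			dir[0] = '\0';
		rel += 3;
	}

	if (dir[0])
		n = snprintf(out, outlen, "%s/%s", dir, rel);
	else
		n = snprintf(out, outlen, "%s", rel);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a base of "<uri>", this sketch maps "bundle.bdl" to
"<uri>/bundle.bdl" and "../bundle.bdl" to "bundle.bdl". It makes no
attempt to protect the scheme or host from too many "../" segments.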

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c                | 16 ++++++++++-
 bundle-uri.h                |  9 +++++++
 t/helper/test-bundle-uri.c  |  2 ++
 t/t5750-bundle-uri-parse.sh | 54 +++++++++++++++++++++++++++++++++++++
 transport.c                 |  3 +++
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 3469f1aaa98..0f3902bbd2b 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -7,6 +7,7 @@
 #include "hashmap.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "remote.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -49,6 +50,7 @@ void clear_bundle_list(struct bundle_list *list)
 
 	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
 	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+	free(list->baseURI);
 }
 
 int for_all_bundles_in_list(struct bundle_list *list,
@@ -163,7 +165,7 @@ static int bundle_list_update(const char *key, const char *value,
 	if (!strcmp(subkey, "uri")) {
 		if (bundle->uri)
 			return -1;
-		bundle->uri = xstrdup(value);
+		bundle->uri = relative_url(list->baseURI, value, NULL);
 		return 0;
 	}
 
@@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
+	if (!list->baseURI) {
+		struct strbuf baseURI = STRBUF_INIT;
+		strbuf_addstr(&baseURI, uri);
+
+		/*
+		 * If the URI does not end with a trailing slash, then
+		 * remove the filename portion of the path. This is
+		 * important for relative URIs.
+		 */
+		strbuf_parent_directory(&baseURI);
+		list->baseURI = strbuf_detach(&baseURI, NULL);
+	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
diff --git a/bundle-uri.h b/bundle-uri.h
index 357111ecce8..7905e56732c 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -61,6 +61,15 @@ struct bundle_list {
 	int version;
 	enum bundle_list_mode mode;
 	struct hashmap bundles;
+
+	/**
+	 * The baseURI of a bundle_list is used as the base for any
+	 * relative URIs advertised by the bundle list at that location.
+	 *
+	 * When the list is generated from a Git server, then use that
+	 * server's location.
+	 */
+	char *baseURI;
 };
 
 void init_bundle_list(struct bundle_list *list);
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index ffb975b7b4f..5aa0b494ce3 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 	init_bundle_list(&list);
 
+	list.baseURI = xstrdup("<uri>");
+
 	switch (mode) {
 	case KEY_VALUE_PAIRS:
 		if (argc != 1)
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index c2fe3f9c5a5..ed5262a8d2b 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -30,6 +30,30 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'bundle_uri_parse_line(): relative URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
 	cat >in <<-\EOF &&
 	=bogus-value
@@ -136,6 +160,36 @@ test_expect_success 'parse config format: just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: relative URIs' '
+	cat >in <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = bundle.bdl
+	[bundle "two"]
+		uri = ../bundle.bdl
+	[bundle "three"]
+		uri = sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'parse config format edge cases: empty key or value' '
 	cat >in1 <<-\EOF &&
 	= bogus-value
diff --git a/transport.c b/transport.c
index b33180226ae..2c4ff0c2023 100644
--- a/transport.c
+++ b/transport.c
@@ -1553,6 +1553,9 @@ int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
 		return 0;
 
+	if (!transport->bundles->baseURI)
+		transport->bundles->baseURI = xstrdup(transport->url);
+
 	if (!vtable->get_bundle_uri) {
 		if (quiet)
 			return -1;
-- 
gitgitgadget


* [PATCH 8/9] bundle-uri: download bundles from an advertised list
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-11-01  1:07 ` Derrick Stolee via GitGitGadget
  2022-11-01  1:07 ` [PATCH 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  9 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 verb. To actually download
and unbundle the advertised bundles, we need a different mechanism.

Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.list.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.

In an identical way to fetch_bundle_uri(), the bundles are unbundled
after all of the bundle lists have been expanded and all necessary URIs
have been downloaded.
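
To make the difference between the two modes concrete, here is a
minimal sketch (not the code in this patch; download_list(),
stub_download() and the LIST_MODE_* names are hypothetical, and
continuing past failed downloads in "all" mode is an assumption of the
sketch). In "any" mode the loop stops after the first successful
download, while in "all" mode every URI is attempted:

```c
#include <stddef.h>
#include <string.h>

enum list_mode { LIST_MODE_ALL, LIST_MODE_ANY };

/* Returns 0 when the download of 'uri' succeeded. */
typedef int (*download_fn)(const char *uri, void *data);

/*
 * Attempt the URIs in order and return how many downloads succeeded.
 * LIST_MODE_ANY stops after the first success; LIST_MODE_ALL tries
 * every URI.
 */
static size_t download_list(const char **uris, size_t nr,
			    enum list_mode mode,
			    download_fn download, void *data)
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++) {
		if (mode == LIST_MODE_ANY && count)
			break;
		if (!download(uris[i], data))
			count++;
	}
	return count;
}

/*
 * Stand-in downloader for illustration: fails for URIs containing
 * "broken" and counts attempts in the int that 'data' points to.
 */
static int stub_download(const char *uri, void *data)
{
	int *attempts = data;

	if (attempts)
		(*attempts)++;
	return strstr(uri, "broken") ? -1 : 0;
}
```

Given one broken URI followed by two working ones, "any" mode makes two
attempts and keeps one bundle, while "all" mode makes three attempts
and keeps two.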

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c | 21 +++++++++++++++++++++
 bundle-uri.h | 11 +++++++++++
 2 files changed, 32 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index 0f3902bbd2b..0b0081ed269 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -577,6 +577,27 @@ cleanup:
 	return result;
 }
 
+int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
+{
+	int result;
+	struct bundle_list global_list;
+
+	init_bundle_list(&global_list);
+
+	/* If a bundle is added to this global list, then it is required. */
+	global_list.mode = BUNDLE_MODE_ALL;
+
+	if ((result = download_bundle_list(r, list, &global_list, 0)))
+		goto cleanup;
+
+	result = unbundle_all_bundles(r, &global_list);
+
+cleanup:
+	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
+	clear_bundle_list(&global_list);
+	return result;
+}
+
 /**
  * API for serve.c.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 7905e56732c..a75b68d2f5a 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -102,6 +102,17 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * Given a bundle list that was already advertised (likely by the
+ * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
+ * bundles according to the bundle strategy of that list.
+ *
+ * Returns non-zero if no bundle information is found at the given 'uri'.
+ */
+int fetch_bundle_list(struct repository *r,
+		      const char *uri,
+		      struct bundle_list *list);
+
 /**
  * API for serve.c.
  */
-- 
gitgitgadget


* [PATCH 9/9] clone: unbundle the advertised bundles
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (7 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-11-01  1:07 ` Derrick Stolee via GitGitGadget
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  9 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-01  1:07 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 verb, when advertised _and_ when
the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c  | 26 +++++++++++++++++----
 t/t5601-clone.sh | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 5 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 22b1e506452..09f10477ed6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1267,11 +1267,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
-	/*
-	 * Populate transport->got_remote_bundle_uri and
-	 * transport->bundle_uri. We might get nothing.
-	 */
-	transport_get_remote_bundle_uri(transport, 1);
+	if (!bundle_uri) {
+		/*
+		* Populate transport->got_remote_bundle_uri and
+		* transport->bundle_uri. We might get nothing.
+		*/
+		transport_get_remote_bundle_uri(transport, 1);
+
+		if (transport->bundles &&
+		    hashmap_get_size(&transport->bundles->bundles)) {
+			/* At this point, we need the_repository to match the cloned repo. */
+			if (repo_init(the_repository, git_dir, work_tree))
+				warning(_("failed to initialize the repo, skipping bundle URI"));
+			if (fetch_bundle_list(the_repository,
+					      remote->url[0],
+					      transport->bundles))
+				warning(_("failed to fetch advertised bundles"));
+		} else {
+			clear_bundle_list(transport->bundles);
+			FREE_AND_NULL(transport->bundles);
+		}
+	}
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 45f0803ed4d..d1d8139751e 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -795,6 +795,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
 	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
 '
 
+test_expect_success 'auto-discover bundle URI from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo2.git repo2 &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
+test_expect_success 'auto-discover multiple bundles from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	test_commit -C src new &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.mode all &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.new.uri "$HTTPD_URL/new.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo3.git repo3 &&
+
+	# We should fetch _both_ bundles
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
gitgitgadget

* Re: [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory()
  2022-11-01  1:07 ` [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory() Derrick Stolee via GitGitGadget
@ 2022-11-03  9:28   ` Phillip Wood
  2022-11-03  9:49   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 87+ messages in thread
From: Phillip Wood @ 2022-11-03  9:28 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Hi Stolee

On 01/11/2022 01:07, Derrick Stolee via GitGitGadget wrote:
> +size_t strbuf_parent_directory(struct strbuf *buf)
> +{
> +	size_t len = buf->len;
> +	size_t offset = offset_1st_component(buf->buf);
> +	char *path_sep = find_last_dir_sep(buf->buf + offset);
> +	strbuf_setlen(buf, path_sep ? path_sep - buf->buf : offset);
> +	return len - buf->len;
> +}
> diff --git a/strbuf.h b/strbuf.h
> index 76965a17d44..8a964a08c31 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -664,6 +664,13 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
>   int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>   			      const char *const *env);
>   
> +/*
> + * Remove the deepest subdirectory in the provided path string. If path
> + * contains a trailing separator, then the path is considered a directory
> + * and nothing is modified.
> + */

I found the name and description a bit confusing. If I've understood
correctly, it isn't really removing the deepest subdirectory but rather
the file name, if there is one, from the path. As such, perhaps
strbuf_strip_filename() might make it clearer that it is a no-op if the
path ends with a slash. It also returns the number of characters removed,
but that is undocumented.
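
To check my reading, here is a standalone sketch of the same logic on a
plain C string. It is not the strbuf code: strip_last_component() is a
hypothetical name, and it assumes POSIX-style helpers, i.e. that
offset_1st_component() is 1 for a leading '/' and 0 otherwise, and that
find_last_dir_sep() behaves like strrchr(path, '/').

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the function under review: truncate
 * 'path' in place at its last '/' (or at the first-component offset
 * if there is none after it) and return the number of characters
 * removed.
 */
static size_t strip_last_component(char *path)
{
	size_t len = strlen(path);
	/* Assumed POSIX behaviour of offset_1st_component(). */
	size_t offset = (path[0] == '/') ? 1 : 0;
	/* Assumed POSIX behaviour of find_last_dir_sep(). */
	char *sep = strrchr(path + offset, '/');
	size_t new_len = sep ? (size_t)(sep - path) : offset;

	path[new_len] = '\0';
	return len - new_len;
}
```

Under those assumptions "a/b/c.txt" becomes "a/b" and 6 is returned.
Note that "a/b/" becomes "a/b" in this sketch, so a trailing separator
is itself stripped rather than left untouched; whether the real function
is a true no-op in that case depends on the helpers behaving differently
from what I assumed here.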

Best Wishes

Phillip

> +size_t strbuf_parent_directory(struct strbuf *buf);
> +
>   void strbuf_add_lines(struct strbuf *sb,
>   		      const char *prefix,
>   		      const char *buf,

* Re: [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory()
  2022-11-01  1:07 ` [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory() Derrick Stolee via GitGitGadget
  2022-11-03  9:28   ` Phillip Wood
@ 2022-11-03  9:49   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-03  9:49 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee


On Tue, Nov 01 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>

> +/*

Please use a "/**" opening comment in strbuf.h, i.e. an API-doc comment.

I see just above what you're adding we got it wrong :(, but most docs
for functions in that file correctly use that.

Some recent background at:
https://lore.kernel.org/git/220928.86pmffwmft.gmgdl@evledraar.gmail.com/

* Re: [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton
  2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-08 17:08   ` SZEDER Gábor
  2022-11-11  1:59   ` Victoria Dye
  1 sibling, 0 replies; 87+ messages in thread
From: SZEDER Gábor @ 2022-11-08 17:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

On Tue, Nov 01, 2022 at 01:07:26AM +0000, Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> An earlier version of this patch [1] used a different transfer format
> than the "key=value" pairs in the current implementation. The change was
> made to unify the protocol v2 verb with the bundle lists provided by

s/verb/command/ here, and in a couple of later commit messages and an
in-code comment as well.

As mentioned before on a previous bundle-uri patch series, protocol v2
has commands, not verbs.


* Re: [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton
  2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-08 17:08   ` SZEDER Gábor
@ 2022-11-11  1:59   ` Victoria Dye
  2022-11-16 14:08     ` Derrick Stolee
  1 sibling, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-11  1:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> +
> +PROTOCOL for bundle-uri
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +A `bundle-uri` request takes no arguments, and as noted above does not
> +currently advertise a capability value. Both may be added in the
> +future.
> +
> +When the client issues a `command=bundle-uri` the response is a list of

nit: comma after `command=bundle-uri`

I misread this a couple times dropping the "the", so it read like the
`command=bundle-uri` was the *response*, not the request. I think the comma
would help avoid that?

> +key-value pairs provided as packet lines with value `<key>=<value>`. The
> +meaning of these key-value pairs are provided by the config keys in the
> +`bundle.*` namespace (see linkgit:git-config[1]).

What does this ("the meaning of these key-value pairs are provided by the
config keys...") mean? Are the response keys in the `bundle.*` namespace? Or
do the client-side `bundle.*` keys provide some kind of translation of what
the keys mean?

> +
> +Clients are still expected to fully parse the line according to the
> +above format, lines that do not conform to the format SHOULD be
> +discarded. The user MAY be warned in such a case.

Why "still" - is there some reason they *wouldn't* parse the response line? 

Is "the above format" referring to `<key>=<value>` in general, or restricted
to/guaranteed that the `<key>`'s defined by the `bundle.*` namespace? I'm
guessing "still expected to fully parse" == "MUST parse" (using
MUST/SHOULD/MAY nomenclature), it would help to call that out explicitly to
be consistent with the rest of this doc.

> +
> +bundle-uri CLIENT AND SERVER EXPECTATIONS
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +URI CONTENTS::
> +The advertised URIs MUST be in one of two possible formats.
> ++
> +The first possible format is a bundle file that `git bundle verify`

I don't think "format" is the right word to describe this (I'd reserve
"format" for the literal format of the URI string). Maybe something like
this?

	The advertised URIs MUST contain one of two types of content.

	The advertised URI may contain a bundle file that `git bundle
	verify` would accept...

	...
	
	Alternatively, the advertised URI may provide a plaintext file...

> +would accept. I.e. they MUST contain one or more reference tips for
> +use by the client, MUST indicate prerequisites (in any) with standard
> +"-" prefixes, and MUST indicate their "object-format", if
> +applicable. Create "*.bundle" files with `git bundle create`.

The last sentence doesn't add anything that the `git bundle verify` note
in the first doesn't already tell you, and feels like a bit of a
non-sequitur as a result. Although, it tangentially raises a
question: do bundle files *have to* have the '.bundle' suffix to pass `git
bundle verify`? If not, are they expected to when coming from these URIs?

> ++
> +The second possible format is a plaintext file that `git config --list`
> +would accept (with the `--file` option). The key-value pairs in this list
> +are in the `bundle.*` namespace (see linkgit:git-config[1]).
> +
> +bundle-uri CLIENT ERROR RECOVERY::
> +A client MUST above all gracefully degrade on errors, whether that
> +error is because of bad missing/data in the bundle URI(s), because
> +that client is too dumb to e.g. understand and fully parse out bundle
> +headers and their prerequisite relationships, or something else.
> ++
> +Server operators should feel confident in turning on "bundle-uri" and
> +not worry if e.g. their CDN goes down that clones or fetches will run
> +into hard failures. Even if the server bundle bundle(s) are
> +incomplete, or bad in some way the client should still end up with a
> +functioning repository, just as if it had chosen not to use this
> +protocol extension.
> ++
> +All subsequent discussion on client and server interaction MUST keep
> +this in mind.
> +
> +bundle-uri SERVER TO CLIENT::
> +The ordering of the returned bundle uris is not significant. Clients
> +MUST parse their headers to discover their contained OIDS and
> +prerequisites. A client MUST consider the content of the bundle(s)
> +themselves and their header as the ultimate source of truth.
> ++
> +A server MAY even return bundle(s) that don't have any direct
> +relationship to the repository being cloned (either through accident,
> +or intentional "clever" configuration), and expect a client to sort
> +out what data they'd like from the bundle(s), if any.
> +
> +bundle-uri CLIENT TO SERVER::
> +The client SHOULD provide reference tips found in the bundle header(s)
> +as 'have' lines in any subsequent `fetch` request. A client MAY also
> +ignore the bundle(s) entirely if doing so is deemed worse for some
> +reason, e.g. if the bundles can't be downloaded, it doesn't like the
> +tips it finds etc.
> +
> +WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
> +If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
> +of the bundle(s) the client finds that the ref tips it wants can be
> +retrieved entirety from advertised bundle(s), it MAY disconnect. The

s/entirety/entirely

And to clarify, by "it MAY disconnect", you mean it may disconnect from the
main repository server? Or the bundle server? 

> +results of such a 'clone' or 'fetch' should be indistinguishable from
> +the state attained without using bundle-uri.
> +
> +EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
> +A client MAY perform an early disconnect while still downloading the
> +bundle(s) (having streamed and parsed their headers). In such a case
> +the client MUST gracefully recover from any errors related to
> +finishing the download and validation of the bundle(s).
> ++
> +I.e. a client might need to re-connect and issue a 'fetch' command,
> +and possibly fall back to not making use of 'bundle-uri' at all.
> ++
> +This "MAY" behavior is specified as such (and not a "SHOULD") on the
> +assumption that a server advertising bundle uris is more likely than
> +not to be serving up a relatively large repository, and to be pointing
> +to URIs that have a good chance of being in working order. A client
> +MAY e.g. look at the payload size of the bundles as a heuristic to see
> +if an early disconnect is worth it, should falling back on a full
> +"fetch" dialog be necessary.
> +
> +WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
> +A client SHOULD commence a negotiation of a PACK from the server via
> +the "fetch" command using the OID tips found in advertised bundles,
> +even if's still in the process of downloading those bundle(s).
> ++
> +This allows for aggressive early disconnects from any interactive
> +server dialog. The client blindly trusts that the advertised OID tips
> +are relevant, and issues them as 'have' lines, it then requests any
> +tips it would like (usually from the "ls-refs" advertisement) via
> +'want' lines. The server will then compute a (hopefully small) PACK
> +with the expected difference between the tips from the bundle(s) and
> +the data requested.
> ++
> +The only connection the client then needs to keep active is to the
> +concurrently downloading static bundle(s), when those and the
> +incremental PACK are retrieved they should be inflated and
> +validated. Any errors at this point should be gracefully recovered
> +from, see above.
> +
> +bundle-uri PROTOCOL FEATURES
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +As noted above the `<key>=<value>` definitions are documented by the
> +`bundle.*` config namespace.

Same comment as earlier - this is a confusing way to phrase this. If you
mean "the keys are part of the `bundle.*` namespace documented in
linkgit:git-config[1]", I think you can just say that directly. If not, it
would help to clarify the relationship between the `bundle.*` namespace and
these keys.

> +
> +In particular, the `bundle.version` key specifies an integer value. The
> +only accepted value at the moment is `1`, but if the client sees an
> +unexpected value here then the client MUST ignore the bundle list.
> +
> +As long as `bundle.version` is understood, all other unknown keys MAY be
> +ignored by the client. The server will guarantee compatibility with older
> +clients, though newer clients may be better able to use the extra keys to
> +minimize downloads.
> +
> +Any backwards-incompatible addition of pre-URI key-value will be
> +guarded by a new `bundle.version` value or values in 'bundle-uri'
> +capability advertisement itself, and/or by new future `bundle-uri`
> +request arguments.
> +
> +Some example key-value pairs that are not currently implemented but could
> +be implemented in the future include:
> +
> + * Add a "hash=<val>" or "size=<bytes>" advertise the expected hash or
> +   size of the bundle file.
> +
> + * Advertise that one or more bundle files are the same (to e.g. have
> +   clients round-robin or otherwise choose one of N possible files).
> +
> + * A "oid=<OID>" shortcut and "prerequisite=<OID>" shortcut. For
> +   expressing the common case of a bundle with one tip and no
> +   prerequisites, or one tip and one prerequisite.
> ++
> +This would allow for optimizing the common case of servers who'd like
> +to provide one "big bundle" containing only their "main" branch,
> +and/or incremental updates thereof.
> ++
> +A client receiving such a a response MAY assume that they can skip
> +retrieving the header from a bundle at the indicated URI, and thus
> +save themselves and the server(s) the request(s) needed to inspect the
> +headers of that bundle or bundles.

Overall, this document is quite thorough, especially with respect to edge
cases/error handling. I found it a bit confusing at times (at least
partially due to my unfamiliarity with protocol v2), including some
potentially ambiguous phrasing or scenarios (especially those in the
disconnection & error recovery sections) that are difficult to clearly
describe in generic terms.

I think some sections (especially "PROTOCOL for bundle-uri" and "bundle-uri
CLIENT AND SERVER EXPECTATIONS") would benefit from examples of what "good"
and "bad" request/response values & behaviors look like; they would help
illustrate some of those more complex situations. The rest of the patch (the
implementation & tests) looked good to me. 

Thanks for your continued work on this, I'm really excited to see the next
steps of bundle servers in this series!


* Re: [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton
  2022-11-11  1:59   ` Victoria Dye
@ 2022-11-16 14:08     ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-11-16 14:08 UTC (permalink / raw)
  To: Victoria Dye,
	Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/10/22 8:59 PM, Victoria Dye wrote:
> Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
>> +
>> +PROTOCOL for bundle-uri
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +A `bundle-uri` request takes no arguments, and as noted above does not
>> +currently advertise a capability value. Both may be added in the
>> +future.
>> +
>> +When the client issues a `command=bundle-uri` the response is a list of
> 
> nit: comma after `command=bundle-uri`
> 
> I misread this a couple times dropping the "the", so it read like the
> `command=bundle-uri` was the *response*, not the request. I think the comma
> would help avoid that?

I think this should be

  When the client issues a `command=bundle-uri` request, the response is...

>> +key-value pairs provided as packet lines with value `<key>=<value>`. The
>> +meaning of these key-value pairs are provided by the config keys in the
>> +`bundle.*` namespace (see linkgit:git-config[1]).
> 
> What does this ("the meaning of these key-value pairs are provided by the
> config keys...") mean? Are the response keys in the `bundle.*` namespace? Or
> do the client-side `bundle.*` keys provide some kind of translation of what
> the keys mean?

I can elaborate more here, but the intention is that the protocol defines
only how these key-value pairs are delivered, and how the client assigns
meaning to those values and acts upon them is defined elsewhere.

>> +
>> +Clients are still expected to fully parse the line according to the
>> +above format, lines that do not conform to the format SHOULD be
>> +discarded. The user MAY be warned in such a case.
> 
> Why "still" - is there some reason they *wouldn't* parse the response line? 

"still" is not needed here.

> Is "the above format" referring to `<key>=<value>` in general, or restricted
> to/guaranteed that the `<key>`'s defined by the `bundle.*` namespace? I'm
> guessing "still expected to fully parse" == "MUST parse" (using
> MUST/SHOULD/MAY nomeclature), it would help to call that out explicitly to
> be consistent with the rest of this doc.

Using MUST simplifies things a lot.
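
To make the parse-or-discard rule concrete, here is a minimal sketch of
splitting one `<key>=<value>` line. It is not the code from this series:
the function name parse_key_value() is hypothetical, and treating an
empty key as malformed is an assumption of the sketch rather than
something the document states.

```c
#include <assert.h>
#include <string.h>

/*
 * Split a "<key>=<value>" line in place at the first '='.
 *
 * Returns 0 and points *key / *value into `line` on success.
 * Returns -1 when there is no '=' or the key is empty; the caller is
 * then expected to discard the line and may warn the user.
 *
 * Hypothetical helper for illustration only.
 */
int parse_key_value(char *line, const char **key, const char **value)
{
	char *eq;

	assert(line && key && value);

	eq = strchr(line, '=');
	if (!eq || eq == line)
		return -1;	/* no separator, or empty key (assumed invalid) */

	*eq = '\0';		/* terminate the key */
	*key = line;
	*value = eq + 1;	/* value may itself contain '=' */
	return 0;
}
```

Splitting at the first '=' keeps any later '=' characters inside the
value, which matters for URIs that carry query strings.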

>> +
>> +bundle-uri CLIENT AND SERVER EXPECTATIONS
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +URI CONTENTS::
>> +The advertised URIs MUST be in one of two possible formats.
>> ++
>> +The first possible format is a bundle file that `git bundle verify`
> 
> I don't think "format" is the right word to describe this (I'd reserve
> "format" for the literal format of the URI string). Maybe something like
> this?
> 
> 	The advertised URIs MUST contain one of two types of content.

How about...

"The content at the advertised URIs MUST be one of two types" ?

> 	The advertised URI may contain a bundle file that `git bundle
> 	verify` would accept...
> 
> 	...
> 	
> 	Alternatively, the advertised URI may provide a plaintext file...
> 
>> +would accept. I.e. they MUST contain one or more reference tips for
>> +use by the client, MUST indicate prerequisites (if any) with standard
>> +"-" prefixes, and MUST indicate their "object-format", if
>> +applicable. Create "*.bundle" files with `git bundle create`.
> 
> The last sentence doesn't add anything that you don't know from the `git
> bundle verify` note in the first doesn't already tell you, and feels like a
> bit of a non-sequitur as a result. Although, it tangentially raises a
> question: do bundle files *have to* have the '.bundle' suffix to pass `git
> bundle verify`? If not, are they expected to when coming from these URIs?
 
The files do not need that extension. This sentence can be removed.

>> +WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
>> +If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
>> +of the bundle(s) the client finds that the ref tips it wants can be
>> +retrieved entirety from advertised bundle(s), it MAY disconnect. The
> 
> s/entirety/entirely

Thanks.

> And to clarify, by "it MAY disconnect", you mean it may disconnect from the
> main repository server? Or the bundle server? 

The main repository server, since the bundle server is not speaking
the Git protocol, but there is definitely room to clarify here.

>> +bundle-uri PROTOCOL FEATURES
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +As noted above the `<key>=<value>` definitions are documented by the
>> +`bundle.*` config namespace.
> 
> Same comment as earlier - this is a confusing way to phrase this. If you
> mean "the keys are part of the `bundle.*` namespace documented in
> linkgit:git-config[1]", I think you can just say that directly. If not, it
> would help to clarify the relationship between the `bundle.*` namespace and
> these keys.

Will do.

> Overall, this document is quite thorough, especially with respect to edge
> cases/error handling. I found it a bit confusing at times (at least
> partially due to my unfamiliarity with protocol v2), including some
> potentially ambiguous phrasing or scenarios (especially those in the
> disconnection & error recovery sections) that are difficult to clearly
> describe in generic terms.
> 
> I think some sections (especially "PROTOCOL for bundle-uri" and "bundle-uri
> CLIENT AND SERVER EXPECTATIONS") would benefit from examples of what "good"
> and "bad" request/response values & behaviors look like; they would help
> illustrate some of those more complex situations. The rest of the patch (the
> implementation & tests) looked good to me. 

This is an interesting idea, although the examples of "good" and "bad" are
probably best left as the test cases. Looking through the rest of this
document, this section is already much more verbose than the others, so I
hesitate to add these examples at this point. Perhaps there is room to
improve the whole document with such examples as a follow-up.

Thanks,
-Stolee


* [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2
  2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                   ` (8 preceding siblings ...)
  2022-11-01  1:07 ` [PATCH 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51 ` Derrick Stolee via GitGitGadget
  2022-11-16 19:51   ` [PATCH v2 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
                     ` (9 more replies)
  9 siblings, 10 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

This is based on the recent master batch that included ds/bundle-uri-....

Now that git clone --bundle-uri can download a bundle list from a plaintext
file in config format, we can use the same set of key-value pairs to
advertise a bundle list over protocol v2. At the end of this series:

 1. A server can advertise bundles when uploadPack.advertiseBundleURIs is
    enabled. The bundle list comes from the server's local config,
    specifically the bundle.* namespace.
 2. A client can notice a server's bundle-uri advertisement and request the
    bundle list if transfer.bundleURI is enabled. The bundles are downloaded
    as if the list was advertised from the --bundle-uri option.

Many patches in this series were adapted from Ævar's v2 RFC [1]. He is
retained as author and I added myself as co-author only if the modifications
were significant.

[1]
https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

 * Patches 1-5 are mostly taken from [1], again with mostly minor updates.
   The one major difference is the packet line format being a single
   key=value format instead of a sequence of pairs. This also means that
   Patch 4 is entirely new since it feeds these pairs directly from the
   server's config.

 * Patches 6-9 finish off the ability for the client to notice the
   capability, request the values, and download bundles before continuing
   with the rest of the download.
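
As a rough illustration of the single key=value packet line format, the
sketch below frames one pair as a pkt-line: four hex digits giving the
total length (the four length bytes included), followed by the payload.
The name format_pkt_line() is hypothetical and this is not the code in
the series. Whether the server appends a trailing LF is not stated in
this cover letter, so the sketch omits it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Largest total pkt-line length allowed by the pkt-line format. */
#define PKT_LINE_MAX 65520

/*
 * Write "<4 hex digits><key>=<value>" into `out`.
 *
 * Returns the number of bytes written (not counting the NUL), or -1 if
 * the line would exceed PKT_LINE_MAX or would not fit in `out`.
 *
 * Hypothetical helper for illustration only; no trailing LF is added.
 */
int format_pkt_line(char *out, size_t outsz,
		    const char *key, const char *value)
{
	size_t total;

	assert(out && key && value);

	/* 4 length bytes + key + '=' + value */
	total = 4 + strlen(key) + 1 + strlen(value);
	if (total > PKT_LINE_MAX || total >= outsz)
		return -1;

	return snprintf(out, outsz, "%04zx%s=%s", total, key, value);
}
```

For the pair bundle.version=1 this produces "0014bundle.version=1". A
flush packet ("0000") would follow the last pair to end the response.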

One thing that is not handled here but could be handled in a future change
is to disconnect from the origin Git server while downloading the bundle
URIs, then reconnect afterwards. This does not make any difference for
HTTPS, but SSH may benefit from the reduced connection time. The git clone
--bundle-uri option did not suffer from this because the bundles are
downloaded before the server connection begins.

After this series, there is one more before the original scope of the plan
is complete: using creation tokens as a heuristic. See [2] for the RFC
version of those patches.

[2] https://github.com/derrickstolee/git/pull/22


Updates in v2
=============

 * Commit messages now refer to protocol v2 "commands" not "verbs".
 * Several edits were made to gitprotocol-v2.txt thanks to Victoria's
   thorough review.
 * strbuf_parent_directory() is renamed strbuf_strip_file_from_path() to
   make it more clear how it behaves when ending with a slash.
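
To illustrate the trailing-slash behavior that motivated the rename, here
is a simplified stand-alone sketch that works on a plain C string instead
of a strbuf. The name strip_file_from_path() is hypothetical, and the
sketch assumes '/' is the only separator; the real method uses Git's
platform-aware helpers.

```c
#include <assert.h>
#include <string.h>

/*
 * Truncate `path` just after its last '/', so that a trailing filename
 * is removed while the separator is kept. A path that already ends in
 * '/' is left unchanged. Returns the number of bytes removed.
 *
 * Simplified illustration: only '/' counts as a separator, and a path
 * with no '/' at all is emptied.
 */
size_t strip_file_from_path(char *path)
{
	size_t len, keep;
	char *sep;

	assert(path);

	len = strlen(path);
	sep = strrchr(path, '/');
	keep = sep ? (size_t)(sep - path) + 1 : 0;
	path[keep] = '\0';
	return len - keep;
}
```

With this sketch, "/path/to/file" becomes "/path/to/" (4 bytes removed)
and "/path/to/dir/" is unchanged (0 bytes removed). Keeping the trailing
separator is what makes the result usable as a base for relative URIs.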

Thanks,

 * Stolee

Derrick Stolee (5):
  bundle-uri: serve bundle.* keys from config
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: allow relative URLs in bundle lists
  bundle-uri: download bundles from an advertised list
  clone: unbundle the advertised bundles

Ævar Arnfjörð Bjarmason (4):
  protocol v2: add server-side "bundle-uri" skeleton
  bundle-uri client: add minimal NOOP client
  bundle-uri client: add helper for testing server
  bundle-uri client: add boolean transfer.bundleURI setting

 Documentation/config/transfer.txt      |   6 +
 Documentation/gitprotocol-v2.txt       | 201 ++++++++++++++++++++++
 builtin/clone.c                        |  23 +++
 bundle-uri.c                           |  91 +++++++++-
 bundle-uri.h                           |  27 +++
 connect.c                              |  47 +++++
 remote.h                               |   5 +
 serve.c                                |   6 +
 strbuf.c                               |   9 +
 strbuf.h                               |  12 ++
 t/helper/test-bundle-uri.c             |  48 ++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh  | 229 +++++++++++++++++++++++++
 t/t5601-clone.sh                       |  59 +++++++
 t/t5701-git-serve.sh                   |  40 ++++-
 t/t5730-protocol-v2-bundle-uri-file.sh |  36 ++++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 ++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 ++
 t/t5750-bundle-uri-parse.sh            |  54 ++++++
 transport-helper.c                     |  13 ++
 transport-internal.h                   |   7 +
 transport.c                            |  87 ++++++++++
 transport.h                            |  23 +++
 22 files changed, 1055 insertions(+), 2 deletions(-)
 create mode 100644 t/lib-t5730-protocol-v2-bundle-uri.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1400%2Fderrickstolee%2Fbundle-redo%2Fadvertise-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1400/derrickstolee/bundle-redo/advertise-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1400

Range-diff vs v1:

  1:  a02eee98318 !  1:  beae335b855 protocol v2: add server-side "bundle-uri" skeleton
     @@ Commit message
      
          An earlier version of this patch [1] used a different transfer format
          than the "key=value" pairs in the current implementation. The change was
     -    made to unify the protocol v2 verb with the bundle lists provided by
     +    made to unify the protocol v2 command with the bundle lists provided by
          independent bundle servers. Further, the standard allows for the server
          to advertise a URI that contains a bundle list. This allows users
          automatically discovering bundle providers that are loosely associated
     @@ Documentation/gitprotocol-v2.txt: and associated requested information, each sep
      +currently advertise a capability value. Both may be added in the
      +future.
      +
     -+When the client issues a `command=bundle-uri` the response is a list of
     -+key-value pairs provided as packet lines with value `<key>=<value>`. The
     -+meaning of these key-value pairs are provided by the config keys in the
     -+`bundle.*` namespace (see linkgit:git-config[1]).
     ++When the client issues a `command=bundle-uri` request, the response is a
     ++list of key-value pairs provided as packet lines with value
     ++`<key>=<value>`. Each `<key>` should be interpreted as a config key from
     ++the `bundle.*` namespace to construct a list of bundles. These keys are
     ++grouped by a `bundle.<id>.` subsection, where each key corresponding to a
     ++given `<id>` contributes attributes to the bundle defined by that `<id>`.
     ++See linkgit:git-config[1] for the specific details of these keys and how
     ++the Git client will interpret their values.
      +
     -+Clients are still expected to fully parse the line according to the
     -+above format, lines that do not conform to the format SHOULD be
     -+discarded. The user MAY be warned in such a case.
     ++Clients MUST parse the line according to the above format, lines that do
     ++not conform to the format SHOULD be discarded. The user MAY be warned in
     ++such a case.
      +
      +bundle-uri CLIENT AND SERVER EXPECTATIONS
      +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      +
      +URI CONTENTS::
     -+The advertised URIs MUST be in one of two possible formats.
     ++The content at the advertised URIs MUST be one of two types.
      ++
     -+The first possible format is a bundle file that `git bundle verify`
     ++The advertised URI may contain a bundle file that `git bundle verify`
      +would accept. I.e. they MUST contain one or more reference tips for
      +use by the client, MUST indicate prerequisites (if any) with standard
      +"-" prefixes, and MUST indicate their "object-format", if
     -+applicable. Create "*.bundle" files with `git bundle create`.
     ++applicable.
      ++
     -+The second possible format is a plaintext file that `git config --list`
     -+would accept (with the `--file` option). The key-value pairs in this list
     -+are in the `bundle.*` namespace (see linkgit:git-config[1]).
     ++The advertised URI may alternatively contain a plaintext file that `git
     ++config --list` would accept (with the `--file` option). The key-value
     ++pairs in this list are in the `bundle.*` namespace (see
     ++linkgit:git-config[1]).
      +
      +bundle-uri CLIENT ERROR RECOVERY::
      +A client MUST above all gracefully degrade on errors, whether that
     @@ Documentation/gitprotocol-v2.txt: and associated requested information, each sep
      +WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
      +If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
      +of the bundle(s) the client finds that the ref tips it wants can be
     -+retrieved entirety from advertised bundle(s), it MAY disconnect. The
     -+results of such a 'clone' or 'fetch' should be indistinguishable from
     -+the state attained without using bundle-uri.
     ++retrieved entirely from advertised bundle(s), the client MAY disconnect
     ++from the Git server. The results of such a 'clone' or 'fetch' should be
     ++indistinguishable from the state attained without using bundle-uri.
      +
      +EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
      +A client MAY perform an early disconnect while still downloading the
     @@ Documentation/gitprotocol-v2.txt: and associated requested information, each sep
      +bundle-uri PROTOCOL FEATURES
      +^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      +
     -+As noted above the `<key>=<value>` definitions are documented by the
     -+`bundle.*` config namespace.
     ++The client constructs a bundle list from the `<key>=<value>` pairs
     ++provided by the server. These pairs are part of the `bundle.*` namespace
     ++as documented in linkgit:git-config[1]. In this section, we discuss some
     ++of these keys and describe the actions the client will do in response to
     ++this information.
      +
      +In particular, the `bundle.version` key specifies an integer value. The
      +only accepted value at the moment is `1`, but if the client sees an
  2:  64dd9bf41de =  2:  0d85aef965d bundle-uri client: add minimal NOOP client
  3:  ae0003bb39b =  3:  c3269a24b57 bundle-uri client: add helper for testing server
  4:  431cd585184 =  4:  cd906f6d981 bundle-uri: serve bundle.* keys from config
  5:  c877f7c033d =  5:  93397468931 bundle-uri client: add boolean transfer.bundleURI setting
  6:  2200a70d279 !  6:  7d86852c015 strbuf: reintroduce strbuf_parent_directory()
     @@ Metadata
      Author: Derrick Stolee <derrickstolee@github.com>
      
       ## Commit message ##
     -    strbuf: reintroduce strbuf_parent_directory()
     +    strbuf: introduce strbuf_strip_file_from_path()
      
          The strbuf_parent_directory() method was added as a static method in
          contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
     @@ Commit message
          65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
          there is a need for a similar method in the bundle URI feature.
      
     -    Re-add the method, this time in strbuf.c. The method requirements are
     -    slightly modified to allow a trailing slash, in which case nothing is
     -    done. The return value is the number of byte removed.
     +    Re-add the method, this time in strbuf.c, but with a new name:
     +    strbuf_strip_file_from_path(). The method requirements are slightly
     +    modified to allow a trailing slash, in which case nothing is done, which
     +    makes the name change valuable. The return value is the number of bytes
     +    removed.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ strbuf.c: int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
       	return res;
       }
      +
     -+size_t strbuf_parent_directory(struct strbuf *buf)
     ++size_t strbuf_strip_file_from_path(struct strbuf *buf)
      +{
      +	size_t len = buf->len;
      +	size_t offset = offset_1st_component(buf->buf);
      +	char *path_sep = find_last_dir_sep(buf->buf + offset);
     -+	strbuf_setlen(buf, path_sep ? path_sep - buf->buf : offset);
     ++	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
      +	return len - buf->len;
      +}
      
     @@ strbuf.h: int launch_sequence_editor(const char *path, struct strbuf *buffer,
       			      const char *const *env);
       
      +/*
     -+ * Remove the deepest subdirectory in the provided path string. If path
     ++ * Remove the filename from the provided path string. If the path
      + * contains a trailing separator, then the path is considered a directory
     -+ * and nothing is modified.
     ++ * and nothing is modified. Returns the number of characters removed from
     ++ * the path.
     ++ *
     ++ * Examples:
     ++ * - "/path/to/file" -> "/path/to/" (returns: 4)
     ++ * - "/path/to/dir/" -> "/path/to/dir/" (returns: 0)
      + */
     -+size_t strbuf_parent_directory(struct strbuf *buf);
     ++size_t strbuf_strip_file_from_path(struct strbuf *buf);
      +
       void strbuf_add_lines(struct strbuf *sb,
       		      const char *prefix,
  7:  3550f6fb91b !  7:  186e112d821 bundle-uri: allow relative URLs in bundle lists
     @@ bundle-uri.c: int bundle_uri_parse_config_format(const char *uri,
      +		 * remove the filename portion of the path. This is
      +		 * important for relative URIs.
      +		 */
     -+		strbuf_parent_directory(&baseURI);
     ++		strbuf_strip_file_from_path(&baseURI);
      +		list->baseURI = strbuf_detach(&baseURI, NULL);
      +	}
       	result = git_config_from_file_with_options(config_to_bundle_list,
  8:  e002affe4bf !  8:  f254da46a2c bundle-uri: download bundles from an advertised list
     @@ Commit message
      
          The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
          'git clone', but is not helpful when the clone operation discovers a
     -    list of URIs from the bundle-uri protocol v2 verb. To actually download
     -    and unbundle the advertised bundles, we need a different mechanism.
     +    list of URIs from the bundle-uri protocol v2 command. To actually
     +    download and unbundle the advertised bundles, we need a different
     +    mechanism.
      
          Create the new fetch_bundle_list() method which is very similar to
          fetch_bundle_uri() except that it relies on download_bundle_list()
  9:  1c034bba744 !  9:  b62b4b17481 clone: unbundle the advertised bundles
     @@ Commit message
          clone: unbundle the advertised bundles
      
          A previous change introduced the transport methods to acquire a bundle
     -    list from the 'bundle-uri' protocol v2 verb, when advertised _and_ when
     -    the client has chosen to enable the feature.
     +    list from the 'bundle-uri' protocol v2 command, when advertised _and_
     +    when the client has chosen to enable the feature.
      
          Teach Git to download and unbundle the data advertised by those bundles
          during 'git clone'.

-- 
gitgitgadget


* [PATCH v2 1/9] protocol v2: add server-side "bundle-uri" skeleton
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51   ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-16 19:51   ` [PATCH v2 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.advertiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users
to automatically discover bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulleted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 Documentation/gitprotocol-v2.txt | 201 +++++++++++++++++++++++++++++++
 bundle-uri.c                     |  36 ++++++
 bundle-uri.h                     |   7 ++
 serve.c                          |   6 +
 t/t5701-git-serve.sh             |  40 +++++-
 5 files changed, 289 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 59bf41cefb9..10bd2d40cec 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -578,6 +578,207 @@ and associated requested information, each separated by a single space.
 
 	obj-info = obj-id SP obj-size
 
+bundle-uri
+~~~~~~~~~~
+
+If the 'bundle-uri' capability is advertised, the server supports the
+`bundle-uri` command.
+
+The capability is currently advertised with no value (i.e. not
+"bundle-uri=somevalue"), a value may be added in the future for
+supporting command-wide extensions. Clients MUST ignore any unknown
+capability values and proceed with the `bundle-uri` dialog they
+support.
+
+The 'bundle-uri' command is intended to be issued before `fetch` to
+get URIs to bundle files (see linkgit:git-bundle[1]) to "seed" and
+inform the subsequent `fetch` command.
+
+The client CAN issue `bundle-uri` before or after any other valid
+command. To be useful to clients it's expected that it'll be issued
+after an `ls-refs` and before `fetch`, but CAN be issued at any time
+in the dialog.
+
+DISCUSSION of bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The intent of the feature is to optimize for server resource consumption
+in the common case by changing the common case of fetching a very
+large PACK during linkgit:git-clone[1] into a smaller incremental
+fetch.
+
+It also allows servers to achieve better caching in combination with
+an `uploadpack.packObjectsHook` (see linkgit:git-config[1]).
+
+By having new clones or fetches be a more predictable and common
+negotiation against the tips of recently produced *.bundle file(s),
+servers might even pre-generate the results of such negotiations for
+the `uploadpack.packObjectsHook` as new pushes come in.
+
+One way that servers could take advantage of these bundles is that the
+server would anticipate that fresh clones will download a known bundle,
+followed by catching up to the current state of the repository using ref
+tips found in that bundle (or bundles).
+
+PROTOCOL for bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A `bundle-uri` request takes no arguments, and as noted above does not
+currently advertise a capability value. Both may be added in the
+future.
+
+When the client issues a `command=bundle-uri` request, the response is a
+list of key-value pairs provided as packet lines with value
+`<key>=<value>`. Each `<key>` should be interpreted as a config key from
+the `bundle.*` namespace to construct a list of bundles. These keys are
+grouped by a `bundle.<id>.` subsection, where each key corresponding to a
+given `<id>` contributes attributes to the bundle defined by that `<id>`.
+See linkgit:git-config[1] for the specific details of these keys and how
+the Git client will interpret their values.
+
+Clients MUST parse the line according to the above format, lines that do
+not conform to the format SHOULD be discarded. The user MAY be warned in
+such a case.
+
+bundle-uri CLIENT AND SERVER EXPECTATIONS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+URI CONTENTS::
+The content at the advertised URIs MUST be one of two types.
++
+The advertised URI may contain a bundle file that `git bundle verify`
+would accept. I.e. they MUST contain one or more reference tips for
+use by the client, MUST indicate prerequisites (if any) with standard
+"-" prefixes, and MUST indicate their "object-format", if
+applicable.
++
+The advertised URI may alternatively contain a plaintext file that `git
+config --list` would accept (with the `--file` option). The key-value
+pairs in this list are in the `bundle.*` namespace (see
+linkgit:git-config[1]).
+
+bundle-uri CLIENT ERROR RECOVERY::
+A client MUST above all gracefully degrade on errors, whether that
+error is because of bad/missing data in the bundle URI(s), because
+that client is too dumb to e.g. understand and fully parse out bundle
+headers and their prerequisite relationships, or something else.
++
+Server operators should feel confident in turning on "bundle-uri" and
+not worry if e.g. their CDN goes down that clones or fetches will run
+into hard failures. Even if the server bundle(s) are
+incomplete, or bad in some way the client should still end up with a
+functioning repository, just as if it had chosen not to use this
+protocol extension.
++
+All subsequent discussion on client and server interaction MUST keep
+this in mind.
+
+bundle-uri SERVER TO CLIENT::
+The ordering of the returned bundle uris is not significant. Clients
+MUST parse their headers to discover their contained OIDS and
+prerequisites. A client MUST consider the content of the bundle(s)
+themselves and their header as the ultimate source of truth.
++
+A server MAY even return bundle(s) that don't have any direct
+relationship to the repository being cloned (either through accident,
+or intentional "clever" configuration), and expect a client to sort
+out what data they'd like from the bundle(s), if any.
+
+bundle-uri CLIENT TO SERVER::
+The client SHOULD provide reference tips found in the bundle header(s)
+as 'have' lines in any subsequent `fetch` request. A client MAY also
+ignore the bundle(s) entirely if doing so is deemed worse for some
+reason, e.g. if the bundles can't be downloaded, it doesn't like the
+tips it finds etc.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
+If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
+of the bundle(s) the client finds that the ref tips it wants can be
+retrieved entirely from advertised bundle(s), the client MAY disconnect
+from the Git server. The results of such a 'clone' or 'fetch' should be
+indistinguishable from the state attained without using bundle-uri.
+
+EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
+A client MAY perform an early disconnect while still downloading the
+bundle(s) (having streamed and parsed their headers). In such a case
+the client MUST gracefully recover from any errors related to
+finishing the download and validation of the bundle(s).
++
+I.e. a client might need to re-connect and issue a 'fetch' command,
+and possibly fall back to not making use of 'bundle-uri' at all.
++
+This "MAY" behavior is specified as such (and not a "SHOULD") on the
+assumption that a server advertising bundle uris is more likely than
+not to be serving up a relatively large repository, and to be pointing
+to URIs that have a good chance of being in working order. A client
+MAY e.g. look at the payload size of the bundles as a heuristic to see
+if an early disconnect is worth it, should falling back on a full
+"fetch" dialog be necessary.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
+A client SHOULD commence a negotiation of a PACK from the server via
+the "fetch" command using the OID tips found in advertised bundles,
+even if it's still in the process of downloading those bundle(s).
++
+This allows for aggressive early disconnects from any interactive
+server dialog. The client blindly trusts that the advertised OID tips
+are relevant, and issues them as 'have' lines, it then requests any
+tips it would like (usually from the "ls-refs" advertisement) via
+'want' lines. The server will then compute a (hopefully small) PACK
+with the expected difference between the tips from the bundle(s) and
+the data requested.
++
+The only connection the client then needs to keep active is to the
+concurrently downloading static bundle(s), when those and the
+incremental PACK are retrieved they should be inflated and
+validated. Any errors at this point should be gracefully recovered
+from, see above.
+
+bundle-uri PROTOCOL FEATURES
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The client constructs a bundle list from the `<key>=<value>` pairs
+provided by the server. These pairs are part of the `bundle.*` namespace
+as documented in linkgit:git-config[1]. In this section, we discuss some
+of these keys and describe the actions the client will do in response to
+this information.
+
+In particular, the `bundle.version` key specifies an integer value. The
+only accepted value at the moment is `1`, but if the client sees an
+unexpected value here then the client MUST ignore the bundle list.
+
+As long as `bundle.version` is understood, all other unknown keys MAY be
+ignored by the client. The server will guarantee compatibility with older
+clients, though newer clients may be better able to use the extra keys to
+minimize downloads.
+
+Any backwards-incompatible addition of pre-URI key-value will be
+guarded by a new `bundle.version` value or values in 'bundle-uri'
+capability advertisement itself, and/or by new future `bundle-uri`
+request arguments.
+
+Some example key-value pairs that are not currently implemented but could
+be implemented in the future include:
+
+ * Add a "hash=<val>" or "size=<bytes>" to advertise the expected hash or
+   size of the bundle file.
+
+ * Advertise that one or more bundle files are the same (to e.g. have
+   clients round-robin or otherwise choose one of N possible files).
+
+ * A "oid=<OID>" shortcut and "prerequisite=<OID>" shortcut. For
+   expressing the common case of a bundle with one tip and no
+   prerequisites, or one tip and one prerequisite.
++
+This would allow for optimizing the common case of servers who'd like
+to provide one "big bundle" containing only their "main" branch,
+and/or incremental updates thereof.
++
+A client receiving such a response MAY assume that they can skip
+retrieving the header from a bundle at the indicated URI, and thus
+save themselves and the server(s) the request(s) needed to inspect the
+headers of that bundle or bundles.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/bundle-uri.c b/bundle-uri.c
index 79a914f961b..32022595964 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -563,6 +563,42 @@ cleanup:
 	return result;
 }
 
+/**
+ * API for serve.c.
+ */
+
+int bundle_uri_advertise(struct repository *r, struct strbuf *value)
+{
+	static int advertise_bundle_uri = -1;
+
+	if (advertise_bundle_uri != -1)
+		goto cached;
+
+	advertise_bundle_uri = 0;
+	git_config_get_maybe_bool("uploadpack.advertisebundleuris", &advertise_bundle_uri);
+
+cached:
+	return advertise_bundle_uri;
+}
+
+int bundle_uri_command(struct repository *r,
+		       struct packet_reader *request)
+{
+	struct packet_writer writer;
+	packet_writer_init(&writer, 1);
+
+	while (packet_reader_read(request) == PACKET_READ_NORMAL)
+		die(_("bundle-uri: unexpected argument: '%s'"), request->line);
+	if (request->status != PACKET_READ_FLUSH)
+		die(_("bundle-uri: expected flush after arguments"));
+
+	/* TODO: Implement the communication */
+
+	packet_writer_flush(&writer);
+
+	return 0;
+}
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 4dbc269823c..357111ecce8 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -4,6 +4,7 @@
 #include "hashmap.h"
 #include "strbuf.h"
 
+struct packet_reader;
 struct repository;
 struct string_list;
 
@@ -92,6 +93,12 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * API for serve.c.
+ */
+int bundle_uri_advertise(struct repository *r, struct strbuf *value);
+int bundle_uri_command(struct repository *r, struct packet_reader *request);
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/serve.c b/serve.c
index 733347f602a..cbf4a143cfe 100644
--- a/serve.c
+++ b/serve.c
@@ -7,6 +7,7 @@
 #include "protocol-caps.h"
 #include "serve.h"
 #include "upload-pack.h"
+#include "bundle-uri.h"
 
 static int advertise_sid = -1;
 static int client_hash_algo = GIT_HASH_SHA1;
@@ -135,6 +136,11 @@ static struct protocol_capability capabilities[] = {
 		.advertise = always_advertise,
 		.command = cap_object_info,
 	},
+	{
+		.name = "bundle-uri",
+		.advertise = bundle_uri_advertise,
+		.command = bundle_uri_command,
+	},
 };
 
 void protocol_v2_advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 1896f671cb3..f21e5e9d33d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -13,7 +13,7 @@ test_expect_success 'test capability advertisement' '
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
-	cat >expect <<-EOF &&
+	cat >expect.base <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs=unborn
@@ -21,8 +21,11 @@ test_expect_success 'test capability advertisement' '
 	server-option
 	object-format=$(test_oid algo)
 	object-info
+	EOF
+	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -342,4 +345,39 @@ test_expect_success 'basics of object-info' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with uploadpack.advertiseBundleURIs' '
+	test_config uploadpack.advertiseBundleURIs true &&
+
+	cat >expect.extra <<-EOF &&
+	bundle-uri
+	EOF
+	cat expect.base \
+	    expect.extra \
+	    expect.trailer >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'basics of bundle-uri: dies if not enabled' '
+	test-tool pkt-line pack >in <<-EOF &&
+	command=bundle-uri
+	0000
+	EOF
+
+	cat >err.expect <<-\EOF &&
+	fatal: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	cat >expect <<-\EOF &&
+	ERR serve: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	test_must_fail test-tool serve-v2 --stateless-rpc <in >out 2>err.actual &&
+	test_cmp err.expect err.actual &&
+	test_must_be_empty out
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v2 2/9] bundle-uri client: add minimal NOOP client
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-11-16 19:51   ` [PATCH v2 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-16 19:51   ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-29  0:57     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Set up all the needed client parts of the "bundle-uri" protocol
extension, without actually doing anything with the bundle URIs.

I.e. if the server says it supports "bundle-uri" we'll issue a
command=bundle-uri after command=ls-refs when we're cloning. We'll
parse the returned output using the code already tested in
t5750-bundle-uri-parse.sh.
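
As a rough illustration only (it is not part of this patch), the
request framing could look like the sketch below. The helper names
pkt_line_encode() and build_bundle_uri_request() are hypothetical, and
the sketch assumes a bare request: the capability lines a real client
(re-)sends and any arguments are omitted.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not git's pkt-line API: frame "payload" as a
 * pkt-line into "out". The 4-digit hex prefix counts itself plus the
 * payload. Returns the number of bytes written (excluding the NUL),
 * or -1 if "out" is too small or the line would exceed 65520 bytes.
 */
static int pkt_line_encode(const char *payload, char *out, size_t outlen)
{
	size_t len = strlen(payload) + 4;

	if (len > 65520 || len + 1 > outlen)
		return -1;
	snprintf(out, outlen, "%04zx%s", len, payload);
	return (int)len;
}

/*
 * Build a minimal "bundle-uri" request: the command line, a delim-pkt
 * ("0001"), no arguments, and a flush-pkt ("0000").
 */
static int build_bundle_uri_request(char *out, size_t outlen)
{
	int n = pkt_line_encode("command=bundle-uri\n", out, outlen);

	if (n < 0 || (size_t)n + 8 + 1 > outlen)
		return -1;
	memcpy(out + n, "00010000", 9); /* copies the trailing NUL too */
	return n + 8;
}
```

Under those assumptions the request is "0017command=bundle-uri\n"
followed by the delim-pkt "0001" and the flush-pkt "0000".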

What we aren't doing is actually acting on that data, i.e. downloading
the bundle(s) before we get to doing the command=fetch, and adjusting
our negotiation dialog appropriately. I'll do that in subsequent
commits.

There's a question of what level of encapsulation we should use here.
I've opted to use connect.h in clone.c, but we could also e.g. make
transport_get_remote_refs() invoke this, i.e. make it implicitly get
the bundle-uri list for later steps.

This approach means that we don't "support" this in "git fetch" for
now. I'm starting with the case of initial clones, although as noted
in preceding commits to the protocol documentation nothing about this
approach precludes getting bundles on incremental fetches.

For t5732-protocol-v2-bundle-uri-http.sh it's not easy to set
environment variables for git-upload-pack (it's started by Apache), so
let's skip the test under T5730_HTTP, and add unused T5730_{FILE,GIT}
prerequisites for consistency and future use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c                        |   7 ++
 bundle-uri.c                           |   4 +
 connect.c                              |  47 ++++++++
 remote.h                               |   5 +
 t/lib-t5730-protocol-v2-bundle-uri.sh  | 148 +++++++++++++++++++++++++
 t/t5730-protocol-v2-bundle-uri-file.sh |  36 ++++++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 +++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 +++
 transport-helper.c                     |  13 +++
 transport-internal.h                   |   7 ++
 transport.c                            |  55 +++++++++
 transport.h                            |  19 ++++
 12 files changed, 375 insertions(+)
 create mode 100644 t/lib-t5730-protocol-v2-bundle-uri.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh

diff --git a/builtin/clone.c b/builtin/clone.c
index 547d6464b3c..edf98295af2 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -27,6 +27,7 @@
 #include "iterator.h"
 #include "sigchain.h"
 #include "branch.h"
+#include "connect.h"
 #include "remote.h"
 #include "run-command.h"
 #include "connected.h"
@@ -1266,6 +1267,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
+	/*
+	 * Populate transport->got_remote_bundle_uri and
+	 * transport->bundles. We might get nothing.
+	 */
+	transport_get_remote_bundle_uri(transport);
+
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 
diff --git a/bundle-uri.c b/bundle-uri.c
index 32022595964..2201b604b11 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -571,6 +571,10 @@ int bundle_uri_advertise(struct repository *r, struct strbuf *value)
 {
 	static int advertise_bundle_uri = -1;
 
+	if (value &&
+	    git_env_bool("GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE", 0))
+		strbuf_addstr(value, "test-unknown-capability-value");
+
 	if (advertise_bundle_uri != -1)
 		goto cached;
 
diff --git a/connect.c b/connect.c
index 5ea53deda23..d39effb7492 100644
--- a/connect.c
+++ b/connect.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "protocol.h"
 #include "alias.h"
+#include "bundle-uri.h"
 
 static char *server_capabilities_v1;
 static struct strvec server_capabilities_v2 = STRVEC_INIT;
@@ -491,6 +492,52 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	}
 }
 
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc)
+{
+	int line_nr = 1;
+
+	/* Assert bundle-uri support */
+	server_supports_v2("bundle-uri", 1);
+
+	/* (Re-)send capabilities */
+	send_capabilities(fd_out, reader);
+
+	/* Send command */
+	packet_write_fmt(fd_out, "command=bundle-uri\n");
+	packet_delim(fd_out);
+
+	/* Send options */
+	if (git_env_bool("GIT_TEST_PROTOCOL_BAD_BUNDLE_URI", 0))
+		packet_write_fmt(fd_out, "test-bad-client\n");
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *line = reader->line;
+		line_nr++;
+
+		if (!bundle_uri_parse_line(bundles, line))
+			continue;
+
+		return error(_("error on bundle-uri response line %d: %s"),
+			     line_nr, line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		return error(_("expected flush after bundle-uri listing"));
+
+	/*
+	 * Might die(), but obscure enough that that's OK, e.g. in
+	 * serve.c we'll call BUG() on its equivalent (the
+	 * PACKET_READ_RESPONSE_END check).
+	 */
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after ref listing"));
+
+	return 0;
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     struct transport_ls_refs_options *transport_options,
diff --git a/remote.h b/remote.h
index 1c4621b414b..1ebbe42792e 100644
--- a/remote.h
+++ b/remote.h
@@ -234,6 +234,11 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
+/* Used for protocol v2 in order to retrieve bundle URIs from a remote */
+struct bundle_list;
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
 /*
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
new file mode 100644
index 00000000000..27294e9c976
--- /dev/null
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -0,0 +1,148 @@
+# Included from t573*-protocol-v2-bundle-uri-*.sh
+
+T5730_PARENT=
+T5730_URI=
+T5730_BUNDLE_URI=
+case "$T5730_PROTOCOL" in
+file)
+	T5730_PARENT=file_parent
+	T5730_URI="file://$PWD/file_parent"
+	T5730_BUNDLE_URI="$T5730_URI/fake.bdl"
+	test_set_prereq T5730_FILE
+	;;
+git)
+	. "$TEST_DIRECTORY"/lib-git-daemon.sh
+	start_git_daemon --export-all --enable=receive-pack
+	T5730_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
+	T5730_URI="$GIT_DAEMON_URL/parent"
+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq T5730_GIT
+	;;
+http)
+	. "$TEST_DIRECTORY"/lib-httpd.sh
+	start_httpd
+	T5730_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
+	T5730_URI="$HTTPD_URL/smart/http_parent"
+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq T5730_HTTP
+	;;
+*)
+	BUG "Need to pass valid T5730_PROTOCOL (was $T5730_PROTOCOL)"
+	;;
+esac
+
+test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
+	git init "$T5730_PARENT" &&
+	test_commit -C "$T5730_PARENT" one &&
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
+'
+
+# Poor man's URI escaping. Good enough for the test suite whose trash
+# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
+# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
+test_uri_escape() {
+	sed 's/ /%20/g'
+}
+
+case "$T5730_PROTOCOL" in
+http)
+	test_expect_success "setup config for $T5730_PROTOCOL:// tests" '
+		git -C "$T5730_PARENT" config http.receivepack true
+	'
+	;;
+*)
+	;;
+esac
+T5730_BUNDLE_URI_ESCAPED=$(echo "$T5730_BUNDLE_URI" | test_uri_escape)
+
+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundle-uri" '
+	test_when_finished "rm -f log" &&
+	test_when_finished "git -C \"$T5730_PARENT\" config uploadpack.advertiseBundleURIs true" &&
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	! grep bundle-uri log
+'
+
+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bundle-uri" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep bundle-uri log
+'
+
+test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
+		"$T5730_BUNDLE_URI_ESCAPED" &&
+
+	cat >err.expect <<-\EOF &&
+	Cloning into '"'"'child'"'"'...
+	EOF
+	case "$T5730_PROTOCOL" in
+	file)
+		cat >fatal-bundle-uri.expect <<-\EOF
+		fatal: bundle-uri: unexpected argument: '"'"'test-bad-client'"'"'
+		EOF
+		;;
+	*)
+		cat >fatal.expect <<-\EOF
+		fatal: read error: Connection reset by peer
+		EOF
+		;;
+	esac &&
+
+	test_when_finished "rm -rf child" &&
+	test_must_fail ok=sigpipe env \
+		GIT_TRACE_PACKET="$PWD/log" \
+		GIT_TEST_PROTOCOL_BAD_BUNDLE_URI=true \
+		git -c protocol.version=2 \
+		clone "$T5730_URI" child \
+		>out 2>err &&
+	test_must_be_empty out &&
+
+	grep -v -e ^fatal: -e ^error: err >err.actual &&
+	test_cmp err.expect err.actual &&
+
+	case "$T5730_PROTOCOL" in
+	file)
+		# Due to general race conditions with client/server replies we
+		# may or may not get "fatal: the remote end hung up
+		# unexpectedly" here
+		grep "^fatal: bundle-uri:" err >fatal-bundle-uri.actual &&
+		test_cmp fatal-bundle-uri.expect fatal-bundle-uri.actual
+		;;
+	*)
+		grep "^fatal:" err >fatal.actual &&
+		# Due to the same race conditions this might be
+		# "fatal: read error: Connection reset by peer", "fatal: the remote end
+		# hung up unexpectedly" etc.
+		cat fatal.actual &&
+		test_file_not_empty fatal.actual
+		;;
+	esac &&
+
+	grep "clone> test-bad-client$" log >sent-bad-request &&
+	test_file_not_empty sent-bad-request
+'
diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
new file mode 100755
index 00000000000..89203d3a23c
--- /dev/null
+++ b/t/t5730-protocol-v2-bundle-uri-file.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'file://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+T5730_PROTOCOL=file
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_expect_success "unknown capability value with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE=true \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$T5730_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	grep "> bundle-uri=test-unknown-capability-value" log
+'
+
+test_done
diff --git a/t/t5731-protocol-v2-bundle-uri-git.sh b/t/t5731-protocol-v2-bundle-uri-git.sh
new file mode 100755
index 00000000000..282847b311f
--- /dev/null
+++ b/t/t5731-protocol-v2-bundle-uri-git.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'git://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+T5730_PROTOCOL=git
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_done
diff --git a/t/t5732-protocol-v2-bundle-uri-http.sh b/t/t5732-protocol-v2-bundle-uri-http.sh
new file mode 100755
index 00000000000..fcc1cf3faef
--- /dev/null
+++ b/t/t5732-protocol-v2-bundle-uri-http.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'http://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'http://' transport
+#
+T5730_PROTOCOL=http
+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
+
+test_done
diff --git a/transport-helper.c b/transport-helper.c
index e95267a4ab5..3ea7c2bb5ad 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
 	return ret;
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	get_helper(transport);
+
+	if (process_connect(transport, 0)) {
+		do_take_over(transport);
+		return transport->vtable->get_bundle_uri(transport);
+	}
+
+	return -1;
+}
+
 static struct transport_vtable vtable = {
 	.set_option	= set_helper_option,
 	.get_refs_list	= get_refs_list,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs,
 	.push_refs	= push_refs,
 	.connect	= connect_helper,
diff --git a/transport-internal.h b/transport-internal.h
index c4ca0b733ac..90ea749e5cf 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -26,6 +26,13 @@ struct transport_vtable {
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
 				     struct transport_ls_refs_options *transport_options);
 
+	/**
+	 * Populates the remote side's bundle-uri under protocol v2,
+	 * if the "bundle-uri" capability was advertised. Returns 0 if
+	 * OK, negative values on error.
+	 */
+	int (*get_bundle_uri)(struct transport *transport);
+
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
diff --git a/transport.c b/transport.c
index e7b97194c10..a020adc1f56 100644
--- a/transport.c
+++ b/transport.c
@@ -22,6 +22,7 @@
 #include "protocol.h"
 #include "object-store.h"
 #include "color.h"
+#include "bundle-uri.h"
 
 static int transport_use_color = -1;
 static char transport_colors[][COLOR_MAXLEN] = {
@@ -359,6 +360,25 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	return handshake(transport, for_push, options, 1);
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	struct git_transport_data *data = transport->data;
+	struct packet_reader reader;
+	int stateless_rpc = transport->stateless_rpc;
+
+	if (!transport->bundles) {
+		CALLOC_ARRAY(transport->bundles, 1);
+		init_bundle_list(transport->bundles);
+	}
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	return get_remote_bundle_uri(data->fd[1], &reader,
+				     transport->bundles, stateless_rpc);
+}
+
 static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
@@ -902,6 +922,7 @@ static int disconnect_git(struct transport *transport)
 
 static struct transport_vtable taken_over_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.disconnect	= disconnect_git
@@ -1054,6 +1075,7 @@ static struct transport_vtable bundle_vtable = {
 
 static struct transport_vtable builtin_smart_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.connect	= connect_git,
@@ -1068,6 +1090,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	ret->progress = isatty(2);
 	string_list_init_dup(&ret->pack_lockfiles);
 
+	CALLOC_ARRAY(ret->bundles, 1);
+	init_bundle_list(ret->bundles);
+
 	if (!remote)
 		BUG("No remote provided to transport_get()");
 
@@ -1482,6 +1507,34 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
+int transport_get_remote_bundle_uri(struct transport *transport)
+{
+	const struct transport_vtable *vtable = transport->vtable;
+
+	/* Check config only once. */
+	if (transport->got_remote_bundle_uri++)
+		return 0;
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
+	/*
+	 * This is intentionally below the transport.injectBundleURI,
+	 * we want to be able to inject into protocol v0, or into the
+	 * dialog of a server who doesn't support this.
+	 */
+	if (!vtable->get_bundle_uri)
+		return error(_("bundle-uri operation not supported by protocol"));
+
+	if (vtable->get_bundle_uri(transport) < 0)
+		return error(_("could not retrieve server-advertised bundle-uri list"));
+	return 0;
+}
+
 void transport_unlock_pack(struct transport *transport, unsigned int flags)
 {
 	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
@@ -1512,6 +1565,8 @@ int transport_disconnect(struct transport *transport)
 		ret = transport->vtable->disconnect(transport);
 	if (transport->got_remote_refs)
 		free_refs((void *)transport->remote_refs);
+	clear_bundle_list(transport->bundles);
+	free(transport->bundles);
 	free(transport);
 	return ret;
 }
diff --git a/transport.h b/transport.h
index b5bf7b3e704..85150f504fb 100644
--- a/transport.h
+++ b/transport.h
@@ -62,6 +62,7 @@ enum transport_family {
 	TRANSPORT_FAMILY_IPV6
 };
 
+struct bundle_list;
 struct transport {
 	const struct transport_vtable *vtable;
 
@@ -76,6 +77,18 @@ struct transport {
 	 */
 	unsigned got_remote_refs : 1;
 
+	/**
+	 * Indicates whether we already called the get_bundle_uri() vtable
+	 * method; set by transport.c::transport_get_remote_bundle_uri().
+	 */
+	unsigned got_remote_bundle_uri : 1;
+
+	/*
+	 * The results of "command=bundle-uri", if both sides support
+	 * the "bundle-uri" capability.
+	 */
+	struct bundle_list *bundles;
+
 	/*
 	 * Transports that call take-over destroys the data specific to
 	 * the transport type while doing so, and cannot be reused.
@@ -281,6 +294,12 @@ void transport_ls_refs_options_release(struct transport_ls_refs_options *opts);
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    struct transport_ls_refs_options *transport_options);
 
+/**
+ * Retrieve bundle URI(s) from a remote. Populates "struct
+ * transport"'s "bundles" and "got_remote_bundle_uri".
+ */
+int transport_get_remote_bundle_uri(struct transport *transport);
+
 /*
  * Fetch the hash algorithm used by a remote.
  *
-- 
gitgitgadget



* [PATCH v2 3/9] bundle-uri client: add helper for testing server
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-11-16 19:51   ` [PATCH v2 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-16 19:51   ` [PATCH v2 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-16 19:51   ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-29  0:59     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.
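
For illustration, each response line is expected to have the form
"<key>=<value>". Below is a minimal sketch of splitting such a line,
assuming that only the first '=' separates the key from the value and
that an empty key or value is rejected. split_key_value() is a
hypothetical name, not a function in bundle-uri.c.

```c
#include <string.h>

/*
 * Hypothetical helper: split "line" at its first '='. On success,
 * returns 0, stores the key length in *key_len and points *value at
 * the text after the '='. Returns -1 when there is no '=', or when
 * the key or the value would be empty.
 */
static int split_key_value(const char *line, size_t *key_len,
			   const char **value)
{
	const char *equals = strchr(line, '=');

	if (!equals || equals == line || !equals[1])
		return -1;
	*key_len = (size_t)(equals - line);
	*value = equals + 1;
	return 0;
}
```

Splitting at the first '=' keeps any later '=' in the value, which
matters for URIs that carry a query string.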

Since in the "git clone" case we'll have already done the handshake(),
but not here, introduce a "got_advertisement" state along with
"got_remote_heads". It seems to me that the "got_remote_heads" is
badly named in the first place, and the whole logic of eagerly getting
ls-refs on handshake() or not could be refactored somewhat, but let's
not do that now, and instead just add another self-documenting state
variable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c                       |  2 +-
 t/helper/test-bundle-uri.c            | 46 +++++++++++++++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh | 63 ++++++++++++++++++++++-----
 transport.c                           | 43 ++++++++++++++----
 transport.h                           |  6 ++-
 5 files changed, 139 insertions(+), 21 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index edf98295af2..22b1e506452 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1271,7 +1271,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 * Populate transport->got_remote_bundle_uri and
 	 * transport->bundles. We might get nothing.
 	 */
-	transport_get_remote_bundle_uri(transport);
+	transport_get_remote_bundle_uri(transport, 1);
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 25afd393428..ffb975b7b4f 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -3,6 +3,10 @@
 #include "bundle-uri.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "transport.h"
+#include "ref-filter.h"
+#include "remote.h"
+#include "refs.h"
 
 enum input_mode {
 	KEY_VALUE_PAIRS,
@@ -68,6 +72,46 @@ usage:
 	usage_with_options(usage, options);
 }
 
+static int cmd_ls_remote(int argc, const char **argv)
+{
+	const char *uploadpack = NULL;
+	struct string_list server_options = STRING_LIST_INIT_DUP;
+	const char *dest;
+	struct remote *remote;
+	struct transport *transport;
+	int status = 0;
+
+	dest = argc > 1 ? argv[1] : NULL;
+
+	remote = remote_get(dest);
+	if (!remote) {
+		if (dest)
+			die(_("bad repository '%s'"), dest);
+		die(_("no remote configured to get bundle URIs from"));
+	}
+	if (!remote->url_nr)
+		die(_("remote '%s' has no configured URL"), dest);
+
+	transport = transport_get(remote, NULL);
+	if (uploadpack)
+		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
+	if (server_options.nr)
+		transport->server_options = &server_options;
+
+	if (transport_get_remote_bundle_uri(transport, 0) < 0) {
+		error(_("could not get the bundle-uri list"));
+		status = 1;
+		goto cleanup;
+	}
+
+	print_bundle_list(stdout, transport->bundles);
+
+cleanup:
+	if (transport_disconnect(transport))
+		return 1;
+	return status;
+}
+
 int cmd__bundle_uri(int argc, const char **argv)
 {
 	const char *usage[] = {
@@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
 	if (!strcmp(argv[1], "parse-config"))
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
+	if (!strcmp(argv[1], "ls-remote"))
+		return cmd_ls_remote(argc - 1, argv + 1);
 	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
 
 usage:
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index 27294e9c976..c327544641b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -34,7 +34,9 @@ esac
 test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
 	git init "$T5730_PARENT" &&
 	test_commit -C "$T5730_PARENT" one &&
-	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true &&
+	git -C "$T5730_PARENT" config bundle.version 1 &&
+	git -C "$T5730_PARENT" config bundle.mode all
 '
 
 # Poor man's URI escaping. Good enough for the test suite whose trash
@@ -61,9 +63,8 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundl
 	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
-	git \
-		-c protocol.version=2 \
-		ls-remote --symref "$T5730_URI" \
+	test-tool bundle-uri \
+		ls-remote "$T5730_URI" \
 		>actual 2>err &&
 
 	# Server responded using protocol v2
@@ -76,12 +77,11 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
 	test_when_finished "rm -f log" &&
 
 	test_config -C "$T5730_PARENT" \
-		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
-	git \
-		-c protocol.version=2 \
-		ls-remote --symref "$T5730_URI" \
+	test-tool bundle-uri \
+		ls-remote "$T5730_URI" \
 		>actual 2>err &&
 
 	# Server responded using protocol v2
@@ -94,8 +94,8 @@ test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
 test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
 	test_when_finished "rm -f log" &&
 
-	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
-		"$T5730_BUNDLE_URI_ESCAPED" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
 
 	cat >err.expect <<-\EOF &&
 	Cloning into '"'"'child'"'"'...
@@ -146,3 +146,46 @@ test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protoc
 	grep "clone> test-bad-client$" log >sent-bad-request &&
 	test_file_not_empty sent-bad-request
 '
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and extra data" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
+
+	# Extra data should be ignored
+	test_config -C "$T5730_PARENT" bundle.only.extra bogus &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
diff --git a/transport.c b/transport.c
index a020adc1f56..86460f5be28 100644
--- a/transport.c
+++ b/transport.c
@@ -198,6 +198,7 @@ struct git_transport_data {
 	struct git_transport_options options;
 	struct child_process *conn;
 	int fd[2];
+	unsigned got_advertisement : 1;
 	unsigned got_remote_heads : 1;
 	enum protocol_version version;
 	struct oid_array extra_have;
@@ -346,6 +347,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 		BUG("unknown protocol version");
 	}
 	data->got_remote_heads = 1;
+	data->got_advertisement = 1;
 	transport->hash_algo = reader.hash_algo;
 
 	if (reader.line_peeked)
@@ -371,6 +373,33 @@ static int get_bundle_uri(struct transport *transport)
 		init_bundle_list(transport->bundles);
 	}
 
+	if (!data->got_advertisement) {
+		struct ref *refs;
+		struct git_transport_data *data = transport->data;
+		enum protocol_version version;
+
+		refs = handshake(transport, 0, NULL, 0);
+		version = data->version;
+
+		switch (version) {
+		case protocol_v2:
+			assert(!refs);
+			break;
+		case protocol_v0:
+		case protocol_v1:
+		case protocol_unknown_version:
+			assert(refs);
+			break;
+		}
+	}
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
 	packet_reader_init(&reader, data->fd[0], NULL, 0,
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
@@ -1507,7 +1536,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
-int transport_get_remote_bundle_uri(struct transport *transport)
+int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 {
 	const struct transport_vtable *vtable = transport->vtable;
 
@@ -1515,20 +1544,16 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 	if (transport->got_remote_bundle_uri++)
 		return 0;
 
-	/*
-	 * "Support" protocol v0 and v2 without bundle-uri support by
-	 * silently degrading to a NOOP.
-	 */
-	if (!server_supports_v2("bundle-uri", 0))
-		return 0;
-
 	/*
 	 * This is intentionally below the transport.injectBundleURI,
 	 * we want to be able to inject into protocol v0, or into the
 	 * dialog of a server who doesn't support this.
 	 */
-	if (!vtable->get_bundle_uri)
+	if (!vtable->get_bundle_uri) {
+		if (quiet)
+			return -1;
 		return error(_("bundle-uri operation not supported by protocol"));
+	}
 
 	if (vtable->get_bundle_uri(transport) < 0)
 		return error(_("could not retrieve server-advertised bundle-uri list"));
diff --git a/transport.h b/transport.h
index 85150f504fb..dd0115b83bf 100644
--- a/transport.h
+++ b/transport.h
@@ -297,8 +297,12 @@ const struct ref *transport_get_remote_refs(struct transport *transport,
 /**
  * Retrieve bundle URI(s) from a remote. Populates "struct
  * transport"'s "bundle_uri" and "got_remote_bundle_uri".
+ *
+ * With `quiet=1` it will not complain if the server doesn't support
+ * the protocol; it complains only if the server advertises it and
+ * we then encounter issues.
  */
-int transport_get_remote_bundle_uri(struct transport *transport);
+int transport_get_remote_bundle_uri(struct transport *transport, int quiet);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
gitgitgadget



* [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-16 19:51   ` Derrick Stolee via GitGitGadget
  2022-11-29  1:00     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
                     ` (5 subsequent siblings)
  9 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
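Reviewer note (not part of the commit): a minimal sketch of the wire
format described above, assuming the standard pkt-line framing of four
hex digits of total length followed by the payload. The helper name
format_bundle_pkt_line() and the size limit check are illustrative
assumptions, not code from this series; like the patch, it uses a
"%s=%s" payload with no trailing newline.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: frame one config entry as a pkt-line.
 * Returns the number of bytes written (excluding the NUL), 0 if the
 * key is outside the "bundle." namespace, or -1 if it does not fit.
 */
static int format_bundle_pkt_line(const char *key, const char *value,
				  char *out, size_t outlen)
{
	size_t total;

	/* Same filter as the patch: only keys starting with "bundle." */
	if (strncmp(key, "bundle.", 7))
		return 0;

	/* 4 length digits + "key=value" */
	total = 4 + strlen(key) + 1 + strlen(value);
	if (total + 1 > outlen || total > 65520)
		return -1;

	return snprintf(out, outlen, "%04x%s=%s",
			(unsigned int)total, key, value);
}
```

With key "bundle.mode" and value "all" this yields the 19-byte line
"0013bundle.mode=all"; a key such as "core.bare" is skipped.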
 bundle-uri.c                          | 16 +++++++++++-
 t/lib-t5730-protocol-v2-bundle-uri.sh | 35 +++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 2201b604b11..3469f1aaa98 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -585,6 +585,16 @@ cached:
 	return advertise_bundle_uri;
 }
 
+static int config_to_packet_line(const char *key, const char *value, void *data)
+{
+	struct packet_reader *writer = data;
+
+	if (!strncmp(key, "bundle.", 7))
+		packet_write_fmt(writer->fd, "%s=%s", key, value);
+
+	return 0;
+}
+
 int bundle_uri_command(struct repository *r,
 		       struct packet_reader *request)
 {
@@ -596,7 +606,11 @@ int bundle_uri_command(struct repository *r,
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("bundle-uri: expected flush after arguments"));
 
-	/* TODO: Implement the communication */
+	/*
+	 * Send all "bundle.*" config entries to the client as key=value
+	 * packet lines.
+	 */
+	git_config(config_to_packet_line, &writer);
 
 	packet_writer_flush(&writer);
 
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index c327544641b..000fcc5e20b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -158,6 +158,8 @@ test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $T5730_BUNDLE_URI_ESCAPED
 	EOF
 	GIT_TRACE_PACKET="$PWD/log" \
 	test-tool bundle-uri \
@@ -181,6 +183,39 @@ test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and ext
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $T5730_BUNDLE_URI_ESCAPED
+	EOF
+	GIT_TRACE_PACKET="$PWD/log" \
+	test-tool bundle-uri \
+		ls-remote \
+		"$T5730_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+
+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 with list" '
+	test_when_finished "rm -f log" &&
+
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle1.uri "$T5730_BUNDLE_URI_ESCAPED-1.bdl" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle2.uri "$T5730_BUNDLE_URI_ESCAPED-2.bdl" &&
+	test_config -C "$T5730_PARENT" \
+		bundle.bundle3.uri "$T5730_BUNDLE_URI_ESCAPED-3.bdl" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "bundle1"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-1.bdl
+	[bundle "bundle2"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-2.bdl
+	[bundle "bundle3"]
+		uri = $T5730_BUNDLE_URI_ESCAPED-3.bdl
 	EOF
 	GIT_TRACE_PACKET="$PWD/log" \
 	test-tool bundle-uri \
-- 
gitgitgadget



* [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51   ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-11-29  1:03     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The yet-to-be-introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

To enable this setting by default in the correct tests, add a
GIT_TEST_BUNDLE_URI environment variable.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
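Reviewer note (not part of the commit): the gating condition added to
transport_get_remote_bundle_uri() can be read as the small predicate
below. This is only a sketch; the function and parameter names are
hypothetical, and it assumes the config lookup reports "not set" with a
non-zero return, as the `|| !value` in the hunk suggests.

```c
/*
 * Hypothetical predicate mirroring the check in the patch.
 *
 * test_env:       GIT_TEST_BUNDLE_URI parsed as a boolean
 * config_missing: non-zero if transfer.bundleURI is not set at all
 * config_value:   the boolean value when it is set
 */
static int bundle_uri_enabled(int test_env, int config_missing,
			      int config_value)
{
	/* The test environment variable overrides the config. */
	if (test_env)
		return 1;

	/* Otherwise the feature is opt-in: unset means disabled. */
	return !config_missing && config_value;
}
```

So an unset transfer.bundleURI and an explicit 'false' behave the same,
and only the test suite's environment variable bypasses the config.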
 Documentation/config/transfer.txt     |  6 ++++++
 t/lib-t5730-protocol-v2-bundle-uri.sh |  3 +++
 transport.c                           | 10 +++++++---
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index 264812cca4d..c3ac767d1e4 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -115,3 +115,9 @@ transfer.unpackLimit::
 transfer.advertiseSID::
 	Boolean. When true, client and server processes will advertise their
 	unique session IDs to their remote counterpart. Defaults to false.
+
+transfer.bundleURI::
+	When `true`, local `git clone` commands will request bundle
+	information from the remote server (if advertised) and download
+	bundles before continuing the clone through the Git protocol.
+	Defaults to `false`.
diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
index 000fcc5e20b..872bc39ad1b 100644
--- a/t/lib-t5730-protocol-v2-bundle-uri.sh
+++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
@@ -1,5 +1,8 @@
 # Included from t573*-protocol-v2-bundle-uri-*.sh
 
+GIT_TEST_BUNDLE_URI=1
+export GIT_TEST_BUNDLE_URI
+
 T5730_PARENT=
 T5730_URI=
 T5730_BUNDLE_URI=
diff --git a/transport.c b/transport.c
index 86460f5be28..b33180226ae 100644
--- a/transport.c
+++ b/transport.c
@@ -1538,6 +1538,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 
 int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 {
+	int value = 0;
 	const struct transport_vtable *vtable = transport->vtable;
 
 	/* Check config only once. */
@@ -1545,10 +1546,13 @@ int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 		return 0;
 
 	/*
-	 * This is intentionally below the transport.injectBundleURI,
-	 * we want to be able to inject into protocol v0, or into the
-	 * dialog of a server who doesn't support this.
+	 * Don't use bundle-uri at all, if configured not to. Only proceed
+	 * if GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
 	 */
+	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
+	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
+		return 0;
+
 	if (!vtable->get_bundle_uri) {
 		if (quiet)
 			return -1;
-- 
gitgitgadget



* [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path()
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-16 19:51   ` Derrick Stolee via GitGitGadget
  2022-11-29  1:03     ` Victoria Dye
  2022-12-02 18:32     ` Ævar Arnfjörð Bjarmason
  2022-11-16 19:51   ` [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  9 siblings, 2 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.

Re-add the method, this time in strbuf.c, but with a new name:
strbuf_strip_file_from_path(). The method requirements are slightly
modified to allow a trailing slash, in which case nothing is done, which
makes the name change valuable. The return value is the number of bytes
removed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
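Reviewer note (not part of the commit): the intended behavior can be
sketched on a plain NUL-terminated string. This is a simplified
stand-in, assuming '/' is the only separator; it does not use
offset_1st_component() or find_last_dir_sep(), and the name
strip_file_from_path() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Truncate 'path' just after its last '/', or to the empty string if
 * it contains no '/'. Returns the number of bytes removed.
 */
static size_t strip_file_from_path(char *path)
{
	size_t len = strlen(path);
	char *sep = strrchr(path, '/');
	size_t new_len = sep ? (size_t)(sep - path) + 1 : 0;

	path[new_len] = '\0';
	return len - new_len;
}
```

This reproduces the two examples documented in strbuf.h:
"/path/to/file" becomes "/path/to/" (4 bytes removed) and
"/path/to/dir/" is left alone (0 bytes removed).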
 strbuf.c |  9 +++++++++
 strbuf.h | 12 ++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 0890b1405c5..8d1e2e8bb61 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1200,3 +1200,12 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 	free(path2);
 	return res;
 }
+
+size_t strbuf_strip_file_from_path(struct strbuf *buf)
+{
+	size_t len = buf->len;
+	size_t offset = offset_1st_component(buf->buf);
+	char *path_sep = find_last_dir_sep(buf->buf + offset);
+	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
+	return len - buf->len;
+}
diff --git a/strbuf.h b/strbuf.h
index 76965a17d44..4822b713786 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -664,6 +664,18 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
 int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			      const char *const *env);
 
+/*
+ * Remove the filename from the provided path string. If the path
+ * contains a trailing separator, then the path is considered a directory
+ * and nothing is modified. Returns the number of characters removed from
+ * the path.
+ *
+ * Examples:
+ * - "/path/to/file" -> "/path/to/" (returns: 4)
+ * - "/path/to/dir/" -> "/path/to/dir/" (returns: 0)
+ */
+size_t strbuf_strip_file_from_path(struct strbuf *buf);
+
 void strbuf_add_lines(struct strbuf *sb,
 		      const char *prefix,
 		      const char *buf,
-- 
gitgitgadget



* [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51   ` Derrick Stolee via GitGitGadget
  2022-11-29  1:25     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Bundle providers may want to distribute their bundle data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles.
This allows easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
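Reviewer note (not part of the commit): a rough sketch of how a
relative 'uri' value combines with a base that ends in a slash. This is
NOT Git's relative_url(); resolve_bundle_uri() is a hypothetical helper
that only handles leading "../" components and treats any value
containing "://" as absolute. It also does not stop "../" from climbing
past the host, which a real implementation would need to consider.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string; the caller frees it. */
static char *resolve_bundle_uri(const char *base, const char *rel)
{
	size_t base_len = strlen(base);
	size_t rel_len;
	char *out;

	if (strstr(rel, "://")) {
		/* Absolute URI: ignore the base entirely. */
		base_len = 0;
	} else {
		while (!strncmp(rel, "../", 3)) {
			/* Drop the trailing '/', then the last component. */
			if (base_len && base[base_len - 1] == '/')
				base_len--;
			while (base_len && base[base_len - 1] != '/')
				base_len--;
			rel += 3;
		}
	}

	rel_len = strlen(rel);
	out = malloc(base_len + rel_len + 1);
	if (!out)
		return NULL;
	memcpy(out, base, base_len);
	memcpy(out + base_len, rel, rel_len + 1); /* includes the NUL */
	return out;
}
```

For a base of "https://cdn.example.com/repo/bundles/" (an example URL,
not one from this series), "bundle.bdl" lands next to the list and
"../bundle.bdl" lands one directory up, so a list can move between CDNs
without its entries being rewritten.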
 bundle-uri.c                | 16 ++++++++++-
 bundle-uri.h                |  9 +++++++
 t/helper/test-bundle-uri.c  |  2 ++
 t/t5750-bundle-uri-parse.sh | 54 +++++++++++++++++++++++++++++++++++++
 transport.c                 |  3 +++
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 3469f1aaa98..ab91bb32e9b 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -7,6 +7,7 @@
 #include "hashmap.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "remote.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -49,6 +50,7 @@ void clear_bundle_list(struct bundle_list *list)
 
 	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
 	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+	free(list->baseURI);
 }
 
 int for_all_bundles_in_list(struct bundle_list *list,
@@ -163,7 +165,7 @@ static int bundle_list_update(const char *key, const char *value,
 	if (!strcmp(subkey, "uri")) {
 		if (bundle->uri)
 			return -1;
-		bundle->uri = xstrdup(value);
+		bundle->uri = relative_url(list->baseURI, value, NULL);
 		return 0;
 	}
 
@@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
+	if (!list->baseURI) {
+		struct strbuf baseURI = STRBUF_INIT;
+		strbuf_addstr(&baseURI, uri);
+
+		/*
+		 * If the URI does not end with a trailing slash, then
+		 * remove the filename portion of the path. This is
+		 * important for relative URIs.
+		 */
+		strbuf_strip_file_from_path(&baseURI);
+		list->baseURI = strbuf_detach(&baseURI, NULL);
+	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
diff --git a/bundle-uri.h b/bundle-uri.h
index 357111ecce8..7905e56732c 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -61,6 +61,15 @@ struct bundle_list {
 	int version;
 	enum bundle_list_mode mode;
 	struct hashmap bundles;
+
+	/**
+	 * The baseURI of a bundle_list is used as the base for any
+	 * relative URIs advertised by the bundle list at that location.
+	 *
+	 * When the list is generated from a Git server, then use that
+	 * server's location.
+	 */
+	char *baseURI;
 };
 
 void init_bundle_list(struct bundle_list *list);
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index ffb975b7b4f..5aa0b494ce3 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 	init_bundle_list(&list);
 
+	list.baseURI = xstrdup("<uri>");
+
 	switch (mode) {
 	case KEY_VALUE_PAIRS:
 		if (argc != 1)
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index c2fe3f9c5a5..ed5262a8d2b 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -30,6 +30,30 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'bundle_uri_parse_line(): relative URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
 	cat >in <<-\EOF &&
 	=bogus-value
@@ -136,6 +160,36 @@ test_expect_success 'parse config format: just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: relative URIs' '
+	cat >in <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = bundle.bdl
+	[bundle "two"]
+		uri = ../bundle.bdl
+	[bundle "three"]
+		uri = sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'parse config format edge cases: empty key or value' '
 	cat >in1 <<-\EOF &&
 	= bogus-value
diff --git a/transport.c b/transport.c
index b33180226ae..2c4ff0c2023 100644
--- a/transport.c
+++ b/transport.c
@@ -1553,6 +1553,9 @@ int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
 	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
 		return 0;
 
+	if (!transport->bundles->baseURI)
+		transport->bundles->baseURI = xstrdup(transport->url);
+
 	if (!vtable->get_bundle_uri) {
 		if (quiet)
 			return -1;
-- 
gitgitgadget



* [PATCH v2 8/9] bundle-uri: download bundles from an advertised list
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51   ` Derrick Stolee via GitGitGadget
  2022-11-29  1:51     ` Victoria Dye
  2022-11-16 19:51   ` [PATCH v2 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  9 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 command. To actually
download and unbundle the advertised bundles, we need a different
mechanism.

Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.

In an identical way to fetch_bundle_uri(), the bundles are unbundled
after all of the bundle lists have been expanded and all necessary URIs
have been downloaded.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
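Reviewer note (not part of the commit): the "any" versus "all"
semantics described above can be illustrated with the sketch below. It
is not download_bundle_list(); the names are hypothetical, the download
step is a stub, and the choice to abort an "all" list on the first
failure is an assumption of this sketch rather than a statement about
what the series does.

```c
#include <stddef.h>
#include <string.h>

enum list_mode { MODE_ALL, MODE_ANY };

/* Download callback: returns 0 on success, non-zero on failure. */
typedef int (*download_fn)(const char *uri);

/* Stub for illustration: any URI containing "missing" fails. */
static int fake_download(const char *uri)
{
	return strstr(uri, "missing") ? -1 : 0;
}

/*
 * Returns 0 when the list is satisfied: one successful download in
 * "any" mode, every download successful in "all" mode.
 */
static int download_list(const char **uris, size_t nr,
			 enum list_mode mode, download_fn download,
			 size_t *nr_downloaded)
{
	size_t i;

	*nr_downloaded = 0;
	for (i = 0; i < nr; i++) {
		if (download(uris[i])) {
			if (mode == MODE_ALL)
				return -1;
			continue; /* "any": try the next URI */
		}
		(*nr_downloaded)++;
		if (mode == MODE_ANY)
			return 0; /* one bundle is enough */
	}

	/* Reaching the end in "any" mode means nothing succeeded. */
	return mode == MODE_ALL ? 0 : -1;
}
```

Given a list whose first URI fails and whose next two succeed, "any"
mode stops after one successful download, while "all" mode reports
failure in this sketch.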
 bundle-uri.c | 21 +++++++++++++++++++++
 bundle-uri.h | 11 +++++++++++
 2 files changed, 32 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index ab91bb32e9b..5914d220c43 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -577,6 +577,27 @@ cleanup:
 	return result;
 }
 
+int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
+{
+	int result;
+	struct bundle_list global_list;
+
+	init_bundle_list(&global_list);
+
+	/* If a bundle is added to this global list, then it is required. */
+	global_list.mode = BUNDLE_MODE_ALL;
+
+	if ((result = download_bundle_list(r, list, &global_list, 0)))
+		goto cleanup;
+
+	result = unbundle_all_bundles(r, &global_list);
+
+cleanup:
+	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
+	clear_bundle_list(&global_list);
+	return result;
+}
+
 /**
  * API for serve.c.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 7905e56732c..a75b68d2f5a 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -102,6 +102,17 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * Given a bundle list that was already advertised (likely by the
+ * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
+ * bundles according to the bundle strategy of that list.
+ *
+ * Returns non-zero if no bundle information is found at the given 'uri'.
+ */
+int fetch_bundle_list(struct repository *r,
+		      const char *uri,
+		      struct bundle_list *list);
+
 /**
  * API for serve.c.
  */
-- 
gitgitgadget



* [PATCH v2 9/9] clone: unbundle the advertised bundles
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (7 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-11-16 19:51   ` Derrick Stolee via GitGitGadget
  2022-11-29  1:59     ` Victoria Dye
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  9 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-11-16 19:51 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c  | 26 +++++++++++++++++----
 t/t5601-clone.sh | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 5 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 22b1e506452..09f10477ed6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1267,11 +1267,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
-	/*
-	 * Populate transport->got_remote_bundle_uri and
-	 * transport->bundle_uri. We might get nothing.
-	 */
-	transport_get_remote_bundle_uri(transport, 1);
+	if (!bundle_uri) {
+		/*
+		* Populate transport->got_remote_bundle_uri and
+		* transport->bundle_uri. We might get nothing.
+		*/
+		transport_get_remote_bundle_uri(transport, 1);
+
+		if (transport->bundles &&
+		    hashmap_get_size(&transport->bundles->bundles)) {
+			/* At this point, we need the_repository to match the cloned repo. */
+			if (repo_init(the_repository, git_dir, work_tree))
+				warning(_("failed to initialize the repo, skipping bundle URI"));
+			if (fetch_bundle_list(the_repository,
+					      remote->url[0],
+					      transport->bundles))
+				warning(_("failed to fetch advertised bundles"));
+		} else {
+			clear_bundle_list(transport->bundles);
+			FREE_AND_NULL(transport->bundles);
+		}
+	}
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 45f0803ed4d..d1d8139751e 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -795,6 +795,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
 	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
 '
 
+test_expect_success 'auto-discover bundle URI from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo2.git repo2 &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
+test_expect_success 'auto-discover multiple bundles from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	test_commit -C src new &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.mode all &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.new.uri "$HTTPD_URL/new.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo3.git repo3 &&
+
+	# We should fetch _both_ bundles
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
gitgitgadget


* Re: [PATCH v2 2/9] bundle-uri client: add minimal NOOP client
  2022-11-16 19:51   ` [PATCH v2 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-29  0:57     ` Victoria Dye
  2022-12-02 15:00       ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  0:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> 
> Set up all the needed client parts of the "bundle-uri" protocol
> extension, without actually doing anything with the bundle URIs.
> 
> I.e. if the server says it supports "bundle-uri" we'll issue a
> command=bundle-uri after command=ls-refs when we're cloning. We'll
> parse the returned output using the code already tested for in
> t5750-bundle-uri-parse.sh.
> 
> What we aren't doing is actually acting on that data, i.e. downloading
> the bundle(s) before we get to doing the command=fetch, and adjusting
> our negotiation dialog appropriately. I'll do that in subsequent
> commits.

Makes sense.

> 
> There's a question of what level of encapsulation we should use here,
> I've opted to use connect.h in clone.c, but we could also e.g. make
> transport_get_remote_refs() invoke this, i.e. make it implicitly get
> the bundle-uri list for later steps.

I'm not sure I follow what this sentence is saying. Looking at the
implementation below [1], you've added a call to
'transport_get_remote_bundle_uri()' in 'clone.c', but that's defined in
'transport.h' (which is already included in 'clone.c'). Why is 'connect.h'
needed at all?

> 
> This approach means that we don't "support" this in "git fetch" for
> now. I'm starting with the case of initial clones, although as noted
> in preceding commits to the protocol documentation nothing about this
> approach precludes getting bundles on incremental fetches.

This explanation seems more complicated than necessary. I think it's
sufficient to say "The no-op client is initially used only in 'clone' to
test the basic functionality. The bundle URI client will be integrated into
fetch, pull, etc. in later patches".

> 
> For the t5732-protocol-v2-bundle-uri-http.sh it's not easy to set
> environment variables for git-upload-pack (it's started by Apache), so
> let's skip the test under T5730_HTTP, and add unused T5730_{FILE,GIT}
> prerequisites for consistency and future use.

"skip the test" doesn't explain *which* test is skipped (and it doesn't look
like you skip all of them). I think you're referring to "bad client with
$T5730_PROTOCOL:// using protocol v2" specifically?

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/clone.c                        |   7 ++
>  bundle-uri.c                           |   4 +
>  connect.c                              |  47 ++++++++
>  remote.h                               |   5 +
>  t/lib-t5730-protocol-v2-bundle-uri.sh  | 148 +++++++++++++++++++++++++
>  t/t5730-protocol-v2-bundle-uri-file.sh |  36 ++++++
>  t/t5731-protocol-v2-bundle-uri-git.sh  |  17 +++
>  t/t5732-protocol-v2-bundle-uri-http.sh |  17 +++
>  transport-helper.c                     |  13 +++
>  transport-internal.h                   |   7 ++
>  transport.c                            |  55 +++++++++
>  transport.h                            |  19 ++++
>  12 files changed, 375 insertions(+)
>  create mode 100644 t/lib-t5730-protocol-v2-bundle-uri.sh
>  create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
>  create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
>  create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh
> 
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 547d6464b3c..edf98295af2 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -27,6 +27,7 @@
>  #include "iterator.h"
>  #include "sigchain.h"
>  #include "branch.h"
> +#include "connect.h"
>  #include "remote.h"
>  #include "run-command.h"
>  #include "connected.h"
> @@ -1266,6 +1267,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (refs)
>  		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
>  
> +	/*
> +	 * Populate transport->got_remote_bundle_uri and
> +	 * transport->bundle_uri. We might get nothing.
> +	 */
> +	transport_get_remote_bundle_uri(transport);

[1] 

> +
>  	if (mapped_refs) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
>  
> diff --git a/bundle-uri.c b/bundle-uri.c
> index 32022595964..2201b604b11 100644
> --- a/bundle-uri.c
> +++ b/bundle-uri.c
> @@ -571,6 +571,10 @@ int bundle_uri_advertise(struct repository *r, struct strbuf *value)
>  {
>  	static int advertise_bundle_uri = -1;
>  
> +	if (value &&
> +	    git_env_bool("GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE", 0))
> +		strbuf_addstr(value, "test-unknown-capability-value");

It looks like 'GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE' is being used
to "mock" server responses to test certain behavior on the client side. I'm
somewhat uncomfortable with how this mixes test-specific code paths with
application code, and AFAICT nothing similar is done for other
advertise/command functions in protocol v2. Is there a way to set up tests
to intercept the client requests and customize the response? 

> +
>  	if (advertise_bundle_uri != -1)
>  		goto cached;
>  
> diff --git a/connect.c b/connect.c
> index 5ea53deda23..d39effb7492 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -15,6 +15,7 @@
>  #include "version.h"
>  #include "protocol.h"
>  #include "alias.h"
> +#include "bundle-uri.h"
>  
>  static char *server_capabilities_v1;
>  static struct strvec server_capabilities_v2 = STRVEC_INIT;
> @@ -491,6 +492,52 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
>  	}
>  }
>  
> +int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
> +			  struct bundle_list *bundles, int stateless_rpc)
> +{
> +	int line_nr = 1;
> +
> +	/* Assert bundle-uri support */
> +	server_supports_v2("bundle-uri", 1);
> +
> +	/* (Re-)send capabilities */
> +	send_capabilities(fd_out, reader);
> +
> +	/* Send command */
> +	packet_write_fmt(fd_out, "command=bundle-uri\n");
> +	packet_delim(fd_out);
> +
> +	/* Send options */
> +	if (git_env_bool("GIT_TEST_PROTOCOL_BAD_BUNDLE_URI", 0))
> +		packet_write_fmt(fd_out, "test-bad-client\n");

Same comment as on 'GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE'. There's
no precedent that I can find for a test variable like this in 'connect.c', and
"in the middle of client code" doesn't seem like an ideal place for it. 

If there really isn't another way of doing this, could the addition of these
'GIT_TEST' variables and their associated tests be split out into a
dedicated commit? That would at least separate the test code paths from the
application code in the commit history.

> +	packet_flush(fd_out);
> +
> +	/* Process response from server */
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +		const char *line = reader->line;
> +		line_nr++;
> +
> +		if (!bundle_uri_parse_line(bundles, line))
> +			continue;
> +
> +		return error(_("error on bundle-uri response line %d: %s"),
> +			     line_nr, line);
> +	}
> +
> +	if (reader->status != PACKET_READ_FLUSH)
> +		return error(_("expected flush after bundle-uri listing"));
> +
> +	/*
> +	 * Might die(), but obscure enough that that's OK, e.g. in
> +	 * serve.c we'll call BUG() on its equivalent (the
> +	 * PACKET_READ_RESPONSE_END check).
> +	 */
> +	check_stateless_delimiter(stateless_rpc, reader,
> +				  _("expected response end packet after ref listing"));
> +
> +	return 0;

The rest of this looks fine to me.

> +}
> +
>  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     struct transport_ls_refs_options *transport_options,
> diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
> new file mode 100644
> index 00000000000..27294e9c976
> --- /dev/null
> +++ b/t/lib-t5730-protocol-v2-bundle-uri.sh

nit: this set of tests is used in more than just 't5730', so the name is
somewhat misleading. It also seems a bit overly-specific to include
"protocol-v2" in the filename (the tests themselves mention "protocol v2"
when it's relevant). What about something like 'lib-proto-bundle-uri.sh'
(using "proto" to mimic 'lib-proto-disable.sh')?

> @@ -0,0 +1,148 @@
> +# Included from t573*-protocol-v2-bundle-uri-*.sh
> +
> +T5730_PARENT=
> +T5730_URI=
> +T5730_BUNDLE_URI=

Similar to the filename nit - these variables are used in tests outside of
't5730', so their names are not quite accurate to their usage. 

> +# Poor man's URI escaping. Good enough for the test suite whose trash
> +# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
> +# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
> +test_uri_escape() {
> +	sed 's/ /%20/g'
> +}

This is a good opportunity to unify on a single implementation rather than
to have multiple similar ones floating around. Can this be moved into
'test-lib.sh' (or 'test-lib-functions.sh'?), with 't9119' and 't9120'
updated to use the new 'test_uri_escape()'?
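
As a rough sketch of what I mean (the function body is the one from this
patch; the placement in 'test-lib-functions.sh' is only a suggestion, and I
have not checked whether 't9119' and 't9120' could use it unchanged):

```shell
# Sketch of a shared helper, e.g. for t/test-lib-functions.sh (placement is
# an assumption). Reads text on stdin and escapes spaces only; this is not
# a general-purpose URI encoder.
test_uri_escape () {
	sed 's/ /%20/g'
}

# Example: escape a trash-directory-style path that contains a space.
printf '%s\n' "file:///tmp/trash directory/parent" | test_uri_escape
# prints: file:///tmp/trash%20directory/parent
```

Callers would pipe a path or URL through it, the same way the library file in
this patch does.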

> diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
> new file mode 100755
> index 00000000000..89203d3a23c
> --- /dev/null
> +++ b/t/t5730-protocol-v2-bundle-uri-file.sh
> @@ -0,0 +1,36 @@
> +#!/bin/sh
> +
> +test_description="Test bundle-uri with protocol v2 and 'file://' transport"
> +
> +TEST_NO_CREATE_REPO=1
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +# Test protocol v2 with 'file://' transport
> +#
> +T5730_PROTOCOL=file
> +. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
> +
> +test_expect_success "unknown capability value with $T5730_PROTOCOL:// using protocol v2" '

Why is this test only run for the 'file://' transport protocol? And, why
isn't it in 'lib-t5730-protocol-v2-bundle-uri.sh'? If nothing else, moving
this test to that file (even if it needs to be conditional on a specific
protocol) puts the '$T5730_PARENT', '$T5730_BUNDLE_URI_ESCAPED' and
'$T5730_URI' variables in scope for better readability.

> +	test_when_finished "rm -f log" &&
> +
> +	test_config -C "$T5730_PARENT" \
> +		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
> +
> +	GIT_TRACE_PACKET="$PWD/log" \
> +	GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE=true \
> +	git \
> +		-c protocol.version=2 \
> +		ls-remote --symref "$T5730_URI" \
> +		>actual 2>err &&

This test never does anything with the content of 'actual' or 'err'. Should
it? If not, they probably shouldn't be redirected from stdout/stderr, since
the messages might be valuable when debugging.

> +
> +	# Server responded using protocol v2
> +	grep "< version 2" log &&
> +
> +	grep "> bundle-uri=test-unknown-capability-value" log
> +'
> +
> +test_done
> diff --git a/transport-helper.c b/transport-helper.c
> index e95267a4ab5..3ea7c2bb5ad 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
>  	return ret;
>  }
>  
> +static int get_bundle_uri(struct transport *transport)
> +{
> +	get_helper(transport);
> +
> +	if (process_connect(transport, 0)) {
> +		do_take_over(transport);
> +		return transport->vtable->get_bundle_uri(transport);
> +	}
> +
> +	return -1;
> +}
> +
>  static struct transport_vtable vtable = {
>  	.set_option	= set_helper_option,
>  	.get_refs_list	= get_refs_list,
> +	.get_bundle_uri = get_bundle_uri,
>  	.fetch_refs	= fetch_refs,
>  	.push_refs	= push_refs,
>  	.connect	= connect_helper,
> diff --git a/transport-internal.h b/transport-internal.h
> index c4ca0b733ac..90ea749e5cf 100644
> --- a/transport-internal.h
> +++ b/transport-internal.h
> @@ -26,6 +26,13 @@ struct transport_vtable {
>  	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
>  				     struct transport_ls_refs_options *transport_options);
>  
> +	/**
> +	 * Populates the remote side's bundle-uri under protocol v2,
> +	 * if the "bundle-uri" capability was advertised. Returns 0 if
> +	 * OK, negative values on error.

Double-checked the call stack to make sure this is true, and it is (the
return value is always either a hardcoded '0' or 'error()').

> +	 */
> +	int (*get_bundle_uri)(struct transport *transport);
> +
>  	/**
>  	 * Fetch the objects for the given refs. Note that this gets
>  	 * an array, and should ignore the list structure.
> diff --git a/transport.c b/transport.c
> index e7b97194c10..a020adc1f56 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -902,6 +922,7 @@ static int disconnect_git(struct transport *transport)
>  
>  static struct transport_vtable taken_over_vtable = {
>  	.get_refs_list	= get_refs_via_connect,
> +	.get_bundle_uri = get_bundle_uri,
>  	.fetch_refs	= fetch_refs_via_pack,
>  	.push_refs	= git_transport_push,
>  	.disconnect	= disconnect_git
> @@ -1054,6 +1075,7 @@ static struct transport_vtable bundle_vtable = {
>  
>  static struct transport_vtable builtin_smart_vtable = {
>  	.get_refs_list	= get_refs_via_connect,
> +	.get_bundle_uri = get_bundle_uri,
>  	.fetch_refs	= fetch_refs_via_pack,
>  	.push_refs	= git_transport_push,
>  	.connect	= connect_git,

I think I follow what this is doing (if I'm reading correctly) - add
'get_bundle_uri' to all of the vtables (including in 'transport-helper.c')
*except* 'bundle_vtable', since we aren't requesting bundle URIs from an
already-identified bundle URI.

> @@ -1482,6 +1507,34 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
>  	return rc;
>  }
>  
> +int transport_get_remote_bundle_uri(struct transport *transport)
> +{
> +	const struct transport_vtable *vtable = transport->vtable;
> +
> +	/* Check config only once. */
> +	if (transport->got_remote_bundle_uri++)
> +		return 0;

We're only ever going to read the bundle list once per command (or at least
once per 'transport' instance), so if 'transport_get_remote_bundle_uri()'
has already been called, we can safely assume the correct results (if any)
are in the 'transport' structure.

> +
> +	/*
> +	 * "Support" protocol v0 and v2 without bundle-uri support by
> +	 * silently degrading to a NOOP.
> +	 */
> +	if (!server_supports_v2("bundle-uri", 0))
> +		return 0;
> +
> +	/*
> +	 * This is intentionally below the transport.injectBundleURI,
> +	 * we want to be able to inject into protocol v0, or into the
> +	 * dialog of a server who doesn't support this.
> +	 */
> +	if (!vtable->get_bundle_uri)
> +		return error(_("bundle-uri operation not supported by protocol"));
> +
> +	if (vtable->get_bundle_uri(transport) < 0)

As you noted earlier, 'get_bundle_uri()' always returns a value <= 0, so
this check works.

> +		return error(_("could not retrieve server-advertised bundle-uri list"));
> +	return 0;
> +}
> +
>  void transport_unlock_pack(struct transport *transport, unsigned int flags)
>  {
>  	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER)


* Re: [PATCH v2 3/9] bundle-uri client: add helper for testing server
  2022-11-16 19:51   ` [PATCH v2 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-29  0:59     ` Victoria Dye
  2022-12-02 15:28       ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  0:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>  <avarab@gmail.com>
> 
> Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
> for issuing protocol v2 "bundle-uri" commands to a server, and to the
> parsing routines in bundle-uri.c.
> 
> Since in the "git clone" case we'll have already done the handshake(),
> but not here, introduce a "got_advertisement" state along with
> "got_remote_heads". It seems to me that the "got_remote_heads" is
> badly named in the first place, and the whole logic of eagerly getting
> ls-refs on handshake() or not could be refactored somewhat, but let's
> not do that now, and instead just add another self-documenting state
> variable.

Maybe I'm missing something, but why not just rename 'got_remote_heads' to
something like 'finished_handshake' rather than adding 'got_advertisement'
(since, AFAICT, it's always identical in value to 'got_remote_heads')?

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>

This commit also introduces the 'quiet' flag to
'transport_get_remote_bundle_uri()', but there's no mention in the commit
message. The message also doesn't explain the changes to existing tests
(adding 'bundle.*' settings, swapping out 'git ls-remote' for the new
'test-tool bundle-uri ls-remote' in existing tests, etc.). I think these are
all relevant to fully understanding the patch, so could you mention them in
your next reroll?

> ---
>  builtin/clone.c                       |  2 +-
>  t/helper/test-bundle-uri.c            | 46 +++++++++++++++++++
>  t/lib-t5730-protocol-v2-bundle-uri.sh | 63 ++++++++++++++++++++++-----
>  transport.c                           | 43 ++++++++++++++----
>  transport.h                           |  6 ++-
>  5 files changed, 139 insertions(+), 21 deletions(-)
> 
> diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
> index 25afd393428..ffb975b7b4f 100644
> --- a/t/helper/test-bundle-uri.c
> +++ b/t/helper/test-bundle-uri.c
> @@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
>  		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
>  	if (!strcmp(argv[1], "parse-config"))
>  		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
> +	if (!strcmp(argv[1], "ls-remote"))
> +		return cmd_ls_remote(argc - 1, argv + 1);

With this helper being added, I'm not sure if/why 'clone' was needed to test
the bundle URIs in patch 2 (I assumed integrating with a command was the
only way to test it, which is why I didn't mention this in my review [1]).
In the spirit of having commits avoid "doing more than one thing" could
these patches be reorganized into something like:

1. Add the no-op client & some basic tests around fetching the bundle URI
   list using this test helper.
2. Add the 'transport_get_remote_bundle_uri()' call to 'clone()' with
   clone-specific tests.

It probably wouldn't make the patches much shorter, but it would help avoid
the churn of test changes & changing assumptions around 'quiet' &
'got_advertisement' in this patch.

[1] https://lore.kernel.org/git/ca410bed-e8d1-415f-5235-b64fe18bed27@github.com/

>  	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
>  
>  usage:
> diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
> index 27294e9c976..c327544641b 100644
> --- a/t/lib-t5730-protocol-v2-bundle-uri.sh
> +++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
> @@ -34,7 +34,9 @@ esac
>  test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
>  	git init "$T5730_PARENT" &&
>  	test_commit -C "$T5730_PARENT" one &&
> -	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
> +	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true &&
> +	git -C "$T5730_PARENT" config bundle.version 1 &&
> +	git -C "$T5730_PARENT" config bundle.mode all

Why are these config settings added here? I don't see them used anywhere.

> diff --git a/transport.c b/transport.c
> index a020adc1f56..86460f5be28 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -371,6 +373,33 @@ static int get_bundle_uri(struct transport *transport)
>  		init_bundle_list(transport->bundles);
>  	}
>  
> +	if (!data->got_advertisement) {
> +		struct ref *refs;
> +		struct git_transport_data *data = transport->data;
> +		enum protocol_version version;
> +
> +		refs = handshake(transport, 0, NULL, 0);
> +		version = data->version;
> +
> +		switch (version) {
> +		case protocol_v2:
> +			assert(!refs);
> +			break;
> +		case protocol_v0:
> +		case protocol_v1:
> +		case protocol_unknown_version:
> +			assert(refs);
> +			break;

Why were these 'refs' assertions added? What are they intended to validate?

> +		}
> +	}
> +
> +	/*
> +	 * "Support" protocol v0 and v2 without bundle-uri support by
> +	 * silently degrading to a NOOP.
> +	 */
> +	if (!server_supports_v2("bundle-uri", 0))
> +		return 0;

I was originally confused as to why this was moved out of
'transport_get_remote_bundle_uri()', but it looks like the answer is "we
were previously relying on the handshake being done by the time we called
'transport_get_remote_bundle_uri()', but we can't anymore."

> +
>  	packet_reader_init(&reader, data->fd[0], NULL, 0,
>  			   PACKET_READ_CHOMP_NEWLINE |
>  			   PACKET_READ_GENTLE_ON_EOF);


* Re: [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config
  2022-11-16 19:51   ` [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
@ 2022-11-29  1:00     ` Victoria Dye
  0 siblings, 0 replies; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:00 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> Implement the "bundle-uri" protocol v2 capability by populating the
> key=value packet lines from the local Git config. The list of bundles is
> provided from the keys beginning with "bundle.".
> 
> In the future, we may want to filter this list to be more specific to
> the exact known keys that the server intends to share, but for
> flexibility at the moment we will assume that the config values are
> well-formed.

This patch looks good - the implementation is pretty straightforward ("send
a config value if its key matches 'bundle.*'), and the tests cover both
single and multiple bundle URIs returned by the server.
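
To spell out my reading of "send a config value if its key matches
'bundle.*'", here is a small shell approximation. It is not the code in this
patch: the helper name is made up, and it assumes no key or subsection name
contains a space (the 'sed' only rewrites the first space on each line).

```shell
# Hypothetical helper (not part of the patch): print every bundle.* entry of
# the config file given as $1 as one "key=value" line, roughly the shape of
# the packet lines the server is described as sending.
bundle_config_as_pairs () {
	# 'git config --get-regexp' prints "<key> <value>" per match;
	# turn the first space into '=' to get key=value.
	git config --file "$1" --get-regexp '^bundle\.' | sed 's/ /=/'
}
```

Running it on a config file holding the '[bundle]' and '[bundle "only"]'
sections from the test below should print one 'bundle.version',
'bundle.mode' and 'bundle.only.uri' line each.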

> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle-uri.c                          | 16 +++++++++++-
>  t/lib-t5730-protocol-v2-bundle-uri.sh | 35 +++++++++++++++++++++++++++
>  2 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
> index c327544641b..000fcc5e20b 100644
> --- a/t/lib-t5730-protocol-v2-bundle-uri.sh
> +++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
> @@ -158,6 +158,8 @@ test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
>  	[bundle]
>  		version = 1
>  		mode = all
> +	[bundle "only"]
> +		uri = $T5730_BUNDLE_URI_ESCAPED


Ah, okay, this explains why the 'bundle.only.uri' config was added to the
test in the last patch [1]. But, if the config is only being served in this
patch, shouldn't that test change be moved to this patch? 

[1] https://lore.kernel.org/git/c3269a24b5780023cbb4d173cb9cfb10c5a4b0d8.1668628303.git.gitgitgadget@gmail.com/

>  	EOF
>  	GIT_TRACE_PACKET="$PWD/log" \
>  	test-tool bundle-uri \


* Re: [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting
  2022-11-16 19:51   ` [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-11-29  1:03     ` Victoria Dye
  2022-12-02 15:38       ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>  <avarab@gmail.com>
> 
> The yet-to-be introduced client support for bundle-uri will always
> fall back on a full clone, but we'd still like to be able to ignore a
> server's bundle-uri advertisement entirely.
> 
> The new transfer.bundleURI config option defaults to 'false', but a user
> can set it to 'true' to enable checking for bundle URIs from the origin
> Git server using protocol v2.

Thanks for adding this, an "opt-in" approach seems reasonable for
introducing this feature.

> 
> To enable this setting by default in the correct tests, add a
> GIT_TEST_BUNDLE_URI environment variable.

This makes sense. I'm less concerned with this environment variable than
those in patch 2 [1], since it's in line with the test variables that
enable/disable whole features ('GIT_TEST_SPLIT_INDEX',
'GIT_TEST_COMMIT_GRAPH', etc.). 

The only feedback I can think of would be that this patch could be
moved to earlier in the series (that is, immediately after creating
'transport_get_remote_bundle_uri()'). That said, I don't feel strongly
either way.

[1] https://lore.kernel.org/git/ca410bed-e8d1-415f-5235-b64fe18bed27@github.com/

> 
> Co-authored-by: Derrick Stolee <derrickstolee@github.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>

The implementation and documentation below align with the commit message.
Looks good!

> ---
>  Documentation/config/transfer.txt     |  6 ++++++
>  t/lib-t5730-protocol-v2-bundle-uri.sh |  3 +++
>  transport.c                           | 10 +++++++---
>  3 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
> index 264812cca4d..c3ac767d1e4 100644
> --- a/Documentation/config/transfer.txt
> +++ b/Documentation/config/transfer.txt
> @@ -115,3 +115,9 @@ transfer.unpackLimit::
>  transfer.advertiseSID::
>  	Boolean. When true, client and server processes will advertise their
>  	unique session IDs to their remote counterpart. Defaults to false.
> +
> +transfer.bundleURI::
> +	When `true`, local `git clone` commands will request bundle
> +	information from the remote server (if advertised) and download
> +	bundles before continuing the clone through the Git protocol.
> +	Defaults to `false`.
> diff --git a/t/lib-t5730-protocol-v2-bundle-uri.sh b/t/lib-t5730-protocol-v2-bundle-uri.sh
> index 000fcc5e20b..872bc39ad1b 100644
> --- a/t/lib-t5730-protocol-v2-bundle-uri.sh
> +++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
> @@ -1,5 +1,8 @@
>  # Included from t573*-protocol-v2-bundle-uri-*.sh
>  
> +GIT_TEST_BUNDLE_URI=1
> +export GIT_TEST_BUNDLE_URI
> +
>  T5730_PARENT=
>  T5730_URI=
>  T5730_BUNDLE_URI=
> diff --git a/transport.c b/transport.c
> index 86460f5be28..b33180226ae 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1538,6 +1538,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
>  
>  int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
>  {
> +	int value = 0;
>  	const struct transport_vtable *vtable = transport->vtable;
>  
>  	/* Check config only once. */
> @@ -1545,10 +1546,13 @@ int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
>  		return 0;
>  
>  	/*
> -	 * This is intentionally below the transport.injectBundleURI,
> -	 * we want to be able to inject into protocol v0, or into the
> -	 * dialog of a server who doesn't support this.
> +	 * Don't use bundle-uri at all, if configured not to. Only proceed
> +	 * if GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
>  	 */
> +	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
> +	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
> +		return 0;
> +
>  	if (!vtable->get_bundle_uri) {
>  		if (quiet)
>  			return -1;



* Re: [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path()
  2022-11-16 19:51   ` [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
@ 2022-11-29  1:03     ` Victoria Dye
  2022-12-02 15:40       ` Derrick Stolee
  2022-12-02 18:32     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:03 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The strbuf_parent_directory() method was added as a static method in
> contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
> config and starts maintenance, 2021-12-03) and then removed in
> 65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
> there is a need for a similar method in the bundle URI feature.
> 
> Re-add the method, this time in strbuf.c, but with a new name:
> strbuf_strip_file_from_path(). The method requirements are slightly
> modified to allow a trailing slash, in which case nothing is done, which
> makes the name change valuable. The return value is the number of bytes
> removed.

*Extremely* minor point, but why return anything at all? The call in the
next patch doesn't use the return value, and some similar-in-spirit 'strbuf'
functions (like 'strbuf_trim()') return nothing. 

I don't think this is worth changing if you can imagine using that return
value for something eventually; just wanted to point it out as something to
(optionally) consider if you re-roll for something else anyway.
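
For anyone skimming, the trailing-slash behavior described in the commit
message can be approximated in shell as below. This is only an analogue with
a made-up function name: it ignores 'offset_1st_component()' and
platform-specific directory separators, and it does not compute the byte
count that the C function returns.

```shell
# Hypothetical shell analogue of strbuf_strip_file_from_path(): drop
# everything after the last '/'. A path that already ends in '/' is treated
# as a directory and left unchanged.
strip_file_from_path () {
	printf '%s\n' "$1" | sed 's|[^/]*$||'
}

strip_file_from_path "/path/to/file"  # prints "/path/to/"
strip_file_from_path "/path/to/dir/"  # prints "/path/to/dir/"
```

The two calls match the examples in the 'strbuf.h' comment below.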

> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  strbuf.c |  9 +++++++++
>  strbuf.h | 12 ++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/strbuf.c b/strbuf.c
> index 0890b1405c5..8d1e2e8bb61 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1200,3 +1200,12 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>  	free(path2);
>  	return res;
>  }
> +
> +size_t strbuf_strip_file_from_path(struct strbuf *buf)
> +{
> +	size_t len = buf->len;
> +	size_t offset = offset_1st_component(buf->buf);
> +	char *path_sep = find_last_dir_sep(buf->buf + offset);
> +	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
> +	return len - buf->len;
> +}
> diff --git a/strbuf.h b/strbuf.h
> index 76965a17d44..4822b713786 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -664,6 +664,18 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
>  int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>  			      const char *const *env);
>  
> +/*
> + * Remove the filename from the provided path string. If the path
> + * contains a trailing separator, then the path is considered a directory
> + * and nothing is modified. Returns the number of characters removed from
> + * the path.
> + *
> + * Examples:
> + * - "/path/to/file" -> "/path/to/" (returns: 4)
> + * - "/path/to/dir/" -> "/path/to/dir/" (returns: 0)
> + */
> +size_t strbuf_strip_file_from_path(struct strbuf *buf);
> +
>  void strbuf_add_lines(struct strbuf *sb,
>  		      const char *prefix,
>  		      const char *buf,



* Re: [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists
  2022-11-16 19:51   ` [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-11-29  1:25     ` Victoria Dye
  2022-12-02 16:03       ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:25 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> Bundle providers may want to distribute that data across multiple CDNs.
> This might require a change in the base URI, all the way to the domain
> name. If all bundles require an absolute URI in their 'uri' value, then
> every push to a CDN would require altering the table of contents to
> match the expected domain and exact location within it.
> 
> Allow a bundle list to specify a relative URI for the bundles.
> This allows easier distribution of bundle data.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle-uri.c                | 16 ++++++++++-
>  bundle-uri.h                |  9 +++++++
>  t/helper/test-bundle-uri.c  |  2 ++
>  t/t5750-bundle-uri-parse.sh | 54 +++++++++++++++++++++++++++++++++++++
>  transport.c                 |  3 +++
>  5 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/bundle-uri.c b/bundle-uri.c
> @@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
>  		.error_action = CONFIG_ERROR_ERROR,
>  	};
>  
> +	if (!list->baseURI) {
> +		struct strbuf baseURI = STRBUF_INIT;
> +		strbuf_addstr(&baseURI, uri);
> +
> +		/*
> +		 * If the URI does not end with a trailing slash, then
> +		 * remove the filename portion of the path. This is
> +		 * important for relative URIs.
> +		 */
> +		strbuf_strip_file_from_path(&baseURI);
> +		list->baseURI = strbuf_detach(&baseURI, NULL);

Is the 'baseURI' set to the URI of the first bundle (ordered by hash)? If
data is distributed across multiple CDNs, couldn't this be a suboptimal
choice? For example, if the first bundle is on 'A.com', but every other
bundle is on 'B.org'?

> +	}
>  	result = git_config_from_file_with_options(config_to_bundle_list,
>  						   filename, list,
>  						   &opts);
> diff --git a/bundle-uri.h b/bundle-uri.h
> index 357111ecce8..7905e56732c 100644
> --- a/bundle-uri.h
> +++ b/bundle-uri.h
> @@ -61,6 +61,15 @@ struct bundle_list {
>  	int version;
>  	enum bundle_list_mode mode;
>  	struct hashmap bundles;
> +
> +	/**
> +	 * The baseURI of a bundle_list is used as the base for any
> +	 * relative URIs advertised by the bundle list at that location.
> +	 *
> +	 * When the list is generated from a Git server, then use that
> +	 * server's location.

Hmmm, I think I'm missing something with my earlier comment. I thought the
'uri' argument to 'bundle_uri_parse_config_format()' was an individual
bundle's URI? What's the "server's location" in this context?

> +	 */
> +	char *baseURI;
>  };
>  
>  void init_bundle_list(struct bundle_list *list);
> diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
> index ffb975b7b4f..5aa0b494ce3 100644
> --- a/t/helper/test-bundle-uri.c
> +++ b/t/helper/test-bundle-uri.c
> @@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
>  
>  	init_bundle_list(&list);
>  
> +	list.baseURI = xstrdup("<uri>");

Using a hardcoded value here leads to pretty different behavior in
'test-bundle-uri.c' vs. starting with an unset 'list.baseURI' in something
like 'clone'. Why does this need to be set to '<uri>' for the tests?

> +
>  	switch (mode) {
>  	case KEY_VALUE_PAIRS:
>  		if (argc != 1)
> diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
> index c2fe3f9c5a5..ed5262a8d2b 100755
> --- a/t/t5750-bundle-uri-parse.sh
> +++ b/t/t5750-bundle-uri-parse.sh
> @@ -30,6 +30,30 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
>  	test_cmp_config_output expect actual
>  '
>  
> +test_expect_success 'bundle_uri_parse_line(): relative URIs' '
> +	cat >in <<-\EOF &&
> +	bundle.one.uri=bundle.bdl
> +	bundle.two.uri=../bundle.bdl
> +	bundle.three.uri=sub/dir/bundle.bdl
> +	EOF
> +
> +	cat >expect <<-\EOF &&
> +	[bundle]
> +		version = 1
> +		mode = all
> +	[bundle "one"]
> +		uri = <uri>/bundle.bdl
> +	[bundle "two"]
> +		uri = bundle.bdl

This seems a little strange, but it looks like '<uri>/../bundle.bdl'
normalizes to 'bundle.bdl' because '<uri>' is treated like a regular path
element (like a directory). 

Out of curiosity, what would happen if 'bundle.two.uri' was
'../../bundle.bdl'?

> +	[bundle "three"]
> +		uri = <uri>/sub/dir/bundle.bdl
> +	EOF
> +
> +	test-tool bundle-uri parse-key-values in >actual 2>err &&
> +	test_must_be_empty err &&
> +	test_cmp_config_output expect actual
> +'
> +
>  test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
>  	cat >in <<-\EOF &&
>  	=bogus-value


* Re: [PATCH v2 8/9] bundle-uri: download bundles from an advertised list
  2022-11-16 19:51   ` [PATCH v2 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-11-29  1:51     ` Victoria Dye
  0 siblings, 0 replies; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:51 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
> 'git clone', but is not helpful when the clone operation discovers a
> list of URIs from the bundle-uri protocol v2 command. To actually
> download and unbundle the advertised bundles, we need a different
> mechanism.
> 
> Create the new fetch_bundle_list() method which is very similar to
> fetch_bundle_uri() except that it relies on download_bundle_list()
> instead of fetch_bundle_uri_internal(). The download_bundle_list()
> method will recursively call fetch_bundle_uri_internal() if any of the
> advertised URIs serve a bundle list instead of a bundle. This will also
> follow the bundle.list.mode setting from the input list: "any" will
> download only one such URI while "all" will download data from all of
> the URIs.
> 
> In an identical way to fetch_bundle_uri(), the bundles are unbundled
> after all of the bundle lists have been expanded and all necessary URIs.

This explanation is clear and matches the implementation below. I'll admit
it's a bit difficult to wrap my head around what's going on but, from what I
understand, it does what it needs to do to set up for the next patch.

There's no way to test this change in this patch (since
'fetch_bundle_list()' isn't called anywhere yet), but I think that's fine;
making it testable would probably make the patch too long/complicated to
follow.
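
For illustration, the "any" versus "all" behavior described in the commit
message could be sketched as below. This is a simplified, hypothetical
model and not the code in bundle-uri.c: 'demo_mode', 'download_fn',
'demo_download()' and 'download_list()' are made-up names, recursion into
nested bundle lists is left out, and treating a single failed download as
a failure of the whole list in "all" mode is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

enum demo_mode { DEMO_MODE_ALL, DEMO_MODE_ANY };

/* Hypothetical downloader callback: returns 0 on success. */
typedef int (*download_fn)(const char *uri);

/* Demo stub: any URI containing "missing" fails to download. */
static int demo_download(const char *uri)
{
	return strstr(uri, "missing") ? -1 : 0;
}

/*
 * "any": stop after the first URI that downloads successfully.
 * "all": attempt every URI and report failure if any download fails.
 * Returns 0 on success, -1 otherwise; *attempted counts downloads tried.
 */
static int download_list(enum demo_mode mode, const char **uris, size_t nr,
			 download_fn download, size_t *attempted)
{
	int failed = 0, succeeded = 0;
	size_t i;

	*attempted = 0;
	for (i = 0; i < nr; i++) {
		(*attempted)++;
		if (download(uris[i]))
			failed = 1;
		else
			succeeded = 1;
		if (mode == DEMO_MODE_ANY && succeeded)
			break;
	}
	if (mode == DEMO_MODE_ANY)
		return succeeded ? 0 : -1;
	return failed ? -1 : 0;
}
```

With the list { "missing.bundle", "a.bundle", "b.bundle" }, "any" mode stops
after the second entry, while "all" mode tries all three and reports the
failure.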

> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle-uri.c | 21 +++++++++++++++++++++
>  bundle-uri.h | 11 +++++++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/bundle-uri.c b/bundle-uri.c
> index ab91bb32e9b..5914d220c43 100644
> --- a/bundle-uri.c
> +++ b/bundle-uri.c
> @@ -577,6 +577,27 @@ cleanup:
>  	return result;
>  }
>  
> +int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
> +{
> +	int result;
> +	struct bundle_list global_list;
> +
> +	init_bundle_list(&global_list);
> +
> +	/* If a bundle is added to this global list, then it is required. */
> +	global_list.mode = BUNDLE_MODE_ALL;
> +
> +	if ((result = download_bundle_list(r, list, &global_list, 0)))
> +		goto cleanup;
> +
> +	result = unbundle_all_bundles(r, &global_list);
> +
> +cleanup:
> +	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
> +	clear_bundle_list(&global_list);
> +	return result;
> +}
> +
>  /**
>   * API for serve.c.
>   */
> diff --git a/bundle-uri.h b/bundle-uri.h
> index 7905e56732c..a75b68d2f5a 100644
> --- a/bundle-uri.h
> +++ b/bundle-uri.h
> @@ -102,6 +102,17 @@ int bundle_uri_parse_config_format(const char *uri,
>   */
>  int fetch_bundle_uri(struct repository *r, const char *uri);
>  
> +/**
> + * Given a bundle list that was already advertised (likely by the
> + * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
> + * bundles according to the bundle strategy of that list.
> + *
> + * Returns non-zero if no bundle information is found at the given 'uri'.
> + */
> +int fetch_bundle_list(struct repository *r,
> +		      const char *uri,
> +		      struct bundle_list *list);
> +
>  /**
>   * API for serve.c.
>   */



* Re: [PATCH v2 9/9] clone: unbundle the advertised bundles
  2022-11-16 19:51   ` [PATCH v2 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
@ 2022-11-29  1:59     ` Victoria Dye
  2022-12-02 16:16       ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-11-29  1:59 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> A previous change introduced the transport methods to acquire a bundle
> list from the 'bundle-uri' protocol v2 command, when advertised _and_
> when the client has chosen to enable the feature.
> 
> Teach Git to download and unbundle the data advertised by those bundles
> during 'git clone'.
> 
> Also, since the --bundle-uri option exists, we do not want to mix the
> advertised bundles with the user-specified bundles.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/clone.c  | 26 +++++++++++++++++----
>  t/t5601-clone.sh | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 80 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 22b1e506452..09f10477ed6 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1267,11 +1267,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (refs)
>  		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
>  
> -	/*
> -	 * Populate transport->got_remote_bundle_uri and
> -	 * transport->bundle_uri. We might get nothing.
> -	 */
> -	transport_get_remote_bundle_uri(transport, 1);
> +	if (!bundle_uri) {
> +		/*
> +		* Populate transport->got_remote_bundle_uri and
> +		* transport->bundle_uri. We might get nothing.
> +		*/
> +		transport_get_remote_bundle_uri(transport, 1);
> +
> +		if (transport->bundles &&
> +		    hashmap_get_size(&transport->bundles->bundles)) {
> +			/* At this point, we need the_repository to match the cloned repo. */
> +			if (repo_init(the_repository, git_dir, work_tree))
> +				warning(_("failed to initialize the repo, skipping bundle URI"));
> +			if (fetch_bundle_list(the_repository,
> +					      remote->url[0],
> +					      transport->bundles))

If the repo initialization fails, this line is still executed. Should the
condition be 'else if' to avoid that?

Otherwise, all of the added logic looks good to me.
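
A minimal sketch of the 'else if' shape, using hypothetical stand-ins
('init_ok()', 'init_fail()', 'fetch_ok()') instead of the real
'repo_init()' and 'fetch_bundle_list()', and returning a status only so
the control flow is easy to observe:

```c
#include <stdio.h>

typedef int (*step_fn)(void);

static int fetch_calls;	/* counts how often the fetch step ran */

/* Hypothetical stubs; each returns 0 on success, like the real calls. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return 1; }
static int fetch_ok(void)  { fetch_calls++; return 0; }

/* Only attempt the fetch when repository initialization succeeded. */
static int try_advertised_bundles(step_fn init_repo, step_fn fetch_list)
{
	if (init_repo()) {
		fprintf(stderr, "warning: failed to initialize the repo, skipping bundle URI\n");
		return -1;
	} else if (fetch_list()) {
		fprintf(stderr, "warning: failed to fetch advertised bundles\n");
		return -1;
	}
	return 0;
}
```

With this shape, a failed initialization prints only the first warning and
never reaches the fetch step.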

> +				warning(_("failed to fetch advertised bundles"));
> +		} else {
> +			clear_bundle_list(transport->bundles);
> +			FREE_AND_NULL(transport->bundles);
> +		}
> +	}
>  
>  	if (mapped_refs) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index 45f0803ed4d..d1d8139751e 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh

Per the commit message:

> Also, since the --bundle-uri option exists, we do not want to mix the
> advertised bundles with the user-specified bundles.

Could you add a test verifying that '--bundle-uri' causes 'clone' to skip
bundle URI auto-discovery? It's clear from the implementation above that
'clone' is currently doing that as-expected, but it might be nice to have
the test for regression testing purposes.

> @@ -795,6 +795,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
>  	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
>  '
>  
> +test_expect_success 'auto-discover bundle URI from HTTP clone' '
> +	test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
> +	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
> +	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
> +
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
> +		uploadpack.advertiseBundleURIs true &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
> +		bundle.version 1 &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
> +		bundle.mode all &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
> +		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
> +
> +	GIT_TEST_BUNDLE_URI=1 \
> +	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> +		git -c protocol.version=2 clone \
> +		$HTTPD_URL/smart/repo2.git repo2 &&
> +	cat >pattern <<-EOF &&
> +	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
> +	EOF
> +	grep -f pattern trace.txt
> +'
> +
> +test_expect_success 'auto-discover multiple bundles from HTTP clone' '
> +	test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
> +
> +	test_commit -C src new &&
> +	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
> +	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
> +
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
> +		uploadpack.advertiseBundleURIs true &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
> +		bundle.version 1 &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
> +		bundle.mode all &&
> +
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
> +		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
> +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
> +		bundle.new.uri "$HTTPD_URL/new.bundle" &&
> +
> +	GIT_TEST_BUNDLE_URI=1 \
> +	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> +		git -c protocol.version=2 clone \
> +		$HTTPD_URL/smart/repo3.git repo3 &&
> +
> +	# We should fetch _both_ bundles
> +	cat >pattern <<-EOF &&
> +	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
> +	EOF
> +	grep -f pattern trace.txt &&
> +	cat >pattern <<-EOF &&
> +	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
> +	EOF
> +	grep -f pattern trace.txt
> +'
> +
>  # DO NOT add non-httpd-specific tests here, because the last part of this
>  # test script is only executed when httpd is available and enabled.
>  



* Re: [PATCH v2 2/9] bundle-uri client: add minimal NOOP client
  2022-11-29  0:57     ` Victoria Dye
@ 2022-12-02 15:00       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 15:00 UTC (permalink / raw)
  To: Victoria Dye,
	Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 7:57 PM, Victoria Dye wrote:
> Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
>> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>>  <avarab@gmail.com>

>> There's a question of what level of encapsulation we should use here,
>> I've opted to use connect.h in clone.c, but we could also e.g. make
>> transport_get_remote_refs() invoke this, i.e. make it implicitly get
>> the bundle-uri list for later steps.
>
> I'm not sure I follow what this sentence is saying. Looking at the
> implementation below [1], you've added a call to
> 'transport_get_remote_bundle_uri()' in 'clone.c', but that's defined in
> 'transport.h' (which is already included in 'clone.c'). Why is 'connect.h'
> needed at all?

It's not. Good catch!

>> This approach means that we don't "support" this in "git fetch" for
>> now. I'm starting with the case of initial clones, although as noted
>> in preceding commits to the protocol documentation nothing about this
>> approach precludes getting bundles on incremental fetches.
>
> This explanation seems more complicated than necessary. I think it's
> sufficient to say "The no-op client is initially used only in 'clone' to
> test the basic functionality. The bundle URI client will be integrated into
> fetch, pull, etc. in later patches".

Definitely can be reworded. Specifically, fetches need more functionality
(coming in part V) before there is value in that integration.

>> For the t5732-protocol-v2-bundle-uri-http.sh it's not easy to set
>> environment variables for git-upload-pack (it's started by Apache), so
>> let's skip the test under T5730_HTTP, and add unused T5730_{FILE,GIT}
>> prerequisites for consistency and future use.
>
> "skip the test" doesn't explain *which* test is skipped (and it doesn't look
> like you skip all of them). I think you're referring to "bad client with
> $T5730_PROTOCOL:// using protocol v2" specifically?

Will make it clearer which specific test is skipped.

>> diff --git a/bundle-uri.c b/bundle-uri.c
>> index 32022595964..2201b604b11 100644
>> --- a/bundle-uri.c
>> +++ b/bundle-uri.c
>> @@ -571,6 +571,10 @@ int bundle_uri_advertise(struct repository *r, struct strbuf *value)
>>  {
>>  	static int advertise_bundle_uri = -1;
>>
>> +	if (value &&
>> +	    git_env_bool("GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE", 0))
>> +		strbuf_addstr(value, "test-unknown-capability-value");
>
> It looks like 'GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE' is being used
> to "mock" server responses to test certain behavior on the client side. I'm
> somewhat uncomfortable with how this mixes test-specific code paths with
> application code, and AFAICT nothing similar is done for other
> advertise/command functions in protocol v2. Is there a way to set up tests
> to intercept the client requests and customize the response?

I'm going to investigate how we can test similar functionality within the
test-tool code instead, if possible.

>> +int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
>> +			  struct bundle_list *bundles, int stateless_rpc)
>> +{
>> +	int line_nr = 1;
>> +
>> +	/* Assert bundle-uri support */
>> +	server_supports_v2("bundle-uri", 1);
>> +
>> +	/* (Re-)send capabilities */
>> +	send_capabilities(fd_out, reader);
>> +
>> +	/* Send command */
>> +	packet_write_fmt(fd_out, "command=bundle-uri\n");
>> +	packet_delim(fd_out);
>> +
>> +	/* Send options */
>> +	if (git_env_bool("GIT_TEST_PROTOCOL_BAD_BUNDLE_URI", 0))
>> +		packet_write_fmt(fd_out, "test-bad-client\n");
>
> Same comment as on 'GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE'. There's
> no precedent that I can find for a test variable like this in 'connect.c', and
> "in the middle of client code" doesn't seem like an ideal place for it.
>
> If there really isn't another way of doing this, could the addition of these
> 'GIT_TEST' variables and their associated tests be split out into a
> dedicated commit? That would at least separate the test code paths from the
> application code in the commit history.

I'll definitely split them out, making this change much more about the
test boilerplate. In addition, most of the test boilerplate actually works
without the 'git clone' update, so this can be split into three commits:

1. Create the test infrastructure to check that the server advertises
   the 'bundle-uri' command appropriately.

2. Implement the basic client that issues and parses the 'bundle-uri'
   command. Add the request to 'git clone' and add a test that verifies
   that the client makes the request.

3. Add the extra error condition tests.

>> +++ b/t/lib-t5730-protocol-v2-bundle-uri.sh
>
> nit: this set of tests is used in more than just 't5730', so the name is
> somewhat misleading. It also seems a bit overly-specific to include
> "protocol-v2" in the filename (the tests themselves mention "protocol v2"
> when it's relevant). What about something like 'lib-proto-bundle-uri.sh'
> (using "proto" to mimic 'lib-proto-disable.sh')?

I agree. I think 'lib-bundle-uri-protocol.sh' is a clearer name.

>> +# Poor man's URI escaping. Good enough for the test suite whose trash
>> +# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
>> +# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
>> +test_uri_escape() {
>> +	sed 's/ /%20/g'
>> +}
>
> This is a good opportunity to unify on a single implementation rather than
> to have multiple similar ones floating around. Can this be moved into
> 'test-lib.sh' (or 'test-lib-functions.sh'?), with 't9119' and 't9120'
> updated to use the new 'test_uri_escape()'?

Will move, although I was not able to find the use in t9120.

>> +test_expect_success "unknown capability value with $T5730_PROTOCOL:// using protocol v2" '
>
> Why is this test only run for the 'file://' transport protocol? And, why
> isn't it in 'lib-t5730-protocol-v2-bundle-uri.sh'? If nothing else, moving
> this test to that file (even if it needs to be conditional on a specific
> protocol) puts the '$T5730_PARENT', '$T5730_BUNDLE_URI_ESCAPED' and
> '$T5730_URI' variables in scope for better readability.

I think this is one of the tests that doesn't work in HTTP, but could be
skipped using a prereq if it is placed in the common test script.

I will rethink this test coverage to see if there is a different way to
check similar behavior without as much insertion into the client/server
code.

>> +int transport_get_remote_bundle_uri(struct transport *transport)
>> +{
>> +	const struct transport_vtable *vtable = transport->vtable;
>> +
>> +	/* Check config only once. */
>> +	if (transport->got_remote_bundle_uri++)
>> +		return 0;
>
> We're only ever going to read the bundle list once per command (or, at least
> once per 'transport' instance), so if 'transport_get_remote_bundle_uri()'
> has already been called, we can safely assume the correct results (if any)
> are in the 'transport' structure.

Yes, although it suffers from a mistake of this form I've seen before:
got_remote_bundle_uri is a single bit, so this only works every other time.
I will fix this.
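
A minimal sketch of the wraparound, using a hypothetical 'demo_transport'
struct in place of the real 'struct transport' (the helper names are also
made up for illustration):

```c
/* Hypothetical stand-in for the relevant part of 'struct transport'. */
struct demo_transport {
	unsigned int got_remote_bundle_uri : 1;
};

/* Buggy form: post-increment on a 1-bit field wraps 0 -> 1 -> 0. */
static int already_fetched_buggy(struct demo_transport *t)
{
	return t->got_remote_bundle_uri++;
}

/* Fixed form: test the flag, then set it explicitly. */
static int already_fetched_fixed(struct demo_transport *t)
{
	if (t->got_remote_bundle_uri)
		return 1;
	t->got_remote_bundle_uri = 1;
	return 0;
}
```

Calling the buggy helper three times yields 0, 1, 0, so the third call would
fetch again; the fixed helper yields 0, 1, 1.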

Thanks for the detailed review!
-Stolee


* Re: [PATCH v2 3/9] bundle-uri client: add helper for testing server
  2022-11-29  0:59     ` Victoria Dye
@ 2022-12-02 15:28       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 15:28 UTC (permalink / raw)
  To: Victoria Dye,
	Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 7:59 PM, Victoria Dye wrote:
> Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
>> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>>  <avarab@gmail.com>
>>
>> Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
>> for issuing protocol v2 "bundle-uri" commands to a server, and to the
>> parsing routines in bundle-uri.c.
>>
>> Since in the "git clone" case we'll have already done the handshake(),
>> but not here, introduce a "got_advertisement" state along with
>> "got_remote_heads". It seems to me that the "got_remote_heads" is
>> badly named in the first place, and the whole logic of eagerly getting
>> ls-refs on handshake() or not could be refactored somewhat, but let's
>> not do that now, and instead just add another self-documenting state
>> variable.
>
> Maybe I'm missing something, but why not just rename 'got_remote_heads' to
> something like 'finished_handshake' rather than adding 'got_advertisement'
> (since, AFAICT, it's always identical in value to 'got_remote_heads').

I think that is a reasonable recommendation.

>> --- a/t/helper/test-bundle-uri.c
>> +++ b/t/helper/test-bundle-uri.c
>> @@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
>>  		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
>>  	if (!strcmp(argv[1], "parse-config"))
>>  		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
>> +	if (!strcmp(argv[1], "ls-remote"))
>> +		return cmd_ls_remote(argc - 1, argv + 1);
>
> With this helper being added, I'm not sure if/why 'clone' was needed to test
> the bundle URIs in patch 2 (I assumed integrating with a command was the
> only way to test it, which is why I didn't mention this in my review [1]).
> In the spirit of having commits avoid "doing more than one thing" could
> these patches be reorganized into something like:
>
> 1. Add the no-op client & some basic tests around fetching the bundle URI
>    list using this test helper.
> 2. Add the 'transport_get_remote_bundle_uri()' call to 'clone()' with
>    clone-specific tests.
>
> It probably wouldn't make the patches much shorter, but it would help avoid
> the churn of test changes & changing assumptions around 'quiet' &
> 'got_advertisement' in this patch.

I will think more on this as I get further into your review and figure out
a way to do the error case tests. At minimum, I've split out some things
so they might be easier to rearrange, but the 'git clone' integration is
(currently) still paired with the implementation in transport.c.

>>  test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
>>  	git init "$T5730_PARENT" &&
>>  	test_commit -C "$T5730_PARENT" one &&
>> -	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
>> +	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true &&
>> +	git -C "$T5730_PARENT" config bundle.version 1 &&
>> +	git -C "$T5730_PARENT" config bundle.mode all
>
> Why are these config settings added here? I don't see them used anywhere?

This can be delayed until the next change that actually reads that config.

>> diff --git a/transport.c b/transport.c
>> index a020adc1f56..86460f5be28 100644
>> --- a/transport.c
>> +++ b/transport.c
>> @@ -371,6 +373,33 @@ static int get_bundle_uri(struct transport *transport)
>>  		init_bundle_list(transport->bundles);
>>  	}
>>
>> +	if (!data->got_advertisement) {
>> +		struct ref *refs;
>> +		struct git_transport_data *data = transport->data;
>> +		enum protocol_version version;
>> +
>> +		refs = handshake(transport, 0, NULL, 0);
>> +		version = data->version;
>> +
>> +		switch (version) {
>> +		case protocol_v2:
>> +			assert(!refs);
>> +			break;
>> +		case protocol_v0:
>> +		case protocol_v1:
>> +		case protocol_unknown_version:
>> +			assert(refs);
>> +			break;
>
> Why were these 'refs' assertions added? What are they intended to validate?

You're right. This is essentially inserting test code into the product
(although the assert()s would be compiled out, I assume). The only difference
here is that after the handshake, protocol v2 has not executed the 'ls-refs'
command, while the other protocol versions start with a ref advertisement
in the initial response.

Thanks,
-Stolee


* Re: [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting
  2022-11-29  1:03     ` Victoria Dye
@ 2022-12-02 15:38       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 15:38 UTC (permalink / raw)
  To: Victoria Dye,
	Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 8:03 PM, Victoria Dye wrote:
> Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
>> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>>  <avarab@gmail.com>
>>
>> The yet-to-be introduced client support for bundle-uri will always
>> fall back on a full clone, but we'd still like to be able to ignore a
>> server's bundle-uri advertisement entirely.
>>
>> The new transfer.bundleURI config option defaults to 'false', but a user
>> can set it to 'true' to enable checking for bundle URIs from the origin
>> Git server using protocol v2.
> 
> Thanks for adding this, an "opt-in" approach seems reasonable for
> introducing this feature.
> 
>>
>> To enable this setting by default in the correct tests, add a
>> GIT_TEST_BUNDLE_URI environment variable.
> 
> This makes sense. I'm less concerned with this environment variable than
> those in patch 2 [1], since it's in line with the test variables that
> enable/disable whole features ('GIT_TEST_SPLIT_INDEX',
> 'GIT_TEST_COMMIT_GRAPH', etc.). 
> 
> The only feedback I can think of would be that this patch could be
> moved to earlier in the series (that is, immediately after creating
> 'transport_get_remote_bundle_uri()'). That said, I don't feel strongly
> either way.

It was simple enough to reorder them, so I've done that.

Thanks,
-Stolee


* Re: [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path()
  2022-11-29  1:03     ` Victoria Dye
@ 2022-12-02 15:40       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 15:40 UTC (permalink / raw)
  To: Victoria Dye, Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 8:03 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The strbuf_parent_directory() method was added as a static method in
>> contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
>> config and starts maintenance, 2021-12-03) and then removed in
>> 65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
>> there is a need for a similar method in the bundle URI feature.
>>
>> Re-add the method, this time in strbuf.c, but with a new name:
>> strbuf_strip_file_from_path(). The method requirements are slightly
>> modified to allow a trailing slash, in which case nothing is done, which
>> makes the name change valuable. The return value is the number of bytes
>> removed.
> 
> *Extremely* minor point, but why return anything at all? The call in the
> next patch doesn't use the return value, and some similar-in-spirit 'strbuf'
> functions (like 'strbuf_trim()') return nothing. 
> 
> I don't think this is worth changing if you can imagine using that return
> value for something eventually; just wanted to point it out as something to
> (optionally) consider if you re-roll for something else anyway.

While I'm here, it's not too hard to remove that and save some lines.
We can always bring that back if someone needs it in the future.

Thanks,
-Stolee


* Re: [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists
  2022-11-29  1:25     ` Victoria Dye
@ 2022-12-02 16:03       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 16:03 UTC (permalink / raw)
  To: Victoria Dye, Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 8:25 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:

>> +	if (!list->baseURI) {
>> +		struct strbuf baseURI = STRBUF_INIT;
>> +		strbuf_addstr(&baseURI, uri);
>> +
>> +		/*
>> +		 * If the URI does not end with a trailing slash, then
>> +		 * remove the filename portion of the path. This is
>> +		 * important for relative URIs.
>> +		 */
>> +		strbuf_strip_file_from_path(&baseURI);
>> +		list->baseURI = strbuf_detach(&baseURI, NULL);
>
> Is the 'baseURI' set to the URI of the first bundle (ordered by hash)? If
> data is distributed across multiple CDNs, couldn't this be a suboptimal
> choice? For example, if the first bundle is on 'A.com', but every other
> bundle is on 'B.org'?

The baseURI is set to one of two things:

1. The URI used for the clone, specifying the way the client connected to
   the Git server, or

2. The URI used to download the bundle list itself.

This allows the same bundle list file to be distributed to multiple CDNs,
assuming that the bundles themselves will have the same relative position
to the list.
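
To make the "strip the filename" step concrete, here is a simplified
stand-in for strbuf_strip_file_from_path() that works on a plain
NUL-terminated buffer instead of a 'struct strbuf'. The name
'strip_file_from_path()' and the example URLs are hypothetical, and
emptying the buffer when there is no '/' at all is an assumption of this
sketch:

```c
#include <string.h>

/*
 * Keep everything up to and including the last '/', so a path that
 * already ends in '/' is left unchanged. With no '/' at all, the
 * buffer becomes empty.
 */
static void strip_file_from_path(char *path)
{
	char *sep = strrchr(path, '/');

	if (sep)
		sep[1] = '\0';
	else
		path[0] = '\0';
}
```

For a list downloaded from "https://cdn.example/lists/bundles.list", the
base becomes "https://cdn.example/lists/", so the same list file can be
copied to another host and its relative bundle URIs still resolve next to
it.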

>> +	/**
>> +	 * The baseURI of a bundle_list is used as the base for any
>> +	 * relative URIs advertised by the bundle list at that location.
>> +	 *
>> +	 * When the list is generated from a Git server, then use that
>> +	 * server's location.
>
> Hmmm, I think I'm missing something with my earlier comment. I thought the
> 'uri' argument to 'bundle_uri_parse_config_format()' was an individual
> bundle's URI? What's the "server's location" in this context?

I can work to make this concept clearer by rewording this comment.

>> @@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
>>
>>  	init_bundle_list(&list);
>>
>> +	list.baseURI = xstrdup("<uri>");
>
> Using a hardcoded value here leads to pretty different behavior in
> 'test-bundle-uri.c' vs. starting with an unset 'list.baseURI' in something
> like 'clone'. Why does this need to be set to '<uri>' for the tests?

In this part of the test helper, we are not making a connection to a server
and instead parsing a bundle list file directly. To demonstrate how the
relative paths work during this parsing, we add a bogus baseURI here so
we can clearly see where the relative paths were parsed versus using the
URI as an absolute URI.


>> +test_expect_success 'bundle_uri_parse_line(): relative URIs' '
>> +	cat >in <<-\EOF &&
>> +	bundle.one.uri=bundle.bdl
>> +	bundle.two.uri=../bundle.bdl
>> +	bundle.three.uri=sub/dir/bundle.bdl
>> +	EOF
>> +
>> +	cat >expect <<-\EOF &&
>> +	[bundle]
>> +		version = 1
>> +		mode = all
>> +	[bundle "one"]
>> +		uri = <uri>/bundle.bdl
>> +	[bundle "two"]
>> +		uri = bundle.bdl
>
> This seems a little strange, but it looks like '<uri>/../bundle.bdl'
> normalizes to 'bundle.bdl' because '<uri>' is treated like a regular path
> element (like a directory).
>
> Out of curiosity, what would happen if 'bundle.two.uri' was
> '../../bundle.bdl'?

It will fail! The error message is

	"fatal: cannot strip one component off url '.'"

It is disappointing that an erroneous bundle list could cause a 'git
clone' command to die(), when we want the bundle URI feature to allow the
clone to continue normally even if the bundle downloads fail. I will mark
this for #leftoverbits, since it would involve changing the interface for
chop_last_dir() and relative_url() in remote.c.

At minimum, I will document this with a test case.
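
A rough, standalone sketch of what a non-fatal variant could look like
follows. resolve_relative() is a hypothetical name and is not how
relative_url() is implemented in remote.c. It treats the base as a plain
path (no URL scheme awareness) and returns -1 in the case where the current
code calls die(), so a caller could warn and continue the clone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's relative_url(): resolve 'rel' against
 * 'base' into 'out'. Returns 0 on success, or -1 if 'rel' has more "../"
 * prefixes than 'base' has components, or if a buffer is too small.
 */
static int resolve_relative(const char *base, const char *rel,
			    char *out, size_t outlen)
{
	char dir[256];
	size_t len = strlen(base);
	int n;

	assert(out && outlen > 0);
	if (len >= sizeof(dir))
		return -1;
	memcpy(dir, base, len + 1);

	/* Ignore trailing slashes on the base. */
	while (len > 0 && dir[len - 1] == '/')
		dir[--len] = '\0';

	/* Each leading "../" removes one trailing component from the base. */
	while (!strncmp(rel, "../", 3)) {
		char *slash;

		if (!len)
			return -1; /* nothing left to strip: report, do not die */
		slash = strrchr(dir, '/');
		len = slash ? (size_t)(slash - dir) : 0;
		dir[len] = '\0';
		rel += 3;
	}

	/* Join with a slash unless the base was stripped down to nothing. */
	n = snprintf(out, outlen, "%s%s%s", dir, len ? "/" : "", rel);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a single-component base, one "../" resolves to the bare file name and
a second "../" yields -1, mirroring the success and failure cases discussed
above.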

Thanks,
-Stolee


* Re: [PATCH v2 9/9] clone: unbundle the advertised bundles
  2022-11-29  1:59     ` Victoria Dye
@ 2022-12-02 16:16       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-02 16:16 UTC (permalink / raw)
  To: Victoria Dye, Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 11/28/2022 8:59 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> A previous change introduced the transport methods to acquire a bundle
>> list from the 'bundle-uri' protocol v2 command, when advertised _and_
>> when the client has chosen to enable the feature.
>>
>> Teach Git to download and unbundle the data advertised by those bundles
>> during 'git clone'.
>>
>> Also, since the --bundle-uri option exists, we do not want to mix the
>> advertised bundles with the user-specified bundles.
>>
>> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>> ---
>>  builtin/clone.c  | 26 +++++++++++++++++----
>>  t/t5601-clone.sh | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 80 insertions(+), 5 deletions(-)
>>
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index 22b1e506452..09f10477ed6 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -1267,11 +1267,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>  	if (refs)
>>  		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
>>
>> -	/*
>> -	 * Populate transport->got_remote_bundle_uri and
>> -	 * transport->bundle_uri. We might get nothing.
>> -	 */
>> -	transport_get_remote_bundle_uri(transport, 1);
>> +	if (!bundle_uri) {
>> +		/*
>> +		* Populate transport->got_remote_bundle_uri and
>> +		* transport->bundle_uri. We might get nothing.
>> +		*/
>> +		transport_get_remote_bundle_uri(transport, 1);
>> +
>> +		if (transport->bundles &&
>> +		    hashmap_get_size(&transport->bundles->bundles)) {
>> +			/* At this point, we need the_repository to match the cloned repo. */
>> +			if (repo_init(the_repository, git_dir, work_tree))
>> +				warning(_("failed to initialize the repo, skipping bundle URI"));
>> +			if (fetch_bundle_list(the_repository,
>> +					      remote->url[0],
>> +					      transport->bundles))
>
> If the repo initialization fails, this line is still executed. Should the
> condition be 'else if' to avoid that?
>
> Otherwise, all of the added logic looks good to me.

Yes, it should. An earlier version of this patch followed the correct
if/else if pattern.

>> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
>> index 45f0803ed4d..d1d8139751e 100755
>> --- a/t/t5601-clone.sh
>> +++ b/t/t5601-clone.sh
>
> Per the commit message:
>
>> Also, since the --bundle-uri option exists, we do not want to mix the
>> advertised bundles with the user-specified bundles.
>
> Could you add a test verifying that '--bundle-uri' causes 'clone' to skip
> bundle URI auto-discovery? It's clear from the implementation above that
> 'clone' is currently doing that as-expected, but it might be nice to have
> the test for regression testing purposes.

I can add that to the lib-bundle-uri-protocol.sh tests pretty easily.

Thanks,
-Stolee


* Re: [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path()
  2022-11-16 19:51   ` [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
  2022-11-29  1:03     ` Victoria Dye
@ 2022-12-02 18:32     ` Ævar Arnfjörð Bjarmason
  2022-12-05 15:11       ` Derrick Stolee
  1 sibling, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-02 18:32 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee


On Wed, Nov 16 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The strbuf_parent_directory() method was added as a static method in
> contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
> config and starts maintenance, 2021-12-03) and then removed in
> 65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
> there is a need for a similar method in the bundle URI feature.
>
> Re-add the method, this time in strbuf.c, but with a new name:
> strbuf_strip_file_from_path(). The method requirements are slightly
> modified to allow a trailing slash, in which case nothing is done, which
> makes the name change valuable. The return value is the number of bytes
> removed.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  strbuf.c |  9 +++++++++
>  strbuf.h | 12 ++++++++++++
>  2 files changed, 21 insertions(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index 0890b1405c5..8d1e2e8bb61 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1200,3 +1200,12 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>  	free(path2);
>  	return res;
>  }
> +
> +size_t strbuf_strip_file_from_path(struct strbuf *buf)

Nit: Almost every function in this API calls its argument "sb", let's do
that for new functions.

> +{
> +	size_t len = buf->len;
> +	size_t offset = offset_1st_component(buf->buf);

Mm, isn't the return value of offset_1st_component() a boolean? it's
just an "is_dir_sep(buf->buf[0])".

So this works to....

> +	char *path_sep = find_last_dir_sep(buf->buf + offset);

...find the last dir separator starting at either 0 or 1.

But anyway, what sort of string is this expecting to handle where the
last dir separator isn't >=1 offset into the string anyway? Shouldn't we
just exclude the string "/" here? Maybe I'm missing something....


> +	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
> +	return len - buf->len;
> +}

Urm, so isn't this literally one-byte away from being equivalent to a
function that's already in the API?:
strbuf_trim_trailing_dir_sep. I.e. this seems to me to do the same as
this new function.

Context manually adjusted so we can see the only difference is the
"is_dir_sep" v.s. "!is_dir_sep".

There's a few strbuf functions like that, and we should probably
generalize the ctype-like test they share into some callback mechanism,
but in the meantime keeping with the pattern & naming of existing
functions seems better.

But again, I may be missing something.

I removed the comment because if it's the same then the new function is
self-documenting. It doesn't matter if the URI ends in a "/" or not, all
we need to get across is that we're stripping non-dirsep characters from
the URL, whether it ends in one or not.

In terms of correctness: The use of is_dir_sep() seems incorrect to me
here. On Windows won't that end up using is_xplatform_dir_sep(), so
bundle-uri's behavior will differ there, and we'll support \\-paths as
well as /-paths, but elsewhere only /-paths.

Shouldn't this just test "/", not "is_dir_sep()"?

At which point (if the above is correct) we could also call this
strbuf_rtrim_notchr(), and just call strbuf_rtrim_notchr(sb, '/') (but
even better would be a ctype-like callback).
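
As a sketch of the ctype-like callback idea: all names below are
hypothetical, nothing like this exists in strbuf.h today, and it operates on
a plain NUL-terminated string rather than a strbuf. It only tests '/', which
is the behavior argued for above, not what is_dir_sep() does on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type, in the style of the <ctype.h> tests. */
typedef int (*byte_test_fn)(int c);

static int is_slash(int c) { return c == '/'; }
static int is_not_slash(int c) { return c != '/'; }

/*
 * Trim trailing bytes for which test() is true, in place on a
 * NUL-terminated string. Returns the new length.
 */
static size_t rtrim_while(char *s, byte_test_fn test)
{
	size_t len;

	assert(s && test);
	len = strlen(s);
	while (len > 0 && test((unsigned char)s[len - 1]))
		len--;
	s[len] = '\0';
	return len;
}
```

Passing is_not_slash strips a trailing file name and keeps the final slash,
while passing is_slash gives the existing "trim trailing separators"
behavior.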

diff --git a/bundle-uri.c b/bundle-uri.c
index 5914d220c43..c3ed04eae0f 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -192,20 +192,15 @@ int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
 	if (!list->baseURI) {
 		struct strbuf baseURI = STRBUF_INIT;
 		strbuf_addstr(&baseURI, uri);
 
-		/*
-		 * If the URI does not end with a trailing slash, then
-		 * remove the filename portion of the path. This is
-		 * important for relative URIs.
-		 */
-		strbuf_strip_file_from_path(&baseURI);
+		strbuf_trim_trailing_not_dir_sep(&baseURI);
 		list->baseURI = strbuf_detach(&baseURI, NULL);
 	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
 
 	if (!result && list->mode == BUNDLE_MODE_NONE) {
diff --git a/strbuf.c b/strbuf.c
index 8d1e2e8bb61..3466552b854 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -117,14 +117,21 @@ void strbuf_rtrim(struct strbuf *sb)
 void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
 {
 	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]))
 		sb->len--;
 	sb->buf[sb->len] = '\0';
 }
 
+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb)
+{
+	while (sb->len > 0 && !is_dir_sep((unsigned char)sb->buf[sb->len - 1]))
+		sb->len--;
+	sb->buf[sb->len] = '\0';
+}
+
 void strbuf_trim_trailing_newline(struct strbuf *sb)
 {
 	if (sb->len > 0 && sb->buf[sb->len - 1] == '\n') {
 		if (--sb->len > 0 && sb->buf[sb->len - 1] == '\r')
 			--sb->len;
 		sb->buf[sb->len] = '\0';
 	}
@@ -1196,16 +1203,7 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			res = error_errno(_("could not edit '%s'"), path);
 		unlink(path);
 	}
 
 	free(path2);
 	return res;
 }
-
-size_t strbuf_strip_file_from_path(struct strbuf *buf)
-{
-	size_t len = buf->len;
-	size_t offset = offset_1st_component(buf->buf);
-	char *path_sep = find_last_dir_sep(buf->buf + offset);
-	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
-	return len - buf->len;
-}
diff --git a/strbuf.h b/strbuf.h
index 4822b713786..b5929ecc8dd 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -185,14 +185,16 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
  */
 void strbuf_trim(struct strbuf *sb);
 void strbuf_rtrim(struct strbuf *sb);
 void strbuf_ltrim(struct strbuf *sb);
 
 /* Strip trailing directory separators */
 void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
+/* Strip trailing non-directory separators */
+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb);
 
 /* Strip trailing LF or CR/LF */
 void strbuf_trim_trailing_newline(struct strbuf *sb);
 
 /**
  * Replace the contents of the strbuf with a reencoded form.  Returns -1
  * on error, 0 on success.


* Re: [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-02 18:32     ` Ævar Arnfjörð Bjarmason
@ 2022-12-05 15:11       ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-05 15:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/2/2022 1:32 PM, Ævar Arnfjörð Bjarmason wrote:

>> +size_t strbuf_strip_file_from_path(struct strbuf *buf)
> 
> Nit: Almost every function in this API calls its argument "sb", let's do
> that for new functions.

Sure.

>> +{
>> +	size_t len = buf->len;
>> +	size_t offset = offset_1st_component(buf->buf);
> 
> Mm, isn't the return value of offset_1st_component() a boolean? it's
> just an "is_dir_sep(buf->buf[0])".
> 
> So this works to....
> 
>> +	char *path_sep = find_last_dir_sep(buf->buf + offset);
> 
> ...find the last dir separator starting at either 0 or 1.
> 
> But anyway, what sort of string is this expecting to handle where the
> last dir separator isn't >=1 offset into the string anyway? Shouldn't we
> just exclude the string "/" here? Maybe I'm missing something....

The difference is all about whether or not we start with a slash _and_ no
other slash appears in the path.

 1. "/root-file": offset becomes 1 and path_sep becomes NULL, so the
    length is reduced to offset (1), resulting in "/"
 2. "local-file": offset becomes 0 and path_sep becomes NULL, so the
    length is reduced to offset (0), resulting in ""

Of course, the first case would be the same if we did not set offset to 1
and then path_sep points to that base slash, so the length calculation
results in "/" anyway. That will simplify things.

void strbuf_strip_file_from_path(struct strbuf *sb)
{
	char *path_sep = find_last_dir_sep(sb->buf);
	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
}
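
To sanity-check the two cases listed above outside of Git, here is a
self-contained sketch. 'struct mini_buf' and the mini_* helpers are made up
for illustration and are not the strbuf API. strrchr() on '/' stands in for
find_last_dir_sep(), so Windows backslash separators are not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf: NUL-terminated buffer plus length. */
struct mini_buf {
	char buf[256];
	size_t len;
};

static void mini_buf_set(struct mini_buf *sb, const char *s)
{
	assert(strlen(s) < sizeof(sb->buf));
	strcpy(sb->buf, s);
	sb->len = strlen(s);
}

/* Keep everything up to and including the last '/', or nothing if none. */
static void mini_strip_file_from_path(struct mini_buf *sb)
{
	char *path_sep = strrchr(sb->buf, '/');

	sb->len = path_sep ? (size_t)(path_sep - sb->buf) + 1 : 0;
	sb->buf[sb->len] = '\0';
}

/* Convenience wrapper for experimenting: returns the stripped string. */
static const char *strip_demo(struct mini_buf *sb, const char *path)
{
	mini_buf_set(sb, path);
	mini_strip_file_from_path(sb);
	return sb->buf;
}
```

Under these assumptions, "/root-file" becomes "/", "local-file" becomes "",
and a path that already ends in a slash is left unchanged.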

>> +	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
>> +	return len - buf->len;
>> +}
> 
> Urm, so isn't this literally one-byte away from being equivalent to a
> function that's already in the API?:
> strbuf_trim_trailing_dir_sep. I.e. this seems to me to do the same as
> this new function.
> 
> Context manually adjusted so we can see the only difference is the
> "is_dir_sep" v.s. "!is_dir_sep".
> 
> There's a few strbuf functions like that, and we should probably
> generalize the ctype-like test they share into some callback mechanism,
> but in the meantime keeping with the pattern & naming of existing
> functions seems better.

Using callbacks or extra options seems like overkill here.

> I removed the comment because if it's the same then the new function is
> self-documenting. It doesn't matter if the URI ends in a "/" or not, all
> we need to get across is that we're stripping non-dirsep characters from
> the URL, whether it ends in one or not.
> 
> In terms of correctness: The use of is_dir_sep() seems incorrect to me
> here. On Windows won't that end up using is_xplatform_dir_sep(), so
> bundle-uri's behavior will differ there, and we'll support \\-paths as
> well as /-paths, but elsewhere only /-paths.
> 
> Shouldn't this just test "/", not "is_dir_sep()"?

This method is used for file paths, too, so is_dir_sep() is important to
work properly on Windows platforms.

Thanks,
-Stolee



* [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2
  2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                     ` (8 preceding siblings ...)
  2022-11-16 19:51   ` [PATCH v2 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50   ` Derrick Stolee via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
                       ` (12 more replies)
  9 siblings, 13 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

This is based on the recent master batch that included ds/bundle-uri-....

Now that git clone --bundle-uri can download a bundle list from a plaintext
file in config format, we can use the same set of key-value pairs to
advertise a bundle list over protocol v2. At the end of this series:

 1. A server can advertise bundles when uploadPack.advertiseBundleURIs is
    enabled. The bundle list comes from the server's local config,
    specifically the bundle.* namespace.
 2. A client can notice a server's bundle-uri advertisement and request the
    bundle list if transfer.bundleURI is enabled. The bundles are downloaded
    as if the list was advertised from the --bundle-uri option.

Many patches in this series were adapted from Ævar's v2 RFC [1]. He is
retained as author and I added myself as co-author only if the modifications
were significant.

[1]
https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

 * Patches 1-7 are mostly taken from [1], again with mostly minor updates.
   The one major difference is the packet line format being a single
   key=value format instead of a sequence of pairs. (In v3, these commits
   are significantly reorganized from [1].)

 * Patches 8-11 finish off the ability for the client to notice the
   capability, request the values, and download bundles before continuing
   with the rest of the download.

One thing that is not handled here but could be handled in a future change
is to disconnect from the origin Git server while downloading the bundle
URIs, then reconnecting afterwards. This does not make any difference for
HTTPS, but SSH may benefit from the reduced connection time. The git clone
--bundle-uri option did not suffer from this because the bundles are
downloaded before the server connection begins.

After this series, there is one more series before the original scope of the plan
is complete: using creation tokens as a heuristic. See [2] for the RFC
version of those patches.

[2] https://github.com/derrickstolee/git/pull/22


Updates in v3
=============

Most of these updates are due to Victoria's very thorough review. Thanks!

 * What was patch 2 was split to be better understood.
 * The new patch 2 is only the new test script infrastructure for testing
   whether or not the server provides the bundle-uri capability. It is
   extended with other more complicated examples in later patches. The name
   was rewritten from lib-t5730-*.sh to lib-bundle-uri-protocol.sh and the
   variable names are renamed with the BUNDLE_URI_ prefix.
 * The new patch 3 performs the basic client interaction with the
   'bundle-uri' command, while still not being fully wired up on the server
   side. The tests do check that the client requests the bundle-uri command
   after seeing it in the server's capabilities. One important difference
   from earlier is that the check for server_supports_v2() was moved into
   the get_bundle_uri() method (underneath the vtable) because we need to
   check the handshake before calling that method. It makes most sense to
   put the handshake call there, so do it from the start.
 * Patch 4 carefully tests how the transfer.bundleURI config blocks the
   client-side request of the bundle-uri command. Later tests will use the
   GIT_TEST_BUNDLE_URI environment variable instead.
 * The new Patch 5 renames got_remote_heads to finished_handshake in 'struct
   git_transport_data' and that's it. That new value is then used in patch 6
   to indicate if we need to request the handshake in the bundle URI logic.
 * Patch 6 creates the ls-remote helper in 'test-tool bundle-uri' as before,
   but now only makes use of the finished_handshake member instead of
   creating a new one. The test helper represents an example consumer of
   transport_get_remote_bundle_uri() without first doing the server-side
   handshake, which motivates several of the placements of code within that
   method and get_bundle_uri() earlier in the series. The "quiet" option is
   also removed to simplify the test helper and to always communicate the
   inner errors to the user.
 * Patch 7 adds the server-side listing of bundle.* config values. The test
   scripts around these config values have been cleaned up since the
   previous version.
 * Patch 8 has another iteration of strbuf_strip_file_from_path() taking the
   feedback from Victoria and Ævar.
 * Patch 9 adds the relative path logic. The definition of the base path is
   clarified in the commit message and comments. An additional test shows
   what happens if the server advertises too many parent paths
   (unfortunately, a die(), and this is marked for cleanup later).
 * Patch 10 is identical to the old patch 8.
 * Patch 11 completes the work by having the client download the bundles
   provided by the server list. It fixes an if/if that should have been an
   if/else-if. A new test checks that the --bundle-uri=X option overrides
   the server advertisement.


Updates in v2
=============

 * Commit messages now refer to protocol v2 "commands" not "verbs".
 * Several edits were made to gitprotocol-v2.txt thanks to Victoria's
   thorough review.
 * strbuf_parent_directory() is renamed strbuf_strip_file_from_path() to
   make it more clear how it behaves when ending with a slash.

Thanks,

 * Stolee

Derrick Stolee (6):
  transport: rename got_remote_heads
  bundle-uri: serve bundle.* keys from config
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: allow relative URLs in bundle lists
  bundle-uri: download bundles from an advertised list
  clone: unbundle the advertised bundles

Ævar Arnfjörð Bjarmason (5):
  protocol v2: add server-side "bundle-uri" skeleton
  t: create test harness for 'bundle-uri' command
  clone: request the 'bundle-uri' command when available
  bundle-uri client: add boolean transfer.bundleURI setting
  bundle-uri client: add helper for testing server

 Documentation/config/transfer.txt      |   6 +
 Documentation/gitprotocol-v2.txt       | 201 +++++++++++++++++++++++
 builtin/clone.c                        |  22 +++
 bundle-uri.c                           |  87 +++++++++-
 bundle-uri.h                           |  32 ++++
 connect.c                              |  44 +++++
 remote.h                               |   5 +
 serve.c                                |   6 +
 strbuf.c                               |   6 +
 strbuf.h                               |  11 ++
 t/helper/test-bundle-uri.c             |  48 ++++++
 t/lib-bundle-uri-protocol.sh           | 212 +++++++++++++++++++++++++
 t/t5601-clone.sh                       |  59 +++++++
 t/t5701-git-serve.sh                   |  40 ++++-
 t/t5730-protocol-v2-bundle-uri-file.sh |  17 ++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 ++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 ++
 t/t5750-bundle-uri-parse.sh            |  82 ++++++++++
 t/t9119-git-svn-info.sh                |   2 +-
 t/test-lib-functions.sh                |   7 +
 transport-helper.c                     |  13 ++
 transport-internal.h                   |   7 +
 transport.c                            |  88 ++++++++--
 transport.h                            |  19 +++
 24 files changed, 1036 insertions(+), 12 deletions(-)
 create mode 100644 t/lib-bundle-uri-protocol.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1400%2Fderrickstolee%2Fbundle-redo%2Fadvertise-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1400/derrickstolee/bundle-redo/advertise-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1400

Range-diff vs v2:

  1:  beae335b855 =  1:  beae335b855 protocol v2: add server-side "bundle-uri" skeleton
  -:  ----------- >  2:  fcdfef2012a t: create test harness for 'bundle-uri' command
  2:  0d85aef965d !  3:  a0188ae39c6 bundle-uri client: add minimal NOOP client
     @@ Metadata
      Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
      
       ## Commit message ##
     -    bundle-uri client: add minimal NOOP client
     +    clone: request the 'bundle-uri' command when available
      
     -    Set up all the needed client parts of the "bundle-uri" protocol
     -    extension, without actually doing anything with the bundle URIs.
     +    Set up all the needed client parts of the 'bundle-uri' protocol v2
     +    command, without actually doing anything with the bundle URIs.
      
     -    I.e. if the server says it supports "bundle-uri" we'll issue a
     -    command=bundle-uri after command=ls-refs when we're cloning. We'll
     -    parse the returned output using the code already tested for in
     -    t5750-bundle-uri-parse.sh.
     +    If the server says it supports 'bundle-uri' teach Git to issue the
     +    'bundle-uri' command after the 'ls-refs' during 'git clone'. The
     +    returned key=value pairs are passed to the bundle list code which is
     +    tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.
      
     -    What we aren't doing is actually acting on that data, i.e. downloading
     -    the bundle(s) before we get to doing the command=fetch, and adjusting
     -    our negotiation dialog appropriately. I'll do that in subsequent
     -    commits.
     +    At this point, Git does nothing with that bundle list. It will not
     +    download any of the bundles. That will come in a later change after
     +    these protocol bits are finalized.
      
     -    There's a question of what level of encapsulation we should use here,
     -    I've opted to use connect.h in clone.c, but we could also e.g. make
     -    transport_get_remote_refs() invoke this, i.e. make it implicitly get
     -    the bundle-uri list for later steps.
     -
     -    This approach means that we don't "support" this in "git fetch" for
     -    now. I'm starting with the case of initial clones, although as noted
     -    in preceding commits to the protocol documentation nothing about this
     -    approach precludes getting bundles on incremental fetches.
     -
     -    For the t5732-protocol-v2-bundle-uri-http.sh it's not easy to set
     -    environment variables for git-upload-pack (it's started by Apache), so
     -    let's skip the test under T5730_HTTP, and add unused T5730_{FILE,GIT}
     -    prerequisites for consistency and future use.
     +    The no-op client is initially used only by 'git clone' to test the basic
     +    functionality, and eventually will bootstrap the initial download of Git
     +    objects during a fresh clone. The bundle URI client will not be
     +    integrated into other fetches until a mechanism is created to select a
     +    subset of bundles for download.
      
          Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
       ## builtin/clone.c ##
     -@@
     - #include "iterator.h"
     - #include "sigchain.h"
     - #include "branch.h"
     -+#include "connect.h"
     - #include "remote.h"
     - #include "run-command.h"
     - #include "connected.h"
      @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
       	if (refs)
       		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
       		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
       
      
     - ## bundle-uri.c ##
     -@@ bundle-uri.c: int bundle_uri_advertise(struct repository *r, struct strbuf *value)
     - {
     - 	static int advertise_bundle_uri = -1;
     - 
     -+	if (value &&
     -+	    git_env_bool("GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE", 0))
     -+		strbuf_addstr(value, "test-unknown-capability-value");
     -+
     - 	if (advertise_bundle_uri != -1)
     - 		goto cached;
     - 
     -
       ## connect.c ##
      @@
       #include "version.h"
     @@ connect.c: static void send_capabilities(int fd_out, struct packet_reader *reade
      +	packet_write_fmt(fd_out, "command=bundle-uri\n");
      +	packet_delim(fd_out);
      +
     -+	/* Send options */
     -+	if (git_env_bool("GIT_TEST_PROTOCOL_BAD_BUNDLE_URI", 0))
     -+		packet_write_fmt(fd_out, "test-bad-client\n");
      +	packet_flush(fd_out);
      +
      +	/* Process response from server */
     @@ remote.h: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
       
       /*
      
     - ## t/lib-t5730-protocol-v2-bundle-uri.sh (new) ##
     -@@
     -+# Included from t573*-protocol-v2-bundle-uri-*.sh
     -+
     -+T5730_PARENT=
     -+T5730_URI=
     -+T5730_BUNDLE_URI=
     -+case "$T5730_PROTOCOL" in
     -+file)
     -+	T5730_PARENT=file_parent
     -+	T5730_URI="file://$PWD/file_parent"
     -+	T5730_BUNDLE_URI="$T5730_URI/fake.bdl"
     -+	test_set_prereq T5730_FILE
     -+	;;
     -+git)
     -+	. "$TEST_DIRECTORY"/lib-git-daemon.sh
     -+	start_git_daemon --export-all --enable=receive-pack
     -+	T5730_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
     -+	T5730_URI="$GIT_DAEMON_URL/parent"
     -+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
     -+	test_set_prereq T5730_GIT
     -+	;;
     -+http)
     -+	. "$TEST_DIRECTORY"/lib-httpd.sh
     -+	start_httpd
     -+	T5730_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
     -+	T5730_URI="$HTTPD_URL/smart/http_parent"
     -+	T5730_BUNDLE_URI="https://example.com/fake.bdl"
     -+	test_set_prereq T5730_HTTP
     -+	;;
     -+*)
     -+	BUG "Need to pass valid T5730_PROTOCOL (was $T5730_PROTOCOL)"
     -+	;;
     -+esac
     -+
     -+test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
     -+	git init "$T5730_PARENT" &&
     -+	test_commit -C "$T5730_PARENT" one &&
     -+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
     -+'
     + ## t/lib-bundle-uri-protocol.sh ##
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
     + 	# Server advertised bundle-uri capability
     + 	grep "< bundle-uri" log
     + '
      +
     -+# Poor man's URI escaping. Good enough for the test suite whose trash
     -+# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
     -+# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
     -+test_uri_escape() {
     -+	sed 's/ /%20/g'
     -+}
     -+
     -+case "$T5730_PROTOCOL" in
     -+http)
     -+	test_expect_success "setup config for $T5730_PROTOCOL:// tests" '
     -+		git -C "$T5730_PARENT" config http.receivepack true
     -+	'
     -+	;;
     -+*)
     -+	;;
     -+esac
     -+T5730_BUNDLE_URI_ESCAPED=$(echo "$T5730_BUNDLE_URI" | test_uri_escape)
     -+
     -+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundle-uri" '
     -+	test_when_finished "rm -f log" &&
     -+	test_when_finished "git -C \"$T5730_PARENT\" config uploadpack.advertiseBundleURIs true" &&
     -+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
     ++test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
     ++	test_when_finished "rm -rf log cloned" &&
      +
      +	GIT_TRACE_PACKET="$PWD/log" \
      +	git \
      +		-c protocol.version=2 \
     -+		ls-remote --symref "$T5730_URI" \
     -+		>actual 2>err &&
     -+
     -+	# Server responded using protocol v2
     -+	grep "< version 2" log &&
     -+
     -+	! grep bundle-uri log
     -+'
     -+
     -+test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bundle-uri" '
     -+	test_when_finished "rm -f log" &&
     -+
     -+	test_config -C "$T5730_PARENT" \
     -+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
     -+
     -+	GIT_TRACE_PACKET="$PWD/log" \
     -+	git \
     -+		-c protocol.version=2 \
     -+		ls-remote --symref "$T5730_URI" \
     ++		clone "$BUNDLE_URI_REPO_URI" cloned \
      +		>actual 2>err &&
      +
      +	# Server responded using protocol v2
      +	grep "< version 2" log &&
      +
      +	# Server advertised bundle-uri capability
     -+	grep bundle-uri log
     -+'
     ++	grep "< bundle-uri" log &&
      +
     -+test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
     -+	test_when_finished "rm -f log" &&
     -+
     -+	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
     -+		"$T5730_BUNDLE_URI_ESCAPED" &&
     -+
     -+	cat >err.expect <<-\EOF &&
     -+	Cloning into '"'"'child'"'"'...
     -+	EOF
     -+	case "$T5730_PROTOCOL" in
     -+	file)
     -+		cat >fatal-bundle-uri.expect <<-\EOF
     -+		fatal: bundle-uri: unexpected argument: '"'"'test-bad-client'"'"'
     -+		EOF
     -+		;;
     -+	*)
     -+		cat >fatal.expect <<-\EOF
     -+		fatal: read error: Connection reset by peer
     -+		EOF
     -+		;;
     -+	esac &&
     -+
     -+	test_when_finished "rm -rf child" &&
     -+	test_must_fail ok=sigpipe env \
     -+		GIT_TRACE_PACKET="$PWD/log" \
     -+		GIT_TEST_PROTOCOL_BAD_BUNDLE_URI=true \
     -+		git -c protocol.version=2 \
     -+		clone "$T5730_URI" child \
     -+		>out 2>err &&
     -+	test_must_be_empty out &&
     -+
     -+	grep -v -e ^fatal: -e ^error: err >err.actual &&
     -+	test_cmp err.expect err.actual &&
     -+
     -+	case "$T5730_PROTOCOL" in
     -+	file)
     -+		# Due to general race conditions with client/server replies we
     -+		# may or may not get "fatal: the remote end hung up
     -+		# unexpectedly" here
     -+		grep "^fatal: bundle-uri:" err >fatal-bundle-uri.actual &&
     -+		test_cmp fatal-bundle-uri.expect fatal-bundle-uri.actual
     -+		;;
     -+	*)
     -+		grep "^fatal:" err >fatal.actual &&
     -+		# Due to the same race conditions this might be
     -+		# "fatal: read error: Connection reset by peer", "fatal: the remote end
     -+		# hung up unexpectedly" etc.
     -+		cat fatal.actual &&
     -+		test_file_not_empty fatal.actual
     -+		;;
     -+	esac &&
     -+
     -+	grep "clone> test-bad-client$" log >sent-bad-request &&
     -+	test_file_not_empty sent-bad-request
     ++	# Client issued bundle-uri command
     ++	grep "> command=bundle-uri" log
      +'
      
     - ## t/t5730-protocol-v2-bundle-uri-file.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+test_description="Test bundle-uri with protocol v2 and 'file://' transport"
     -+
     -+TEST_NO_CREATE_REPO=1
     -+
     -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
     -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     -+
     -+. ./test-lib.sh
     -+
     -+# Test protocol v2 with 'file://' transport
     -+#
     -+T5730_PROTOCOL=file
     -+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
     -+
     -+test_expect_success "unknown capability value with $T5730_PROTOCOL:// using protocol v2" '
     -+	test_when_finished "rm -f log" &&
     -+
     -+	test_config -C "$T5730_PARENT" \
     -+		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
     -+
     -+	GIT_TRACE_PACKET="$PWD/log" \
     -+	GIT_TEST_BUNDLE_URI_UNKNOWN_CAPABILITY_VALUE=true \
     -+	git \
     -+		-c protocol.version=2 \
     -+		ls-remote --symref "$T5730_URI" \
     -+		>actual 2>err &&
     -+
     -+	# Server responded using protocol v2
     -+	grep "< version 2" log &&
     -+
     -+	grep "> bundle-uri=test-unknown-capability-value" log
     -+'
     -+
     -+test_done
     -
     - ## t/t5731-protocol-v2-bundle-uri-git.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+test_description="Test bundle-uri with protocol v2 and 'git://' transport"
     -+
     -+TEST_NO_CREATE_REPO=1
     -+
     -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
     -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     -+
     -+. ./test-lib.sh
     -+
     -+# Test protocol v2 with 'git://' transport
     -+#
     -+T5730_PROTOCOL=git
     -+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
     -+
     -+test_done
     -
     - ## t/t5732-protocol-v2-bundle-uri-http.sh (new) ##
     -@@
     -+#!/bin/sh
     -+
     -+test_description="Test bundle-uri with protocol v2 and 'http://' transport"
     -+
     -+TEST_NO_CREATE_REPO=1
     -+
     -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
     -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     -+
     -+. ./test-lib.sh
     -+
     -+# Test protocol v2 with 'http://' transport
     -+#
     -+T5730_PROTOCOL=http
     -+. "$TEST_DIRECTORY"/lib-t5730-protocol-v2-bundle-uri.sh
     -+
     -+test_done
     -
       ## transport-helper.c ##
      @@ transport-helper.c: static struct ref *get_refs_list_using_list(struct transport *transport,
       	return ret;
     @@ transport.c: static struct ref *get_refs_via_connect(struct transport *transport
      +		init_bundle_list(transport->bundles);
      +	}
      +
     ++	/*
     ++	 * "Support" protocol v0 and v2 without bundle-uri support by
     ++	 * silently degrading to a NOOP.
     ++	 */
     ++	if (!server_supports_v2("bundle-uri", 0))
     ++		return 0;
     ++
      +	packet_reader_init(&reader, data->fd[0], NULL, 0,
      +			   PACKET_READ_CHOMP_NEWLINE |
      +			   PACKET_READ_GENTLE_ON_EOF);
     @@ transport.c: int transport_fetch_refs(struct transport *transport, struct ref *r
      +	const struct transport_vtable *vtable = transport->vtable;
      +
      +	/* Check config only once. */
     -+	if (transport->got_remote_bundle_uri++)
     ++	if (transport->got_remote_bundle_uri)
      +		return 0;
     ++	transport->got_remote_bundle_uri = 1;
      +
     -+	/*
     -+	 * "Support" protocol v0 and v2 without bundle-uri support by
     -+	 * silently degrading to a NOOP.
     -+	 */
     -+	if (!server_supports_v2("bundle-uri", 0))
     -+		return 0;
     -+
     -+	/*
     -+	 * This is intentionally below the transport.injectBundleURI,
     -+	 * we want to be able to inject into protocol v0, or into the
     -+	 * dialog of a server who doesn't support this.
     -+	 */
      +	if (!vtable->get_bundle_uri)
      +		return error(_("bundle-uri operation not supported by protocol"));
      +
  5:  93397468931 !  4:  e46118e60f7 bundle-uri client: add boolean transfer.bundleURI setting
     @@ Documentation/config/transfer.txt: transfer.unpackLimit::
      +	bundles before continuing the clone through the Git protocol.
      +	Defaults to `false`.
      
     - ## t/lib-t5730-protocol-v2-bundle-uri.sh ##
     -@@
     - # Included from t573*-protocol-v2-bundle-uri-*.sh
     + ## t/lib-bundle-uri-protocol.sh ##
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
     + '
       
     -+GIT_TEST_BUNDLE_URI=1
     -+export GIT_TEST_BUNDLE_URI
     + test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
     +-	test_when_finished "rm -rf log cloned" &&
     ++	test_when_finished "rm -rf log cloned cloned2" &&
     + 
     + 	GIT_TRACE_PACKET="$PWD/log" \
     ++	GIT_TEST_BUNDLE_URI=0 \
     + 	git \
     + 		-c protocol.version=2 \
     + 		clone "$BUNDLE_URI_REPO_URI" cloned \
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
     + 	# Server advertised bundle-uri capability
     + 	grep "< bundle-uri" log &&
     + 
     ++	# Client did not issue bundle-uri command
     ++	! grep "> command=bundle-uri" log &&
     ++
     ++	GIT_TRACE_PACKET="$PWD/log" \
     ++	git \
     ++		-c transfer.bundleURI=true \
     ++		-c protocol.version=2 \
     ++		clone "$BUNDLE_URI_REPO_URI" cloned2 \
     ++		>actual 2>err &&
      +
     - T5730_PARENT=
     - T5730_URI=
     - T5730_BUNDLE_URI=
     ++	# Server responded using protocol v2
     ++	grep "< version 2" log &&
     ++
     ++	# Server advertised bundle-uri capability
     ++	grep "< bundle-uri" log &&
     ++
     + 	# Client issued bundle-uri command
     + 	grep "> command=bundle-uri" log
     + '
      
       ## transport.c ##
      @@ transport.c: int transport_fetch_refs(struct transport *transport, struct ref *refs)
       
     - int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
     + int transport_get_remote_bundle_uri(struct transport *transport)
       {
      +	int value = 0;
       	const struct transport_vtable *vtable = transport->vtable;
       
       	/* Check config only once. */
     -@@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
     +@@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport)
       		return 0;
     + 	transport->got_remote_bundle_uri = 1;
       
     - 	/*
     --	 * This is intentionally below the transport.injectBundleURI,
     --	 * we want to be able to inject into protocol v0, or into the
     --	 * dialog of a server who doesn't support this.
     -+	 * Don't use bundle-uri at all, if configured not to. Only proceed
     -+	 * if GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
     - 	 */
     ++	/*
     ++	 * Don't request bundle-uri from the server unless configured to
     ++	 * do so by GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
     ++	 */
      +	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
      +	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
      +		return 0;
      +
     - 	if (!vtable->get_bundle_uri) {
     - 		if (quiet)
     - 			return -1;
     + 	if (!vtable->get_bundle_uri)
     + 		return error(_("bundle-uri operation not supported by protocol"));
     + 
  -:  ----------- >  5:  b009b4f58ea transport: rename got_remote_heads
  3:  c3269a24b57 !  6:  46a58e83caf bundle-uri client: add helper for testing server
     @@ Commit message
          for issuing protocol v2 "bundle-uri" commands to a server, and to the
          parsing routines in bundle-uri.c.
      
     -    Since in the "git clone" case we'll have already done the handshake(),
     -    but not here, introduce a "got_advertisement" state along with
     -    "got_remote_heads". It seems to me that the "got_remote_heads" is
     -    badly named in the first place, and the whole logic of eagerly getting
     -    ls-refs on handshake() or not could be refactored somewhat, but let's
     -    not do that now, and instead just add another self-documenting state
     -    variable.
     +    In the "git clone" case we'll have already done the handshake(),
     +    but not here. Add an extra case to check for this handshake in
     +    get_bundle_uri() for ease of use for future callers. Rename the existing
     +    'got_remote_heads' to 'finished_handshake' to make it more clear what
     +    that bit represents.
      
          Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     - ## builtin/clone.c ##
     -@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     - 	 * Populate transport->got_remote_bundle_uri and
     - 	 * transport->bundle_uri. We might get nothing.
     - 	 */
     --	transport_get_remote_bundle_uri(transport);
     -+	transport_get_remote_bundle_uri(transport, 1);
     - 
     - 	if (mapped_refs) {
     - 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
     -
       ## t/helper/test-bundle-uri.c ##
      @@
       #include "bundle-uri.h"
     @@ t/helper/test-bundle-uri.c: usage:
      +	if (server_options.nr)
      +		transport->server_options = &server_options;
      +
     -+	if (transport_get_remote_bundle_uri(transport, 0) < 0) {
     ++	if (transport_get_remote_bundle_uri(transport) < 0) {
      +		error(_("could not get the bundle-uri list"));
      +		status = 1;
      +		goto cleanup;
     @@ t/helper/test-bundle-uri.c: int cmd__bundle_uri(int argc, const char **argv)
       
       usage:
      
     - ## t/lib-t5730-protocol-v2-bundle-uri.sh ##
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: esac
     - test_expect_success "setup protocol v2 $T5730_PROTOCOL:// tests" '
     - 	git init "$T5730_PARENT" &&
     - 	test_commit -C "$T5730_PARENT" one &&
     --	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true
     -+	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs true &&
     -+	git -C "$T5730_PARENT" config bundle.version 1 &&
     -+	git -C "$T5730_PARENT" config bundle.mode all
     + ## t/lib-bundle-uri-protocol.sh ##
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
     + 	# Client issued bundle-uri command
     + 	grep "> command=bundle-uri" log
       '
     - 
     - # Poor man's URI escaping. Good enough for the test suite whose trash
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: no bundl
     - 	git -C "$T5730_PARENT" config uploadpack.advertiseBundleURIs false &&
     - 
     - 	GIT_TRACE_PACKET="$PWD/log" \
     --	git \
     --		-c protocol.version=2 \
     --		ls-remote --symref "$T5730_URI" \
     -+	test-tool bundle-uri \
     -+		ls-remote "$T5730_URI" \
     - 		>actual 2>err &&
     - 
     - 	# Server responded using protocol v2
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
     - 	test_when_finished "rm -f log" &&
     - 
     - 	test_config -C "$T5730_PARENT" \
     --		uploadpack.bundleURI "$T5730_BUNDLE_URI_ESCAPED" &&
     -+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
     - 
     - 	GIT_TRACE_PACKET="$PWD/log" \
     --	git \
     --		-c protocol.version=2 \
     --		ls-remote --symref "$T5730_URI" \
     -+	test-tool bundle-uri \
     -+		ls-remote "$T5730_URI" \
     - 		>actual 2>err &&
     - 
     - 	# Server responded using protocol v2
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "connect with $T5730_PROTOCOL:// using protocol v2: have bun
     - test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protocol v2" '
     - 	test_when_finished "rm -f log" &&
     - 
     --	test_config -C "$T5730_PARENT" uploadpack.bundleURI \
     --		"$T5730_BUNDLE_URI_ESCAPED" &&
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
     - 
     - 	cat >err.expect <<-\EOF &&
     - 	Cloning into '"'"'child'"'"'...
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success !T5730_HTTP "bad client with $T5730_PROTOCOL:// using protoc
     - 	grep "clone> test-bad-client$" log >sent-bad-request &&
     - 	test_file_not_empty sent-bad-request
     - '
     -+
     -+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
     -+	test_when_finished "rm -f log" &&
      +
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
     ++test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
     ++	test_config -C "$BUNDLE_URI_PARENT" \
     ++		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
      +
      +	# All data about bundle URIs
      +	cat >expect <<-EOF &&
     @@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success !T5730_HTTP "bad clie
      +		version = 1
      +		mode = all
      +	EOF
     -+	GIT_TRACE_PACKET="$PWD/log" \
     ++
      +	test-tool bundle-uri \
      +		ls-remote \
     -+		"$T5730_URI" \
     ++		"$BUNDLE_URI_REPO_URI" \
      +		>actual &&
      +	test_cmp_config_output expect actual
      +'
      +
     -+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and extra data" '
     -+	test_when_finished "rm -f log" &&
     -+
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.only.uri "$T5730_BUNDLE_URI_ESCAPED" &&
     ++test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 and extra data" '
     ++	test_config -C "$BUNDLE_URI_PARENT" \
     ++		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
      +
      +	# Extra data should be ignored
     -+	test_config -C "$T5730_PARENT" bundle.only.extra bogus &&
     ++	test_config -C "$BUNDLE_URI_PARENT" bundle.only.extra bogus &&
      +
      +	# All data about bundle URIs
      +	cat >expect <<-EOF &&
     @@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success !T5730_HTTP "bad clie
      +		version = 1
      +		mode = all
      +	EOF
     -+	GIT_TRACE_PACKET="$PWD/log" \
     ++
      +	test-tool bundle-uri \
      +		ls-remote \
     -+		"$T5730_URI" \
     ++		"$BUNDLE_URI_REPO_URI" \
      +		>actual &&
      +	test_cmp_config_output expect actual
      +'
      
       ## transport.c ##
     -@@ transport.c: struct git_transport_data {
     - 	struct git_transport_options options;
     - 	struct child_process *conn;
     - 	int fd[2];
     -+	unsigned got_advertisement : 1;
     - 	unsigned got_remote_heads : 1;
     - 	enum protocol_version version;
     - 	struct oid_array extra_have;
     -@@ transport.c: static struct ref *handshake(struct transport *transport, int for_push,
     - 		BUG("unknown protocol version");
     - 	}
     - 	data->got_remote_heads = 1;
     -+	data->got_advertisement = 1;
     - 	transport->hash_algo = reader.hash_algo;
     - 
     - 	if (reader.line_peeked)
      @@ transport.c: static int get_bundle_uri(struct transport *transport)
       		init_bundle_list(transport->bundles);
       	}
       
     -+	if (!data->got_advertisement) {
     -+		struct ref *refs;
     -+		struct git_transport_data *data = transport->data;
     -+		enum protocol_version version;
     ++	if (!data->finished_handshake) {
     ++		struct ref *refs = handshake(transport, 0, NULL, 0);
      +
     -+		refs = handshake(transport, 0, NULL, 0);
     -+		version = data->version;
     -+
     -+		switch (version) {
     -+		case protocol_v2:
     -+			assert(!refs);
     -+			break;
     -+		case protocol_v0:
     -+		case protocol_v1:
     -+		case protocol_unknown_version:
     -+			assert(refs);
     -+			break;
     -+		}
     ++		if (refs)
     ++			free_refs(refs);
      +	}
      +
     -+	/*
     -+	 * "Support" protocol v0 and v2 without bundle-uri support by
     -+	 * silently degrading to a NOOP.
     -+	 */
     -+	if (!server_supports_v2("bundle-uri", 0))
     -+		return 0;
     -+
     - 	packet_reader_init(&reader, data->fd[0], NULL, 0,
     - 			   PACKET_READ_CHOMP_NEWLINE |
     - 			   PACKET_READ_GENTLE_ON_EOF);
     -@@ transport.c: int transport_fetch_refs(struct transport *transport, struct ref *refs)
     - 	return rc;
     - }
     - 
     --int transport_get_remote_bundle_uri(struct transport *transport)
     -+int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
     - {
     - 	const struct transport_vtable *vtable = transport->vtable;
     - 
     -@@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport)
     - 	if (transport->got_remote_bundle_uri++)
     - 		return 0;
     - 
     --	/*
     --	 * "Support" protocol v0 and v2 without bundle-uri support by
     --	 * silently degrading to a NOOP.
     --	 */
     --	if (!server_supports_v2("bundle-uri", 0))
     --		return 0;
     --
       	/*
     - 	 * This is intentionally below the transport.injectBundleURI,
     - 	 * we want to be able to inject into protocol v0, or into the
     - 	 * dialog of a server who doesn't support this.
     - 	 */
     --	if (!vtable->get_bundle_uri)
     -+	if (!vtable->get_bundle_uri) {
     -+		if (quiet)
     -+			return -1;
     - 		return error(_("bundle-uri operation not supported by protocol"));
     -+	}
     - 
     - 	if (vtable->get_bundle_uri(transport) < 0)
     - 		return error(_("could not retrieve server-advertised bundle-uri list"));
     -
     - ## transport.h ##
     -@@ transport.h: const struct ref *transport_get_remote_refs(struct transport *transport,
     - /**
     -  * Retrieve bundle URI(s) from a remote. Populates "struct
     -  * transport"'s "bundle_uri" and "got_remote_bundle_uri".
     -+ *
     -+ * With `quiet=1` it will not complain if the server doesn't support
     -+ * the protocol, but only if we discover the server uses it, and
     -+ * encounter issues then.
     -  */
     --int transport_get_remote_bundle_uri(struct transport *transport);
     -+int transport_get_remote_bundle_uri(struct transport *transport, int quiet);
     - 
     - /*
     -  * Fetch the hash algorithm used by a remote.
     + 	 * "Support" protocol v0 and v2 without bundle-uri support by
     + 	 * silently degrading to a NOOP.
  4:  cd906f6d981 !  7:  acc5a8f57f9 bundle-uri: serve bundle.* keys from config
     @@ bundle-uri.c: int bundle_uri_command(struct repository *r,
       	packet_writer_flush(&writer);
       
      
     - ## t/lib-t5730-protocol-v2-bundle-uri.sh ##
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2" '
     + ## t/lib-bundle-uri-protocol.sh ##
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
       	[bundle]
       		version = 1
       		mode = all
      +	[bundle "only"]
     -+		uri = $T5730_BUNDLE_URI_ESCAPED
     ++		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
       	EOF
     - 	GIT_TRACE_PACKET="$PWD/log" \
     + 
     ++	GIT_TEST_BUNDLE_URI=1 \
       	test-tool bundle-uri \
     -@@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 and ext
     + 		ls-remote \
     + 		"$BUNDLE_URI_REPO_URI" \
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
       	[bundle]
       		version = 1
       		mode = all
      +	[bundle "only"]
     -+		uri = $T5730_BUNDLE_URI_ESCAPED
     -+	EOF
     -+	GIT_TRACE_PACKET="$PWD/log" \
     ++		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
     + 	EOF
     + 
     ++	GIT_TEST_BUNDLE_URI=1 \
      +	test-tool bundle-uri \
      +		ls-remote \
     -+		"$T5730_URI" \
     ++		"$BUNDLE_URI_REPO_URI" \
      +		>actual &&
      +	test_cmp_config_output expect actual
      +'
      +
     -+
     -+test_expect_success "ls-remote with $T5730_PROTOCOL:// using protocol v2 with list" '
     -+	test_when_finished "rm -f log" &&
     -+
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.bundle1.uri "$T5730_BUNDLE_URI_ESCAPED-1.bdl" &&
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.bundle2.uri "$T5730_BUNDLE_URI_ESCAPED-2.bdl" &&
     -+	test_config -C "$T5730_PARENT" \
     -+		bundle.bundle3.uri "$T5730_BUNDLE_URI_ESCAPED-3.bdl" &&
     ++test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 with list" '
     ++	test_config -C "$BUNDLE_URI_PARENT" \
     ++		bundle.bundle1.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl" &&
     ++	test_config -C "$BUNDLE_URI_PARENT" \
     ++		bundle.bundle2.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl" &&
     ++	test_config -C "$BUNDLE_URI_PARENT" \
     ++		bundle.bundle3.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl" &&
      +
      +	# All data about bundle URIs
      +	cat >expect <<-EOF &&
     @@ t/lib-t5730-protocol-v2-bundle-uri.sh: test_expect_success "ls-remote with $T573
      +		version = 1
      +		mode = all
      +	[bundle "bundle1"]
     -+		uri = $T5730_BUNDLE_URI_ESCAPED-1.bdl
     ++		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl
      +	[bundle "bundle2"]
     -+		uri = $T5730_BUNDLE_URI_ESCAPED-2.bdl
     ++		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl
      +	[bundle "bundle3"]
     -+		uri = $T5730_BUNDLE_URI_ESCAPED-3.bdl
     - 	EOF
     - 	GIT_TRACE_PACKET="$PWD/log" \
     ++		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl
     ++	EOF
     ++
     ++	GIT_TEST_BUNDLE_URI=1 \
       	test-tool bundle-uri \
     + 		ls-remote \
     + 		"$BUNDLE_URI_REPO_URI" \
  6:  7d86852c015 !  8:  1eec3426aee strbuf: introduce strbuf_strip_file_from_path()
     @@ Commit message
          Re-add the method, this time in strbuf.c, but with a new name:
          strbuf_strip_file_from_path(). The method requirements are slightly
          modified to allow a trailing slash, in which case nothing is done, which
     -    makes the name change valuable. The return value is the number of bytes
     -    removed.
     +    makes the name change valuable.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ strbuf.c: int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
       	return res;
       }
      +
     -+size_t strbuf_strip_file_from_path(struct strbuf *buf)
     ++void strbuf_strip_file_from_path(struct strbuf *sb)
      +{
     -+	size_t len = buf->len;
     -+	size_t offset = offset_1st_component(buf->buf);
     -+	char *path_sep = find_last_dir_sep(buf->buf + offset);
     -+	strbuf_setlen(buf, path_sep ? path_sep - buf->buf + 1 : offset);
     -+	return len - buf->len;
     ++	char *path_sep = find_last_dir_sep(sb->buf);
     ++	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
      +}
      
       ## strbuf.h ##
     @@ strbuf.h: int launch_sequence_editor(const char *path, struct strbuf *buffer,
      +/*
      + * Remove the filename from the provided path string. If the path
      + * contains a trailing separator, then the path is considered a directory
     -+ * and nothing is modified. Returns the number of characters removed from
     -+ * the path.
     ++ * and nothing is modified.
      + *
      + * Examples:
     -+ * - "/path/to/file" -> "/path/to/" (returns: 4)
     -+ * - "/path/to/dir/" -> "/path/to/dir/" (returns: 0)
     ++ * - "/path/to/file" -> "/path/to/"
     ++ * - "/path/to/dir/" -> "/path/to/dir/"
      + */
     -+size_t strbuf_strip_file_from_path(struct strbuf *buf);
     ++void strbuf_strip_file_from_path(struct strbuf *sb);
      +
       void strbuf_add_lines(struct strbuf *sb,
       		      const char *prefix,
  7:  186e112d821 !  9:  48731438d6a bundle-uri: allow relative URLs in bundle lists
     @@ Commit message
          every push to a CDN would require altering the table of contents to
          match the expected domain and exact location within it.
      
     -    Allow a bundle list to specify a relative URI for the bundles.
     -    This allows easier distribution of bundle data.
     +    Allow a bundle list to specify a relative URI for the bundles. This URI
     +    is based on where the client received the bundle list. For a list
     +    provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
     +    the base URI. Otherwise, the bundle list was provided from an HTTP URI
     +    not using the Git protocol, and that URI is the base URI. This allows
     +    easier distribution of bundle data.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ bundle-uri.h: struct bundle_list {
       	struct hashmap bundles;
      +
      +	/**
     -+	 * The baseURI of a bundle_list is used as the base for any
     -+	 * relative URIs advertised by the bundle list at that location.
     ++	 * The baseURI of a bundle_list is the URI that provided the list.
      +	 *
     -+	 * When the list is generated from a Git server, then use that
     -+	 * server's location.
     ++	 * In the case of the 'bundle-uri' protocol v2 command, the base
     ++	 * URI is the URI of the Git remote.
     ++	 *
     ++	 * Otherwise, the bundle list was downloaded over HTTP from some
     ++	 * known URI.
     ++	 *
     ++	 * The baseURI is used as the base for any relative URIs
     ++	 * advertised by the bundle list at that location.
      +	 */
      +	char *baseURI;
       };
     @@ t/t5750-bundle-uri-parse.sh: test_expect_success 'bundle_uri_parse_line() just U
      +	test_must_be_empty err &&
      +	test_cmp_config_output expect actual
      +'
     ++
     ++test_expect_success 'bundle_uri_parse_line(): relative URIs and parent paths' '
     ++	cat >in <<-\EOF &&
     ++	bundle.one.uri=bundle.bdl
     ++	bundle.two.uri=../bundle.bdl
     ++	bundle.three.uri=../../bundle.bdl
     ++	EOF
     ++
     ++	cat >expect <<-\EOF &&
     ++	[bundle]
     ++		version = 1
     ++		mode = all
     ++	[bundle "one"]
     ++		uri = <uri>/bundle.bdl
     ++	[bundle "two"]
     ++		uri = bundle.bdl
     ++	[bundle "three"]
     ++		uri = <uri>/../bundle.bdl
     ++	EOF
     ++
     ++	# TODO: We would prefer if parsing a bundle list would not cause
     ++	# a die() and instead would give a warning and allow the rest of
     ++	# a Git command to continue. This test_must_fail is necessary for
     ++	# now until the interface for relative_url() allows for reporting
     ++	# an error instead of die()ing.
     ++	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
     ++	grep "fatal: cannot strip one component off url" err
     ++'
      +
       test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
       	cat >in <<-\EOF &&
     @@ t/t5750-bundle-uri-parse.sh: test_expect_success 'parse config format: just URIs
       	= bogus-value
      
       ## transport.c ##
     -@@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport, int quiet)
     +@@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport)
       	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
       		return 0;
       
      +	if (!transport->bundles->baseURI)
      +		transport->bundles->baseURI = xstrdup(transport->url);
      +
     - 	if (!vtable->get_bundle_uri) {
     - 		if (quiet)
     - 			return -1;
     + 	if (!vtable->get_bundle_uri)
     + 		return error(_("bundle-uri operation not supported by protocol"));
     + 
  8:  f254da46a2c = 10:  69bf154bec6 bundle-uri: download bundles from an advertised list
  9:  b62b4b17481 ! 11:  7e1819162b6 clone: unbundle the advertised bundles
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      -	 * Populate transport->got_remote_bundle_uri and
      -	 * transport->bundle_uri. We might get nothing.
      -	 */
     --	transport_get_remote_bundle_uri(transport, 1);
     +-	transport_get_remote_bundle_uri(transport);
      +	if (!bundle_uri) {
      +		/*
      +		* Populate transport->got_remote_bundle_uri and
      +		* transport->bundle_uri. We might get nothing.
      +		*/
     -+		transport_get_remote_bundle_uri(transport, 1);
     ++		transport_get_remote_bundle_uri(transport);
      +
      +		if (transport->bundles &&
      +		    hashmap_get_size(&transport->bundles->bundles)) {
      +			/* At this point, we need the_repository to match the cloned repo. */
      +			if (repo_init(the_repository, git_dir, work_tree))
      +				warning(_("failed to initialize the repo, skipping bundle URI"));
     -+			if (fetch_bundle_list(the_repository,
     -+					      remote->url[0],
     -+					      transport->bundles))
     ++			else if (fetch_bundle_list(the_repository,
     ++						   remote->url[0],
     ++						   transport->bundles))
      +				warning(_("failed to fetch advertised bundles"));
      +		} else {
      +			clear_bundle_list(transport->bundles);
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
       	if (mapped_refs) {
       		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
      
     + ## t/lib-bundle-uri-protocol.sh ##
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
     + '
     + 
     + test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
     +-	test_when_finished "rm -rf log cloned cloned2" &&
     ++	test_when_finished "rm -rf log* cloned*" &&
     + 
     + 	GIT_TRACE_PACKET="$PWD/log" \
     + 	GIT_TEST_BUNDLE_URI=0 \
     +@@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
     + 	grep "< bundle-uri" log &&
     + 
     + 	# Client issued bundle-uri command
     +-	grep "> command=bundle-uri" log
     ++	grep "> command=bundle-uri" log &&
     ++
     ++	GIT_TRACE_PACKET="$PWD/log3" \
     ++	git \
     ++		-c transfer.bundleURI=true \
     ++		-c protocol.version=2 \
     ++		clone --bundle-uri="$BUNDLE_URI_BUNDLE_URI" \
     ++		"$BUNDLE_URI_REPO_URI" cloned3 \
     ++		>actual 2>err &&
     ++
     ++	# Server responded using protocol v2
     ++	grep "< version 2" log3 &&
     ++
     ++	# Server advertised bundle-uri capability
     ++	grep "< bundle-uri" log3 &&
     ++
     ++	# Client did not issue bundle-uri command (--bundle-uri override)
     ++	! grep "> command=bundle-uri" log3
     + '
     + 
     + test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
     +
       ## t/t5601-clone.sh ##
      @@ t/t5601-clone.sh: test_expect_success 'reject cloning shallow repository using HTTP' '
       	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo

-- 
gitgitgadget


* [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 23:31       ` Victoria Dye
  2022-12-05 17:50     ` [PATCH v3 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
                       ` (11 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.advertiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users to
automatically discover bundle providers that are loosely associated
with the origin server, without the origin server knowing exactly
which bundles are currently available.

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulleted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
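For readers unfamiliar with the wire format, here is a minimal sketch of
how the argument-less request used in the new t5701 test is framed as
packet lines. pkt_line, flush_pkt and bundle_uri_request are hypothetical
helpers written for illustration; they are not part of Git or its test
suite, and the sketch assumes ASCII payloads only.

```shell
# Hypothetical helper: frame one text line as a pkt-line.
# The 4 hex digits give the total length: 4 (prefix) + payload + 1 (LF).
# ${#1} counts characters, so this is only correct for ASCII payloads.
pkt_line () {
	printf '%04x%s\n' $((${#1} + 5)) "$1"
}

# A flush packet is the special length "0000" with no payload.
flush_pkt () {
	printf '0000'
}

# The request from the t5701 test: the command line followed by a flush.
bundle_uri_request () {
	pkt_line 'command=bundle-uri'
	flush_pkt
}

bundle_uri_request
```

With this skeleton the server's reply to such a request is a single flush
packet, i.e. an empty bundle list.
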
 Documentation/gitprotocol-v2.txt | 201 +++++++++++++++++++++++++++++++
 bundle-uri.c                     |  36 ++++++
 bundle-uri.h                     |   7 ++
 serve.c                          |   6 +
 t/t5701-git-serve.sh             |  40 +++++-
 5 files changed, 289 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 59bf41cefb9..10bd2d40cec 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -578,6 +578,207 @@ and associated requested information, each separated by a single space.
 
 	obj-info = obj-id SP obj-size
 
+bundle-uri
+~~~~~~~~~~
+
+If the 'bundle-uri' capability is advertised, the server supports the
+`bundle-uri` command.
+
+The capability is currently advertised with no value (i.e. not
+"bundle-uri=somevalue"); a value may be added in the future for
+supporting command-wide extensions. Clients MUST ignore any unknown
+capability values and proceed with the `bundle-uri` dialog they
+support.
+
+The 'bundle-uri' command is intended to be issued before `fetch` to
+get URIs to bundle files (see linkgit:git-bundle[1]) to "seed" and
+inform the subsequent `fetch` command.
+
+The client MAY issue `bundle-uri` before or after any other valid
+command. To be useful to clients it's expected that it'll be issued
+after an `ls-refs` and before `fetch`, but it MAY be issued at any time
+in the dialog.
+
+DISCUSSION of bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The intent of the feature is to optimize for server resource consumption
+in the common case by changing the common case of fetching a very
+large PACK during linkgit:git-clone[1] into a smaller incremental
+fetch.
+
+It also allows servers to achieve better caching in combination with
+an `uploadpack.packObjectsHook` (see linkgit:git-config[1]).
+
+This works by making new clones or fetches a more predictable and
+common negotiation against the tips of recently produced *.bundle file(s).
+Servers might even pre-generate the results of such negotiations for
+the `uploadpack.packObjectsHook` as new pushes come in.
+
+One way that servers could take advantage of these bundles is that the
+server would anticipate that fresh clones will download a known bundle,
+followed by catching up to the current state of the repository using ref
+tips found in that bundle (or bundles).
+
+PROTOCOL for bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A `bundle-uri` request takes no arguments, and as noted above does not
+currently advertise a capability value. Both may be added in the
+future.
+
+When the client issues a `command=bundle-uri` request, the response is a
+list of key-value pairs provided as packet lines with value
+`<key>=<value>`. Each `<key>` should be interpreted as a config key from
+the `bundle.*` namespace to construct a list of bundles. These keys are
+grouped by a `bundle.<id>.` subsection, where each key corresponding to a
+given `<id>` contributes attributes to the bundle defined by that `<id>`.
+See linkgit:git-config[1] for the specific details of these keys and how
+the Git client will interpret their values.
+
+Clients MUST parse the line according to the above format; lines that do
+not conform to the format SHOULD be discarded. The user MAY be warned in
+such a case.
+
+bundle-uri CLIENT AND SERVER EXPECTATIONS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+URI CONTENTS::
+The content at the advertised URIs MUST be one of two types.
++
+The advertised URI may contain a bundle file that `git bundle verify`
+would accept. I.e. they MUST contain one or more reference tips for
+use by the client, MUST indicate prerequisites (if any) with standard
+"-" prefixes, and MUST indicate their "object-format", if
+applicable.
++
+The advertised URI may alternatively contain a plaintext file that `git
+config --list` would accept (with the `--file` option). The key-value
+pairs in this list are in the `bundle.*` namespace (see
+linkgit:git-config[1]).
+
+bundle-uri CLIENT ERROR RECOVERY::
+A client MUST above all gracefully degrade on errors, whether that
+error is because of bad/missing data in the bundle URI(s), because
+that client is too dumb to e.g. understand and fully parse out bundle
+headers and their prerequisite relationships, or something else.
++
+Server operators should feel confident in turning on "bundle-uri" and
+not worry that clones or fetches will run into hard failures if e.g.
+their CDN goes down. Even if the server's bundle(s) are incomplete or
+bad in some way, the client should still end up with a functioning
+repository, just as if it had chosen not to use this protocol
+extension.
++
+All subsequent discussion on client and server interaction MUST keep
+this in mind.
+
+bundle-uri SERVER TO CLIENT::
+The ordering of the returned bundle URIs is not significant. Clients
+MUST parse their headers to discover their contained OIDs and
+prerequisites. A client MUST consider the content of the bundle(s)
+themselves and their header as the ultimate source of truth.
++
+A server MAY even return bundle(s) that don't have any direct
+relationship to the repository being cloned (either through accident,
+or intentional "clever" configuration), and expect a client to sort
+out what data they'd like from the bundle(s), if any.
+
+bundle-uri CLIENT TO SERVER::
+The client SHOULD provide reference tips found in the bundle header(s)
+as 'have' lines in any subsequent `fetch` request. A client MAY also
+ignore the bundle(s) entirely if doing so is deemed worse for some
+reason, e.g. if the bundles can't be downloaded, it doesn't like the
+tips it finds etc.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
+If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
+of the bundle(s) the client finds that the ref tips it wants can be
+retrieved entirely from advertised bundle(s), the client MAY disconnect
+from the Git server. The results of such a 'clone' or 'fetch' should be
+indistinguishable from the state attained without using bundle-uri.
+
+EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
+A client MAY perform an early disconnect while still downloading the
+bundle(s) (having streamed and parsed their headers). In such a case
+the client MUST gracefully recover from any errors related to
+finishing the download and validation of the bundle(s).
++
+I.e. a client might need to re-connect and issue a 'fetch' command,
+and possibly fall back to not making use of 'bundle-uri' at all.
++
+This "MAY" behavior is specified as such (and not a "SHOULD") on the
+assumption that a server advertising bundle uris is more likely than
+not to be serving up a relatively large repository, and to be pointing
+to URIs that have a good chance of being in working order. A client
+MAY e.g. look at the payload size of the bundles as a heuristic to see
+if an early disconnect is worth it, should falling back on a full
+"fetch" dialog be necessary.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
+A client SHOULD commence a negotiation of a PACK from the server via
+the "fetch" command using the OID tips found in advertised bundles,
+even if it's still in the process of downloading those bundle(s).
++
+This allows for aggressive early disconnects from any interactive
+server dialog. The client blindly trusts that the advertised OID tips
+are relevant, and issues them as 'have' lines; it then requests any
+tips it would like (usually from the "ls-refs" advertisement) via
+'want' lines. The server will then compute a (hopefully small) PACK
+with the expected difference between the tips from the bundle(s) and
+the data requested.
++
+The only connection the client then needs to keep active is to the
+concurrently downloading static bundle(s). When those and the
+incremental PACK are retrieved, they should be inflated and
+validated. Any errors at this point should be gracefully recovered
+from, see above.
+
+bundle-uri PROTOCOL FEATURES
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The client constructs a bundle list from the `<key>=<value>` pairs
+provided by the server. These pairs are part of the `bundle.*` namespace
+as documented in linkgit:git-config[1]. In this section, we discuss some
+of these keys and describe the actions the client will do in response to
+this information.
+
+In particular, the `bundle.version` key specifies an integer value. The
+only accepted value at the moment is `1`, but if the client sees an
+unexpected value here then the client MUST ignore the bundle list.
+
+As long as `bundle.version` is understood, all other unknown keys MAY be
+ignored by the client. The server will guarantee compatibility with older
+clients, though newer clients may be better able to use the extra keys to
+minimize downloads.
+
+Any backwards-incompatible addition of pre-URI key-value will be
+guarded by a new `bundle.version` value or values in 'bundle-uri'
+capability advertisement itself, and/or by new future `bundle-uri`
+request arguments.
+
+Some example key-value pairs that are not currently implemented but could
+be implemented in the future include:
+
+ * Add a "hash=<val>" or "size=<bytes>" to advertise the expected hash or
+   size of the bundle file.
+
+ * Advertise that one or more bundle files are the same (to e.g. have
+   clients round-robin or otherwise choose one of N possible files).
+
+ * An "oid=<OID>" shortcut and a "prerequisite=<OID>" shortcut, for
+   expressing the common case of a bundle with one tip and no
+   prerequisites, or one tip and one prerequisite.
++
+This would allow for optimizing the common case of servers who'd like
+to provide one "big bundle" containing only their "main" branch,
+and/or incremental updates thereof.
++
+A client receiving such a response MAY assume that they can skip
+retrieving the header from a bundle at the indicated URI, and thus
+save themselves and the server(s) the request(s) needed to inspect the
+headers of that bundle or bundles.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/bundle-uri.c b/bundle-uri.c
index 79a914f961b..32022595964 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -563,6 +563,42 @@ cleanup:
 	return result;
 }
 
+/**
+ * API for serve.c.
+ */
+
+int bundle_uri_advertise(struct repository *r, struct strbuf *value)
+{
+	static int advertise_bundle_uri = -1;
+
+	if (advertise_bundle_uri != -1)
+		goto cached;
+
+	advertise_bundle_uri = 0;
+	git_config_get_maybe_bool("uploadpack.advertisebundleuris", &advertise_bundle_uri);
+
+cached:
+	return advertise_bundle_uri;
+}
+
+int bundle_uri_command(struct repository *r,
+		       struct packet_reader *request)
+{
+	struct packet_writer writer;
+	packet_writer_init(&writer, 1);
+
+	while (packet_reader_read(request) == PACKET_READ_NORMAL)
+		die(_("bundle-uri: unexpected argument: '%s'"), request->line);
+	if (request->status != PACKET_READ_FLUSH)
+		die(_("bundle-uri: expected flush after arguments"));
+
+	/* TODO: Implement the communication */
+
+	packet_writer_flush(&writer);
+
+	return 0;
+}
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 4dbc269823c..357111ecce8 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -4,6 +4,7 @@
 #include "hashmap.h"
 #include "strbuf.h"
 
+struct packet_reader;
 struct repository;
 struct string_list;
 
@@ -92,6 +93,12 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * API for serve.c.
+ */
+int bundle_uri_advertise(struct repository *r, struct strbuf *value);
+int bundle_uri_command(struct repository *r, struct packet_reader *request);
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/serve.c b/serve.c
index 733347f602a..cbf4a143cfe 100644
--- a/serve.c
+++ b/serve.c
@@ -7,6 +7,7 @@
 #include "protocol-caps.h"
 #include "serve.h"
 #include "upload-pack.h"
+#include "bundle-uri.h"
 
 static int advertise_sid = -1;
 static int client_hash_algo = GIT_HASH_SHA1;
@@ -135,6 +136,11 @@ static struct protocol_capability capabilities[] = {
 		.advertise = always_advertise,
 		.command = cap_object_info,
 	},
+	{
+		.name = "bundle-uri",
+		.advertise = bundle_uri_advertise,
+		.command = bundle_uri_command,
+	},
 };
 
 void protocol_v2_advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 1896f671cb3..f21e5e9d33d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -13,7 +13,7 @@ test_expect_success 'test capability advertisement' '
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
-	cat >expect <<-EOF &&
+	cat >expect.base <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs=unborn
@@ -21,8 +21,11 @@ test_expect_success 'test capability advertisement' '
 	server-option
 	object-format=$(test_oid algo)
 	object-info
+	EOF
+	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -342,4 +345,39 @@ test_expect_success 'basics of object-info' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with uploadpack.advertiseBundleURIs' '
+	test_config uploadpack.advertiseBundleURIs true &&
+
+	cat >expect.extra <<-EOF &&
+	bundle-uri
+	EOF
+	cat expect.base \
+	    expect.extra \
+	    expect.trailer >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'basics of bundle-uri: dies if not enabled' '
+	test-tool pkt-line pack >in <<-EOF &&
+	command=bundle-uri
+	0000
+	EOF
+
+	cat >err.expect <<-\EOF &&
+	fatal: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	cat >expect <<-\EOF &&
+	ERR serve: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	test_must_fail test-tool serve-v2 --stateless-rpc <in >out 2>err.actual &&
+	test_cmp err.expect err.actual &&
+	test_must_be_empty out
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v3 02/11] t: create test harness for 'bundle-uri' command
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 17:50     ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The previous change allowed for a Git server to advertise the
'bundle-uri' command as a capability based on the
uploadPack.advertiseBundleURIs config option. Create a set of tests that
check that this capability is advertised using 'git ls-remote'.

In order to test this functionality across three protocols (file, git,
and http), create lib-bundle-uri-protocol.sh to generalize the tests,
allowing the other test scripts to set an environment variable and
otherwise inherit the setup and tests from this script.

The tests currently only check whether the 'bundle-uri' command is
advertised. Other actions will be tested as the Git client learns
to request the 'bundle-uri' command and parse its response.

To help with URI escaping, specifically for file paths with a space in
them, extract a 'sed' invocation from t9119-git-svn-info.sh into a
helper function for use here, too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
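To illustrate the escaping helper described above, here is a small
sketch comparing the inline 'sed' call previously used in t9119 with the
new test_uri_escape function. escape_first_space is a name made up for
this comparison, and the path is an invented example rather than an
actual trash directory.

```shell
# The inline form previously used in t9119: replaces only the first space.
# (escape_first_space is a hypothetical name used for this comparison.)
escape_first_space () {
	sed 's/ /%20/'
}

# The helper added to test-lib-functions.sh: replaces every space.
test_uri_escape () {
	sed 's/ /%20/g'
}

# Made-up example path containing two spaces.
example="file:///tmp/trash directory.t5730/file parent"

echo "$example" | escape_first_space
# prints: file:///tmp/trash%20directory.t5730/file parent
echo "$example" | test_uri_escape
# prints: file:///tmp/trash%20directory.t5730/file%20parent
```

Only spaces are handled; other characters that would need
percent-encoding in a real URI are left untouched, which is why the
helper is described as "poor man's" escaping.
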
 t/lib-bundle-uri-protocol.sh           | 85 ++++++++++++++++++++++++++
 t/t5730-protocol-v2-bundle-uri-file.sh | 17 ++++++
 t/t5731-protocol-v2-bundle-uri-git.sh  | 17 ++++++
 t/t5732-protocol-v2-bundle-uri-http.sh | 17 ++++++
 t/t9119-git-svn-info.sh                |  2 +-
 t/test-lib-functions.sh                |  7 +++
 6 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 t/lib-bundle-uri-protocol.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh

diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
new file mode 100644
index 00000000000..2da22a39cb8
--- /dev/null
+++ b/t/lib-bundle-uri-protocol.sh
@@ -0,0 +1,85 @@
+# Set up and run tests of the 'bundle-uri' command in protocol v2
+#
+# The test that includes this script should set BUNDLE_URI_PROTOCOL
+# to one of "file", "git", or "http".
+
+BUNDLE_URI_PARENT=
+BUNDLE_URI_REPO_URI=
+BUNDLE_URI_BUNDLE_URI=
+case "$BUNDLE_URI_PROTOCOL" in
+file)
+	BUNDLE_URI_PARENT=file_parent
+	BUNDLE_URI_REPO_URI="file://$PWD/file_parent"
+	BUNDLE_URI_BUNDLE_URI="$BUNDLE_URI_REPO_URI/fake.bdl"
+	test_set_prereq BUNDLE_URI_FILE
+	;;
+git)
+	. "$TEST_DIRECTORY"/lib-git-daemon.sh
+	start_git_daemon --export-all --enable=receive-pack
+	BUNDLE_URI_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
+	BUNDLE_URI_REPO_URI="$GIT_DAEMON_URL/parent"
+	BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq BUNDLE_URI_GIT
+	;;
+http)
+	. "$TEST_DIRECTORY"/lib-httpd.sh
+	start_httpd
+	BUNDLE_URI_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
+	BUNDLE_URI_REPO_URI="$HTTPD_URL/smart/http_parent"
+	BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq BUNDLE_URI_HTTP
+	;;
+*)
+	BUG "Need to pass valid BUNDLE_URI_PROTOCOL (was \"$BUNDLE_URI_PROTOCOL\")"
+	;;
+esac
+
+test_expect_success "setup protocol v2 $BUNDLE_URI_PROTOCOL:// tests" '
+	git init "$BUNDLE_URI_PARENT" &&
+	test_commit -C "$BUNDLE_URI_PARENT" one &&
+	git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs true
+'
+
+case "$BUNDLE_URI_PROTOCOL" in
+http)
+	test_expect_success "setup config for $BUNDLE_URI_PROTOCOL:// tests" '
+		git -C "$BUNDLE_URI_PARENT" config http.receivepack true
+	'
+	;;
+*)
+	;;
+esac
+BUNDLE_URI_BUNDLE_URI_ESCAPED=$(echo "$BUNDLE_URI_BUNDLE_URI" | test_uri_escape)
+
+test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: no bundle-uri" '
+	test_when_finished "rm -f log" &&
+	test_when_finished "git -C \"$BUNDLE_URI_PARENT\" config uploadpack.advertiseBundleURIs true" &&
+	git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs false &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$BUNDLE_URI_REPO_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	! grep bundle-uri log
+'
+
+test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: have bundle-uri" '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$BUNDLE_URI_REPO_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log
+'
diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
new file mode 100755
index 00000000000..37bdb725bca
--- /dev/null
+++ b/t/t5730-protocol-v2-bundle-uri-file.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'file://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+BUNDLE_URI_PROTOCOL=file
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t5731-protocol-v2-bundle-uri-git.sh b/t/t5731-protocol-v2-bundle-uri-git.sh
new file mode 100755
index 00000000000..8add1b37abc
--- /dev/null
+++ b/t/t5731-protocol-v2-bundle-uri-git.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'git://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+BUNDLE_URI_PROTOCOL=git
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t5732-protocol-v2-bundle-uri-http.sh b/t/t5732-protocol-v2-bundle-uri-http.sh
new file mode 100755
index 00000000000..129daa02269
--- /dev/null
+++ b/t/t5732-protocol-v2-bundle-uri-http.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'http://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'http://' transport
+#
+BUNDLE_URI_PROTOCOL=http
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t9119-git-svn-info.sh b/t/t9119-git-svn-info.sh
index 8201c3e808a..088d1c57a88 100755
--- a/t/t9119-git-svn-info.sh
+++ b/t/t9119-git-svn-info.sh
@@ -28,7 +28,7 @@ test_cmp_info () {
 	rm -f tmp.expect tmp.actual
 }
 
-quoted_svnrepo="$(echo $svnrepo | sed 's/ /%20/')"
+quoted_svnrepo="$(echo $svnrepo | test_uri_escape)"
 
 test_expect_success 'setup repository and import' '
 	mkdir info &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 29d914a12ba..5f6966a404b 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1755,6 +1755,13 @@ test_path_is_hidden () {
 	return 1
 }
 
+# Poor man's URI escaping. Good enough for the test suite whose trash
+# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
+# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
+test_uri_escape() {
+	sed 's/ /%20/g'
+}
+
 # Check that the given command was invoked as part of the
 # trace2-format trace on stdin.
 #
-- 
gitgitgadget



* [PATCH v3 03/11] clone: request the 'bundle-uri' command when available
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 17:50     ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.

If the server says it supports 'bundle-uri', teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code, which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.

At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.

The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
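As a rough illustration of the per-line step in the client, here is a
shell sketch of splitting one response line into a key and a value.
parse_bundle_line is a hypothetical helper and is not the
bundle_uri_parse_line() function called from connect.c; in particular,
rejecting lines with an empty key or an empty value is an assumption of
this sketch.

```shell
# Hypothetical helper: split one "<key>=<value>" response line.
# Prints "<key><TAB><value>" on success and returns 1 otherwise.
parse_bundle_line () {
	case "$1" in
	*=*) ;;
	*) return 1 ;;	# no "=" at all: not of the form key=value
	esac

	key=${1%%=*}	# everything before the first "="
	value=${1#*=}	# everything after the first "="

	# Assumption of this sketch: empty keys and values are rejected.
	test -n "$key" && test -n "$value" || return 1

	printf '%s\t%s\n' "$key" "$value"
}

parse_bundle_line 'bundle.version=1'
parse_bundle_line 'bundle.example.uri=https://example.com/fake.bdl?a=b'
```

Splitting on the first "=" keeps any later "=" characters, such as those
in a URI query string, inside the value.
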
 builtin/clone.c              |  6 +++++
 connect.c                    | 44 +++++++++++++++++++++++++++++++
 remote.h                     |  5 ++++
 t/lib-bundle-uri-protocol.sh | 19 ++++++++++++++
 transport-helper.c           | 13 +++++++++
 transport-internal.h         |  7 +++++
 transport.c                  | 51 ++++++++++++++++++++++++++++++++++++
 transport.h                  | 19 ++++++++++++++
 8 files changed, 164 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 547d6464b3c..39364c25b15 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1266,6 +1266,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
+	/*
+	 * Populate transport->got_remote_bundle_uri and
+	 * transport->bundle_uri. We might get nothing.
+	 */
+	transport_get_remote_bundle_uri(transport);
+
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 
diff --git a/connect.c b/connect.c
index 5ea53deda23..624a10f18ee 100644
--- a/connect.c
+++ b/connect.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "protocol.h"
 #include "alias.h"
+#include "bundle-uri.h"
 
 static char *server_capabilities_v1;
 static struct strvec server_capabilities_v2 = STRVEC_INIT;
@@ -491,6 +492,49 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	}
 }
 
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc)
+{
+	int line_nr = 1;
+
+	/* Assert bundle-uri support */
+	server_supports_v2("bundle-uri", 1);
+
+	/* (Re-)send capabilities */
+	send_capabilities(fd_out, reader);
+
+	/* Send command */
+	packet_write_fmt(fd_out, "command=bundle-uri\n");
+	packet_delim(fd_out);
+
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *line = reader->line;
+		line_nr++;
+
+		if (!bundle_uri_parse_line(bundles, line))
+			continue;
+
+		return error(_("error on bundle-uri response line %d: %s"),
+			     line_nr, line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		return error(_("expected flush after bundle-uri listing"));
+
+	/*
+	 * Might die(), but obscure enough that that's OK, e.g. in
+	 * serve.c we'll call BUG() on its equivalent (the
+	 * PACKET_READ_RESPONSE_END check).
+	 */
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after ref listing"));
+
+	return 0;
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     struct transport_ls_refs_options *transport_options,
diff --git a/remote.h b/remote.h
index 1c4621b414b..1ebbe42792e 100644
--- a/remote.h
+++ b/remote.h
@@ -234,6 +234,11 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
+/* Used for protocol v2 in order to retrieve the bundle list from a remote */
+struct bundle_list;
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
 /*
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 2da22a39cb8..d44c6e10f9e 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -83,3 +83,22 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 	# Server advertised bundle-uri capability
 	grep "< bundle-uri" log
 '
+
+test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
+	test_when_finished "rm -rf log cloned" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		clone "$BUNDLE_URI_REPO_URI" cloned \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log &&
+
+	# Client issued bundle-uri command
+	grep "> command=bundle-uri" log
+'
diff --git a/transport-helper.c b/transport-helper.c
index e95267a4ab5..3ea7c2bb5ad 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
 	return ret;
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	get_helper(transport);
+
+	if (process_connect(transport, 0)) {
+		do_take_over(transport);
+		return transport->vtable->get_bundle_uri(transport);
+	}
+
+	return -1;
+}
+
 static struct transport_vtable vtable = {
 	.set_option	= set_helper_option,
 	.get_refs_list	= get_refs_list,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs,
 	.push_refs	= push_refs,
 	.connect	= connect_helper,
diff --git a/transport-internal.h b/transport-internal.h
index c4ca0b733ac..90ea749e5cf 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -26,6 +26,13 @@ struct transport_vtable {
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
 				     struct transport_ls_refs_options *transport_options);
 
+	/**
+	 * Populates the remote side's bundle-uri under protocol v2,
+	 * if the "bundle-uri" capability was advertised. Returns 0 if
+	 * OK, negative values on error.
+	 */
+	int (*get_bundle_uri)(struct transport *transport);
+
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
diff --git a/transport.c b/transport.c
index e7b97194c10..b6f279e92cb 100644
--- a/transport.c
+++ b/transport.c
@@ -22,6 +22,7 @@
 #include "protocol.h"
 #include "object-store.h"
 #include "color.h"
+#include "bundle-uri.h"
 
 static int transport_use_color = -1;
 static char transport_colors[][COLOR_MAXLEN] = {
@@ -359,6 +360,32 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	return handshake(transport, for_push, options, 1);
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	struct git_transport_data *data = transport->data;
+	struct packet_reader reader;
+	int stateless_rpc = transport->stateless_rpc;
+
+	if (!transport->bundles) {
+		CALLOC_ARRAY(transport->bundles, 1);
+		init_bundle_list(transport->bundles);
+	}
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	return get_remote_bundle_uri(data->fd[1], &reader,
+				     transport->bundles, stateless_rpc);
+}
+
 static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
@@ -902,6 +929,7 @@ static int disconnect_git(struct transport *transport)
 
 static struct transport_vtable taken_over_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.disconnect	= disconnect_git
@@ -1054,6 +1082,7 @@ static struct transport_vtable bundle_vtable = {
 
 static struct transport_vtable builtin_smart_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.connect	= connect_git,
@@ -1068,6 +1097,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	ret->progress = isatty(2);
 	string_list_init_dup(&ret->pack_lockfiles);
 
+	CALLOC_ARRAY(ret->bundles, 1);
+	init_bundle_list(ret->bundles);
+
 	if (!remote)
 		BUG("No remote provided to transport_get()");
 
@@ -1482,6 +1514,23 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
+int transport_get_remote_bundle_uri(struct transport *transport)
+{
+	const struct transport_vtable *vtable = transport->vtable;
+
+	/* Check config only once. */
+	if (transport->got_remote_bundle_uri)
+		return 0;
+	transport->got_remote_bundle_uri = 1;
+
+	if (!vtable->get_bundle_uri)
+		return error(_("bundle-uri operation not supported by protocol"));
+
+	if (vtable->get_bundle_uri(transport) < 0)
+		return error(_("could not retrieve server-advertised bundle-uri list"));
+	return 0;
+}
+
 void transport_unlock_pack(struct transport *transport, unsigned int flags)
 {
 	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
@@ -1512,6 +1561,8 @@ int transport_disconnect(struct transport *transport)
 		ret = transport->vtable->disconnect(transport);
 	if (transport->got_remote_refs)
 		free_refs((void *)transport->remote_refs);
+	clear_bundle_list(transport->bundles);
+	free(transport->bundles);
 	free(transport);
 	return ret;
 }
diff --git a/transport.h b/transport.h
index b5bf7b3e704..85150f504fb 100644
--- a/transport.h
+++ b/transport.h
@@ -62,6 +62,7 @@ enum transport_family {
 	TRANSPORT_FAMILY_IPV6
 };
 
+struct bundle_list;
 struct transport {
 	const struct transport_vtable *vtable;
 
@@ -76,6 +77,18 @@ struct transport {
 	 */
 	unsigned got_remote_refs : 1;
 
+	/**
+	 * Indicates whether we already called vtable->get_bundle_uri(); set by
+	 * transport.c::transport_get_remote_bundle_uri().
+	 */
+	unsigned got_remote_bundle_uri : 1;
+
+	/*
+	 * The results of "command=bundle-uri", if both sides support
+	 * the "bundle-uri" capability.
+	 */
+	struct bundle_list *bundles;
+
 	/*
 	 * Transports that call take-over destroys the data specific to
 	 * the transport type while doing so, and cannot be reused.
@@ -281,6 +294,12 @@ void transport_ls_refs_options_release(struct transport_ls_refs_options *opts);
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    struct transport_ls_refs_options *transport_options);
 
+/**
+ * Retrieve bundle URI(s) from a remote. Populates "struct
+ * transport"'s "bundles" and "got_remote_bundle_uri".
+ */
+int transport_get_remote_bundle_uri(struct transport *transport);
+
 /*
  * Fetch the hash algorithm used by a remote.
  *
-- 
gitgitgadget



* [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 17:50     ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 23:32       ` Victoria Dye
  2022-12-05 17:50     ` [PATCH v3 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
                       ` (8 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The yet-to-be-introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

To enable this setting by default in the relevant tests, add a
GIT_TEST_BUNDLE_URI environment variable.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
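A note for readers, not part of the commit: the gate described above can be
sketched in isolation as follows. This is a minimal sketch, not the code in
transport.c. parse_bool_sketch() and should_request_bundle_uri() are
hypothetical names, and the boolean parsing is a simplified stand-in for
git_env_bool() and git_config_get_bool(). A NULL argument means "unset".

```c
#include <string.h>
#include <strings.h>

/* Hypothetical, simplified boolean parser; Git's own rules are richer. */
static int parse_bool_sketch(const char *value, int fallback)
{
	if (!value)
		return fallback;
	if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on") || !strcmp(value, "1"))
		return 1;
	if (!strcasecmp(value, "false") || !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off") || !strcmp(value, "0"))
		return 0;
	return fallback;
}

/*
 * Same shape as the check in transport_get_remote_bundle_uri():
 * request the bundle list only if the test override or the config
 * setting is true. Both default to false when unset.
 */
static int should_request_bundle_uri(const char *env_value,
				     const char *config_value)
{
	if (parse_bool_sketch(env_value, 0))
		return 1;
	return parse_bool_sketch(config_value, 0);
}
```

With both inputs unset the function returns 0, which matches the documented
default of 'false'. Either GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true
alone is enough to enable the request.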
 Documentation/config/transfer.txt |  6 ++++++
 t/lib-bundle-uri-protocol.sh      | 19 ++++++++++++++++++-
 transport.c                       |  9 +++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index 264812cca4d..c3ac767d1e4 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -115,3 +115,9 @@ transfer.unpackLimit::
 transfer.advertiseSID::
 	Boolean. When true, client and server processes will advertise their
 	unique session IDs to their remote counterpart. Defaults to false.
+
+transfer.bundleURI::
+	When `true`, local `git clone` commands will request bundle
+	information from the remote server (if advertised) and download
+	bundles before continuing the clone through the Git protocol.
+	Defaults to `false`.
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index d44c6e10f9e..77bfd4f0119 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -85,9 +85,10 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 '
 
 test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
-	test_when_finished "rm -rf log cloned" &&
+	test_when_finished "rm -rf log cloned cloned2" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
+	GIT_TEST_BUNDLE_URI=0 \
 	git \
 		-c protocol.version=2 \
 		clone "$BUNDLE_URI_REPO_URI" cloned \
@@ -99,6 +100,22 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	# Server advertised bundle-uri capability
 	grep "< bundle-uri" log &&
 
+	# Client did not issue bundle-uri command
+	! grep "> command=bundle-uri" log &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c transfer.bundleURI=true \
+		-c protocol.version=2 \
+		clone "$BUNDLE_URI_REPO_URI" cloned2 \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log &&
+
 	# Client issued bundle-uri command
 	grep "> command=bundle-uri" log
 '
diff --git a/transport.c b/transport.c
index b6f279e92cb..9f9e38d66dd 100644
--- a/transport.c
+++ b/transport.c
@@ -1516,6 +1516,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 
 int transport_get_remote_bundle_uri(struct transport *transport)
 {
+	int value = 0;
 	const struct transport_vtable *vtable = transport->vtable;
 
 	/* Check config only once. */
@@ -1523,6 +1524,14 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 		return 0;
 	transport->got_remote_bundle_uri = 1;
 
+	/*
+	 * Don't request bundle-uri from the server unless configured to
+	 * do so by GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
+	 */
+	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
+	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
+		return 0;
+
 	if (!vtable->get_bundle_uri)
 		return error(_("bundle-uri operation not supported by protocol"));
 
-- 
gitgitgadget



* [PATCH v3 05/11] transport: rename got_remote_heads
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The 'got_remote_heads' member of 'struct git_transport_data' was used
historically to indicate that the initial server connection was made and
the ref advertisement was returned. With protocol v2, that initial
handshake does not necessarily include the ref advertisement, so the
member's name is no longer accurate. Thankfully, all uses of the member are
only checking to see if the handshake should take place, not whether or
not some local data has the ref advertisement.

Rename the member to 'finished_handshake' to represent the proper state.
Note that the variable is only set to 1 during the handshake() method.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 transport.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 9f9e38d66dd..a2281d95262 100644
--- a/transport.c
+++ b/transport.c
@@ -198,7 +198,7 @@ struct git_transport_data {
 	struct git_transport_options options;
 	struct child_process *conn;
 	int fd[2];
-	unsigned got_remote_heads : 1;
+	unsigned finished_handshake : 1;
 	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
@@ -345,7 +345,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
 	}
-	data->got_remote_heads = 1;
+	data->finished_handshake = 1;
 	transport->hash_algo = reader.hash_algo;
 
 	if (reader.line_peeked)
@@ -421,7 +421,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.negotiation_tips = data->options.negotiation_tips;
 	args.reject_shallow_remote = transport->smart_options->reject_shallow;
 
-	if (!data->got_remote_heads) {
+	if (!data->finished_handshake) {
 		int i;
 		int must_list_refs = 0;
 		for (i = 0; i < nr_heads; i++) {
@@ -461,7 +461,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 			  to_fetch, nr_heads, &data->shallow,
 			  &transport->pack_lockfiles, data->version);
 
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 	data->options.self_contained_and_connected =
 		args.self_contained_and_connected;
 	data->options.connectivity_checked = args.connectivity_checked;
@@ -846,7 +846,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	if (transport_color_config() < 0)
 		return -1;
 
-	if (!data->got_remote_heads)
+	if (!data->finished_handshake)
 		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
@@ -894,7 +894,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		ret = finish_connect(data->conn);
 	data->conn = NULL;
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 
 	return ret;
 }
@@ -914,7 +914,7 @@ static int disconnect_git(struct transport *transport)
 {
 	struct git_transport_data *data = transport->data;
 	if (data->conn) {
-		if (data->got_remote_heads && !transport->stateless_rpc)
+		if (data->finished_handshake && !transport->stateless_rpc)
 			packet_flush(data->fd[1]);
 		close(data->fd[0]);
 		if (data->fd[1] >= 0)
@@ -949,7 +949,7 @@ void transport_take_over(struct transport *transport,
 	data->conn = child;
 	data->fd[0] = data->conn->out;
 	data->fd[1] = data->conn->in;
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 	transport->data = data;
 
 	transport->vtable = &taken_over_vtable;
@@ -1150,7 +1150,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		ret->smart_options = &(data->options);
 
 		data->conn = NULL;
-		data->got_remote_heads = 0;
+		data->finished_handshake = 0;
 	} else {
 		/* Unknown protocol in URL. Pass to external handler. */
 		int len = external_specification_len(url);
-- 
gitgitgadget



* [PATCH v3 06/11] bundle-uri client: add helper for testing server
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-05 23:32       ` Victoria Dye
  2022-12-05 17:50     ` [PATCH v3 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
                       ` (6 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers. The check uses the
'finished_handshake' bit, which the previous patch renamed from
'got_remote_heads' to make it more clear what that bit represents.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
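A note for readers, not part of the commit: the "handshake on demand" pattern
described above can be sketched on its own. This is an illustrative sketch
only. struct conn_sketch, handshake_sketch() and get_bundle_uri_sketch() are
hypothetical names; the real get_bundle_uri() also checks the server
capability and issues the command.

```c
/* Hypothetical stand-in for the per-connection state in transport.c. */
struct conn_sketch {
	unsigned finished_handshake : 1;
	int handshake_count; /* only here so the sketch can be observed */
};

/* Stand-in for handshake(): records that the handshake took place. */
static void handshake_sketch(struct conn_sketch *c)
{
	c->handshake_count++;
	c->finished_handshake = 1;
}

/*
 * Callers such as "git clone" arrive with the handshake already done;
 * a standalone caller such as the test helper does not. Performing the
 * handshake lazily lets both kinds of caller use the same entry point.
 */
static int get_bundle_uri_sketch(struct conn_sketch *c)
{
	if (!c->finished_handshake)
		handshake_sketch(c);
	/* the real code would now check the capability and send the command */
	return 0;
}
```

Calling get_bundle_uri_sketch() twice on a zero-initialized struct performs
the handshake once, since the second call sees the bit already set.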
 t/helper/test-bundle-uri.c   | 46 ++++++++++++++++++++++++++++++++++++
 t/lib-bundle-uri-protocol.sh | 39 ++++++++++++++++++++++++++++++
 transport.c                  |  7 ++++++
 3 files changed, 92 insertions(+)

diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 25afd393428..f8159187014 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -3,6 +3,10 @@
 #include "bundle-uri.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "transport.h"
+#include "ref-filter.h"
+#include "remote.h"
+#include "refs.h"
 
 enum input_mode {
 	KEY_VALUE_PAIRS,
@@ -68,6 +72,46 @@ usage:
 	usage_with_options(usage, options);
 }
 
+static int cmd_ls_remote(int argc, const char **argv)
+{
+	const char *uploadpack = NULL;
+	struct string_list server_options = STRING_LIST_INIT_DUP;
+	const char *dest;
+	struct remote *remote;
+	struct transport *transport;
+	int status = 0;
+
+	dest = argc > 1 ? argv[1] : NULL;
+
+	remote = remote_get(dest);
+	if (!remote) {
+		if (dest)
+			die(_("bad repository '%s'"), dest);
+		die(_("no remote configured to get bundle URIs from"));
+	}
+	if (!remote->url_nr)
+		die(_("remote '%s' has no configured URL"), dest);
+
+	transport = transport_get(remote, NULL);
+	if (uploadpack)
+		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
+	if (server_options.nr)
+		transport->server_options = &server_options;
+
+	if (transport_get_remote_bundle_uri(transport) < 0) {
+		error(_("could not get the bundle-uri list"));
+		status = 1;
+		goto cleanup;
+	}
+
+	print_bundle_list(stdout, transport->bundles);
+
+cleanup:
+	if (transport_disconnect(transport))
+		return 1;
+	return status;
+}
+
 int cmd__bundle_uri(int argc, const char **argv)
 {
 	const char *usage[] = {
@@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
 	if (!strcmp(argv[1], "parse-config"))
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
+	if (!strcmp(argv[1], "ls-remote"))
+		return cmd_ls_remote(argc - 1, argv + 1);
 	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
 
 usage:
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 77bfd4f0119..88e339ae9ad 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -119,3 +119,42 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	# Client issued bundle-uri command
 	grep "> command=bundle-uri" log
 '
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 and extra data" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
+
+	# Extra data should be ignored
+	test_config -C "$BUNDLE_URI_PARENT" bundle.only.extra bogus &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
diff --git a/transport.c b/transport.c
index a2281d95262..97d395e10a3 100644
--- a/transport.c
+++ b/transport.c
@@ -371,6 +371,13 @@ static int get_bundle_uri(struct transport *transport)
 		init_bundle_list(transport->bundles);
 	}
 
+	if (!data->finished_handshake) {
+		struct ref *refs = handshake(transport, 0, NULL, 0);
+
+		if (refs)
+			free_refs(refs);
+	}
+
 	/*
 	 * "Support" protocol v0 and v2 without bundle-uri support by
 	 * silently degrading to a NOOP.
-- 
gitgitgadget



* [PATCH v3 07/11] bundle-uri: serve bundle.* keys from config
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-05 17:50     ` [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
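A note for readers, not part of the commit: the filtering and framing
described above can be sketched without Git's pkt-line helpers. This is a
minimal sketch under stated assumptions. bundle_key_to_pkt_line() is a
hypothetical name, and it writes into a caller-supplied buffer instead of a
file descriptor. The framing follows the pkt-line convention of a four-digit
hexadecimal length that counts the four length bytes themselves.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format "key=value" as a single pkt-line into 'out' if 'key' starts
 * with "bundle.". Returns the number of bytes in the packet, 0 if the
 * key was skipped, or -1 if the packet would not fit.
 */
static int bundle_key_to_pkt_line(char *out, size_t outlen,
				  const char *key, const char *value)
{
	size_t payload, total;

	if (strncmp(key, "bundle.", 7))
		return 0; /* not a bundle.* key: nothing is written */

	payload = strlen(key) + 1 + strlen(value); /* key, '=', value */
	total = payload + 4;                       /* plus length header */
	if (total > 65520 || total + 1 > outlen)
		return -1;

	snprintf(out, outlen, "%04zx%s=%s", total, key, value);
	return (int)total;
}
```

For the key "bundle.version" with value "1" the payload is 16 bytes, so the
packet is "0014bundle.version=1". A key such as "core.bare" is skipped, which
is the only filtering the patch performs.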
 bundle-uri.c                 | 16 +++++++++++++++-
 t/lib-bundle-uri-protocol.sh | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 32022595964..6919f541085 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -581,6 +581,16 @@ cached:
 	return advertise_bundle_uri;
 }
 
+static int config_to_packet_line(const char *key, const char *value, void *data)
+{
+	struct packet_writer *writer = data;
+
+	if (!strncmp(key, "bundle.", 7))
+		packet_write_fmt(writer->dest_fd, "%s=%s", key, value);
+
+	return 0;
+}
+
 int bundle_uri_command(struct repository *r,
 		       struct packet_reader *request)
 {
@@ -592,7 +602,11 @@ int bundle_uri_command(struct repository *r,
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("bundle-uri: expected flush after arguments"));
 
-	/* TODO: Implement the communication */
+	/*
+	 * Write all "bundle.*" config entries to the client as key=value
+	 * packet lines.
+	 */
+	git_config(config_to_packet_line, &writer);
 
 	packet_writer_flush(&writer);
 
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 88e339ae9ad..6d3f871fa0f 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -129,8 +129,11 @@ test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
 	EOF
 
+	GIT_TEST_BUNDLE_URI=1 \
 	test-tool bundle-uri \
 		ls-remote \
 		"$BUNDLE_URI_REPO_URI" \
@@ -150,8 +153,40 @@ test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
 	EOF
 
+	GIT_TEST_BUNDLE_URI=1 \
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 with list" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle1.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl" &&
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle2.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl" &&
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle3.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "bundle1"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl
+	[bundle "bundle2"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl
+	[bundle "bundle3"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl
+	EOF
+
+	GIT_TEST_BUNDLE_URI=1 \
 	test-tool bundle-uri \
 		ls-remote \
 		"$BUNDLE_URI_REPO_URI" \
-- 
gitgitgadget



* [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-06 10:06       ` Ævar Arnfjörð Bjarmason
  2022-12-05 17:50     ` [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
                       ` (4 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.

Re-add the method, this time in strbuf.c, but with a new name:
strbuf_strip_file_from_path(). The method requirements are slightly
modified to allow a trailing slash, in which case nothing is done, which
makes the name change valuable.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
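A note for readers, not part of the commit: the behavior described above can
be tried on a plain C string. This is a sketch, not the strbuf
implementation. strip_file_from_path_sketch() is a hypothetical name, and it
only treats '/' as a separator, whereas find_last_dir_sep() in Git also
accounts for platform-specific separators.

```c
#include <string.h>

/*
 * Truncate 'path' in place just after its last '/'. A path that ends
 * in '/' is left unchanged; a path without any '/' becomes empty.
 */
static void strip_file_from_path_sketch(char *path)
{
	char *sep = strrchr(path, '/');

	if (sep)
		sep[1] = '\0';
	else
		path[0] = '\0';
}
```

Applied to "/path/to/file" this leaves "/path/to/", and "/path/to/dir/" is
unchanged, matching the two examples in the strbuf.h comment. A bare "file"
becomes the empty string, which corresponds to the strbuf_setlen(sb, 0)
branch.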
 strbuf.c |  6 ++++++
 strbuf.h | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 0890b1405c5..c383f41a3c5 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1200,3 +1200,9 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 	free(path2);
 	return res;
 }
+
+void strbuf_strip_file_from_path(struct strbuf *sb)
+{
+	char *path_sep = find_last_dir_sep(sb->buf);
+	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
+}
diff --git a/strbuf.h b/strbuf.h
index 76965a17d44..f6dbb9681ee 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -664,6 +664,17 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
 int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			      const char *const *env);
 
+/*
+ * Remove the filename from the provided path string. If the path
+ * contains a trailing separator, then the path is considered a directory
+ * and nothing is modified.
+ *
+ * Examples:
+ * - "/path/to/file" -> "/path/to/"
+ * - "/path/to/dir/" -> "/path/to/dir/"
+ */
+void strbuf_strip_file_from_path(struct strbuf *sb);
+
 void strbuf_add_lines(struct strbuf *sb,
 		      const char *prefix,
 		      const char *buf,
-- 
gitgitgadget



* [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (7 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-05 23:33       ` Victoria Dye
  2022-12-05 17:50     ` [PATCH v3 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
                       ` (3 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Bundle providers may want to distribute bundle data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
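A note for readers, not part of the commit: the base-URI rule described above
can be sketched as follows. This is a deliberately simplified sketch, not
relative_url(). resolve_bundle_uri_sketch() is a hypothetical name, the
example.com URLs used to exercise it are made up, and the sketch does not
handle "./" or "../" components, which the real helper does.

```c
#include <stdio.h>
#include <string.h>

/*
 * Resolve a bundle 'value' against the URI the list came from.
 * A value containing "://" is treated as absolute and copied as-is.
 * Otherwise the file name is stripped from 'list_uri' and the value
 * is appended. Returns 0 on success, -1 if 'out' is too small.
 */
static int resolve_bundle_uri_sketch(char *out, size_t outlen,
				     const char *list_uri, const char *value)
{
	const char *sep;
	size_t base_len;
	int n;

	if (strstr(value, "://")) {
		n = snprintf(out, outlen, "%s", value);
		return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
	}

	/* keep everything up to and including the last '/' */
	sep = strrchr(list_uri, '/');
	base_len = sep ? (size_t)(sep - list_uri) + 1 : 0;

	n = snprintf(out, outlen, "%.*s%s", (int)base_len, list_uri, value);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under these assumptions, a list fetched from
"https://example.com/bundles/list" that names "bundle-1.bdl" resolves to
"https://example.com/bundles/bundle-1.bdl", so moving the whole directory to
another host does not require editing the list.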
 bundle-uri.c                | 16 +++++++-
 bundle-uri.h                | 14 +++++++
 t/helper/test-bundle-uri.c  |  2 +
 t/t5750-bundle-uri-parse.sh | 82 +++++++++++++++++++++++++++++++++++++
 transport.c                 |  3 ++
 5 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 6919f541085..80370992773 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -7,6 +7,7 @@
 #include "hashmap.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "remote.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -49,6 +50,7 @@ void clear_bundle_list(struct bundle_list *list)
 
 	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
 	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+	free(list->baseURI);
 }
 
 int for_all_bundles_in_list(struct bundle_list *list,
@@ -163,7 +165,7 @@ static int bundle_list_update(const char *key, const char *value,
 	if (!strcmp(subkey, "uri")) {
 		if (bundle->uri)
 			return -1;
-		bundle->uri = xstrdup(value);
+		bundle->uri = relative_url(list->baseURI, value, NULL);
 		return 0;
 	}
 
@@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
+	if (!list->baseURI) {
+		struct strbuf baseURI = STRBUF_INIT;
+		strbuf_addstr(&baseURI, uri);
+
+		/*
+		 * If the URI does not end with a trailing slash, then
+		 * remove the filename portion of the path. This is
+		 * important for relative URIs.
+		 */
+		strbuf_strip_file_from_path(&baseURI);
+		list->baseURI = strbuf_detach(&baseURI, NULL);
+	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
diff --git a/bundle-uri.h b/bundle-uri.h
index 357111ecce8..e7e90a5f088 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -61,6 +61,20 @@ struct bundle_list {
 	int version;
 	enum bundle_list_mode mode;
 	struct hashmap bundles;
+
+	/**
+	 * The baseURI of a bundle_list is the URI that provided the list.
+	 *
+	 * In the case of the 'bundle-uri' protocol v2 command, the base
+	 * URI is the URI of the Git remote.
+	 *
+	 * Otherwise, the bundle list was downloaded over HTTP from some
+	 * known URI.
+	 *
+	 * The baseURI is used as the base for any relative URIs
+	 * advertised by the bundle list at that location.
+	 */
+	char *baseURI;
 };
 
 void init_bundle_list(struct bundle_list *list);
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index f8159187014..5df5bc3b89e 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 	init_bundle_list(&list);
 
+	list.baseURI = xstrdup("<uri>");
+
 	switch (mode) {
 	case KEY_VALUE_PAIRS:
 		if (argc != 1)
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index c2fe3f9c5a5..7b4f930e532 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -30,6 +30,58 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'bundle_uri_parse_line(): relative URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'bundle_uri_parse_line(): relative URIs and parent paths' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=../../bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/../bundle.bdl
+	EOF
+
+	# TODO: We would prefer if parsing a bundle list would not cause
+	# a die() and instead would give a warning and allow the rest of
+	# a Git command to continue. This test_must_fail is necessary for
+	# now until the interface for relative_url() allows for reporting
+	# an error instead of die()ing.
+	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
+	grep "fatal: cannot strip one component off url" err
+'
+
 test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
 	cat >in <<-\EOF &&
 	=bogus-value
@@ -136,6 +188,36 @@ test_expect_success 'parse config format: just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: relative URIs' '
+	cat >in <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = bundle.bdl
+	[bundle "two"]
+		uri = ../bundle.bdl
+	[bundle "three"]
+		uri = sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'parse config format edge cases: empty key or value' '
 	cat >in1 <<-\EOF &&
 	= bogus-value
diff --git a/transport.c b/transport.c
index 97d395e10a3..957dca4923c 100644
--- a/transport.c
+++ b/transport.c
@@ -1539,6 +1539,9 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
 		return 0;
 
+	if (!transport->bundles->baseURI)
+		transport->bundles->baseURI = xstrdup(transport->url);
+
 	if (!vtable->get_bundle_uri)
 		return error(_("bundle-uri operation not supported by protocol"));
 
-- 
gitgitgadget



* [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (8 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-07 12:57       ` Jeff King
  2022-12-05 17:50     ` [PATCH v3 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  12 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 command. To actually
download and unbundle the advertised bundles, we need a different
mechanism.

Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.list.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.

As in fetch_bundle_uri(), the bundles are unbundled only after all of
the bundle lists have been expanded and all necessary URIs have been
downloaded.
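
As a rough illustration of the mode semantics described above (this is
not the code in this patch), the difference between "any" and "all" can
be sketched in standalone C. The names download_by_mode() and
demo_download() are hypothetical, and stopping at the first failure in
"all" mode is an assumption made for the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum list_mode { MODE_ALL, MODE_ANY };

/* Hypothetical per-URI download callback: returns 0 on success. */
typedef int (*download_fn)(const char *uri, void *data);

/*
 * Walk 'uris' according to 'mode'.
 *  - MODE_ANY: stop after the first successful download.
 *  - MODE_ALL: every URI must download; give up on the first failure.
 * Returns the number of successful downloads, or -1 if the mode's
 * requirement was not met.
 */
int download_by_mode(enum list_mode mode, const char **uris, size_t nr,
		     download_fn fn, void *data)
{
	size_t i;
	int ok = 0;

	assert(fn);
	assert(uris || !nr);

	for (i = 0; i < nr; i++) {
		if (fn(uris[i], data)) {
			if (mode == MODE_ALL)
				return -1;
			continue; /* "any": try the next URI */
		}
		ok++;
		if (mode == MODE_ANY)
			return ok;
	}
	return (mode == MODE_ANY && !ok) ? -1 : ok;
}

/*
 * Demo callback for illustration only: counts calls through 'data'
 * and "fails" for any URI that starts with "bad".
 */
int demo_download(const char *uri, void *data)
{
	int *calls = data;

	(*calls)++;
	return strncmp(uri, "bad", 3) ? 0 : -1;
}
```

With the list { "bad-1", "good-1", "good-2" }, the sketch makes two
attempts in "any" mode and stops at the first success, while "all" mode
fails on the first URI.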

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c | 21 +++++++++++++++++++++
 bundle-uri.h | 11 +++++++++++
 2 files changed, 32 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index 80370992773..c411b871bdd 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -577,6 +577,27 @@ cleanup:
 	return result;
 }
 
+int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
+{
+	int result;
+	struct bundle_list global_list;
+
+	init_bundle_list(&global_list);
+
+	/* If a bundle is added to this global list, then it is required. */
+	global_list.mode = BUNDLE_MODE_ALL;
+
+	if ((result = download_bundle_list(r, list, &global_list, 0)))
+		goto cleanup;
+
+	result = unbundle_all_bundles(r, &global_list);
+
+cleanup:
+	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
+	clear_bundle_list(&global_list);
+	return result;
+}
+
 /**
  * API for serve.c.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index e7e90a5f088..b2c9c160a52 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -107,6 +107,17 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * Given a bundle list that was already advertised (likely by the
+ * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
+ * bundles according to the bundle strategy of that list.
+ *
+ * Returns non-zero if no bundle information is found at the given 'uri'.
+ */
+int fetch_bundle_list(struct repository *r,
+		      const char *uri,
+		      struct bundle_list *list);
+
 /**
  * API for serve.c.
  */
-- 
gitgitgadget



* [PATCH v3 11/11] clone: unbundle the advertised bundles
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (9 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-12-05 17:50     ` Derrick Stolee via GitGitGadget
  2022-12-05 23:42     ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Victoria Dye
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
  12 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-05 17:50 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c              | 26 +++++++++++++---
 t/lib-bundle-uri-protocol.sh | 21 +++++++++++--
 t/t5601-clone.sh             | 59 ++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 7 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 39364c25b15..af8b2a4df66 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1266,11 +1266,27 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
-	/*
-	 * Populate transport->got_remote_bundle_uri and
-	 * transport->bundle_uri. We might get nothing.
-	 */
-	transport_get_remote_bundle_uri(transport);
+	if (!bundle_uri) {
+		/*
+		* Populate transport->got_remote_bundle_uri and
+		* transport->bundle_uri. We might get nothing.
+		*/
+		transport_get_remote_bundle_uri(transport);
+
+		if (transport->bundles &&
+		    hashmap_get_size(&transport->bundles->bundles)) {
+			/* At this point, we need the_repository to match the cloned repo. */
+			if (repo_init(the_repository, git_dir, work_tree))
+				warning(_("failed to initialize the repo, skipping bundle URI"));
+			else if (fetch_bundle_list(the_repository,
+						   remote->url[0],
+						   transport->bundles))
+				warning(_("failed to fetch advertised bundles"));
+		} else {
+			clear_bundle_list(transport->bundles);
+			FREE_AND_NULL(transport->bundles);
+		}
+	}
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 6d3f871fa0f..73e2d45bc8b 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -85,7 +85,7 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 '
 
 test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
-	test_when_finished "rm -rf log cloned cloned2" &&
+	test_when_finished "rm -rf log* cloned*" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
 	GIT_TEST_BUNDLE_URI=0 \
@@ -117,7 +117,24 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	grep "< bundle-uri" log &&
 
 	# Client issued bundle-uri command
-	grep "> command=bundle-uri" log
+	grep "> command=bundle-uri" log &&
+
+	GIT_TRACE_PACKET="$PWD/log3" \
+	git \
+		-c transfer.bundleURI=true \
+		-c protocol.version=2 \
+		clone --bundle-uri="$BUNDLE_URI_BUNDLE_URI" \
+		"$BUNDLE_URI_REPO_URI" cloned3 \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log3 &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log3 &&
+
+	# Client did not issue bundle-uri command (--bundle-uri override)
+	! grep "> command=bundle-uri" log3
 '
 
 test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 45f0803ed4d..d1d8139751e 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -795,6 +795,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
 	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
 '
 
+test_expect_success 'auto-discover bundle URI from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo2.git repo2 &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
+test_expect_success 'auto-discover multiple bundles from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	test_commit -C src new &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.mode all &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.new.uri "$HTTPD_URL/new.bundle" &&
+
+	GIT_TEST_BUNDLE_URI=1 \
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo3.git repo3 &&
+
+	# We should fetch _both_ bundles
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
gitgitgadget


* Re: [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton
  2022-12-05 17:50     ` [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 23:31       ` Victoria Dye
  0 siblings, 0 replies; 87+ messages in thread
From: Victoria Dye @ 2022-12-05 23:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> index 59bf41cefb9..10bd2d40cec 100644
> --- a/Documentation/gitprotocol-v2.txt
> +++ b/Documentation/gitprotocol-v2.txt
> @@ -578,6 +578,207 @@ and associated requested information, each separated by a single space.
>  
>  	obj-info = obj-id SP obj-size
>  
> +bundle-uri
> +~~~~~~~~~~

Apologies for not following up on this patch when you updated it for v2.
This version is much clearer in describing the bundle URI command protocol,
especially how the 'bundle.*' config is used:

> +When the client issues a `command=bundle-uri` request, the response is a
> +list of key-value pairs provided as packet lines with value
> +`<key>=<value>`. Each `<key>` should be interpreted as a config key from
> +the `bundle.*` namespace to construct a list of bundles. These keys are
> +grouped by a `bundle.<id>.` subsection, where each key corresponding to a
> +given `<id>` contributes attributes to the bundle defined by that `<id>`.
> +See linkgit:git-config[1] for the specific details of these keys and how
> +the Git client will interpret their values.
> +
> +Clients MUST parse the line according to the above format, lines that do
> +not conform to the format SHOULD be discarded. The user MAY be warned in
> +such a case.
> +

and the response types/formats:

> +URI CONTENTS::
> +The content at the advertised URIs MUST be one of two types.
> ++
> +The advertised URI may contain a bundle file that `git bundle verify`
> +would accept. I.e. they MUST contain one or more reference tips for
> +use by the client, MUST indicate prerequisites (if any) with standard
> +"-" prefixes, and MUST indicate their "object-format", if
> +applicable.
> ++
> +The advertised URI may alternatively contain a plaintext file that `git
> +config --list` would accept (with the `--file` option). The key-value
> +pairs in this list are in the `bundle.*` namespace (see
> +linkgit:git-config[1]).
> +

Regarding your point about examples in [1]: after reading the remainder of
the series, I agree that the test cases in the later patches do a good job
of documenting the behavior.
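
For readers without the test suite at hand, a minimal standalone sketch
of splitting one `<key>=<value>` packet line into its parts might look
like the following. It is not the parser from bundle-uri.c: the struct
bundle_kv type and parse_bundle_line() are hypothetical names, the
buffer sizes are arbitrary, and rejecting lines with an empty key or
value is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for one parsed line; sizes are arbitrary. */
struct bundle_kv {
	char id[64];      /* "<id>" of bundle.<id>.<subkey>, or "" */
	char subkey[64];  /* e.g. "uri", "version", "mode" */
	char value[256];
};

/*
 * Parse "bundle.<id>.<subkey>=<value>" or "bundle.<subkey>=<value>".
 * Returns 0 on success, -1 if the line does not conform.
 */
int parse_bundle_line(const char *line, struct bundle_kv *out)
{
	const char *eq, *rest, *p, *dot = NULL;
	size_t id_len, sub_len, val_len;

	assert(line && out);

	eq = strchr(line, '=');
	if (!eq || eq == line || !eq[1])
		return -1; /* not key=value, or empty key/value */
	if (strncmp(line, "bundle.", 7))
		return -1; /* outside the bundle.* namespace */

	rest = line + 7;
	/* The last '.' before '=' separates <id> from <subkey>. */
	for (p = rest; p < eq; p++)
		if (*p == '.')
			dot = p;

	if (dot) {
		id_len = (size_t)(dot - rest);
		sub_len = (size_t)(eq - dot - 1);
	} else {
		id_len = 0;
		sub_len = (size_t)(eq - rest);
	}
	val_len = strlen(eq + 1);

	if (!sub_len || (dot && !id_len))
		return -1;
	if (id_len >= sizeof(out->id) ||
	    sub_len >= sizeof(out->subkey) ||
	    val_len >= sizeof(out->value))
		return -1;

	memcpy(out->id, rest, id_len);
	out->id[id_len] = '\0';
	memcpy(out->subkey, dot ? dot + 1 : rest, sub_len);
	out->subkey[sub_len] = '\0';
	memcpy(out->value, eq + 1, val_len + 1);
	return 0;
}
```

Given "bundle.one.uri=bundle.bdl", the sketch fills in id "one", subkey
"uri", and value "bundle.bdl". A list-level key such as
"bundle.version=1" is returned with an empty id.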

Thanks!

[1] https://lore.kernel.org/git/ca11478b-7b44-3018-04d8-0b84c4f43b56@github.com/



* Re: [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting
  2022-12-05 17:50     ` [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 23:32       ` Victoria Dye
  2022-12-07 15:20         ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-12-05 23:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>  <avarab@gmail.com>
> 
> The yet-to-be introduced client support for bundle-uri will always
> fall back on a full clone, but we'd still like to be able to ignore a
> server's bundle-uri advertisement entirely.
> 
> The new transfer.bundleURI config option defaults to 'false', but a user
> can set it to 'true' to enable checking for bundle URIs from the origin
> Git server using protocol v2.
> 
> To enable this setting by default in the correct tests, add a
> GIT_TEST_BUNDLE_URI environment variable.

It wasn't immediately clear to me from reading this patch, but it looks like
'GIT_TEST_BUNDLE_URI' is mainly used to allow 'test-tool bundle-uri
ls-remote' to issue the bundle URI command (since it can't use a '-c
transfer.bundleURI=true' command line option) in patch 7 [1].

If that's the only use for 'GIT_TEST_BUNDLE_URI', could you avoid the
environment variable altogether by setting 'transfer.bundleURI=true' with
'test_config' before the 'test-tool' call (and 'test_unconfig' after, if
needed)? Alternatively, if you do want to be able to test the bundle URI
protocol wholesale across all tests (e.g., in the 'linux-TEST-vars' CI job),
then I think the environment variable makes sense.

[1] https://lore.kernel.org/git/acc5a8f57f903342c47802115f8e3de9e9d588dc.1670262639.git.gitgitgadget@gmail.com/

> diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
> index d44c6e10f9e..77bfd4f0119 100644
> --- a/t/lib-bundle-uri-protocol.sh
> +++ b/t/lib-bundle-uri-protocol.sh
> @@ -85,9 +85,10 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
>  '
>  
>  test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
> -	test_when_finished "rm -rf log cloned" &&
> +	test_when_finished "rm -rf log cloned cloned2" &&
>  
>  	GIT_TRACE_PACKET="$PWD/log" \
> +	GIT_TEST_BUNDLE_URI=0 \
>  	git \
>  		-c protocol.version=2 \
>  		clone "$BUNDLE_URI_REPO_URI" cloned \
> @@ -99,6 +100,22 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
>  	# Server advertised bundle-uri capability
>  	grep "< bundle-uri" log &&
>  
> +	# Client did not issue bundle-uri command
> +	! grep "> command=bundle-uri" log &&
> +
> +	GIT_TRACE_PACKET="$PWD/log" \
> +	git \
> +		-c transfer.bundleURI=true \
> +		-c protocol.version=2 \
> +		clone "$BUNDLE_URI_REPO_URI" cloned2 \
> +		>actual 2>err &&

If 'GIT_TEST_BUNDLE_URI' is set to '1' in a more global scope (by a CI job
or user running the tests), then the '-c transfer.bundleURI' config isn't
actually what's enabling the behavior. To make this more directly comparable
to the case earlier in this test, could you add 'GIT_TEST_BUNDLE_URI=0' here
as well?
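
To make the precedence concrete, here is a small standalone sketch. It
assumes the client-side check amounts to "the test variable is true, or
the config value is true"; parse_bool() and bundle_uri_enabled() are
hypothetical names, not Git functions, and the accepted boolean
spellings are simplified.

```c
#include <assert.h>
#include <string.h>

/*
 * Interpret a few common boolean spellings; anything else (including
 * NULL, i.e. an unset variable) falls back to 'def'.
 */
int parse_bool(const char *s, int def)
{
	if (!s)
		return def;
	if (!strcmp(s, "1") || !strcmp(s, "true"))
		return 1;
	if (!strcmp(s, "0") || !strcmp(s, "false"))
		return 0;
	return def;
}

/*
 * 'env_value' is the raw GIT_TEST_BUNDLE_URI string (or NULL).
 * 'config_value' models transfer.bundleURI: -1 unset, 0 false, 1 true.
 * Returns 1 if the client would request the bundle list.
 */
int bundle_uri_enabled(const char *env_value, int config_value)
{
	assert(config_value >= -1 && config_value <= 1);

	if (parse_bool(env_value, 0))
		return 1; /* the test variable alone is enough */
	return config_value == 1;
}
```

Under that assumption, bundle_uri_enabled("1", 0) is true regardless of
the config, which is why the second clone needs 'GIT_TEST_BUNDLE_URI=0'
to show that '-c transfer.bundleURI=true' is what enables the behavior.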



* Re: [PATCH v3 06/11] bundle-uri client: add helper for testing server
  2022-12-05 17:50     ` [PATCH v3 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-05 23:32       ` Victoria Dye
  0 siblings, 0 replies; 87+ messages in thread
From: Victoria Dye @ 2022-12-05 23:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
>  <avarab@gmail.com>
> 
> Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
> for issuing protocol v2 "bundle-uri" commands to a server, and to the
> parsing routines in bundle-uri.c.
> 
> In the "git clone" case we'll have already done the handshake(),
> but not here. Add an extra case to check for this handshake in
> get_bundle_uri() for ease of use for future callers. Rename the existing
> 'got_remote_heads' to 'finished_handshake' to make it more clear what
> that bit represents.

nit: I think this last sentence ("Rename...") is out-of-place, since you
made that change in patch 5 [1].

[1] https://lore.kernel.org/git/b009b4f58ea312e40af385ea5ca7ede5ea07391a.1670262639.git.gitgitgadget@gmail.com/

> diff --git a/transport.c b/transport.c
> index a2281d95262..97d395e10a3 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -371,6 +371,13 @@ static int get_bundle_uri(struct transport *transport)
>  		init_bundle_list(transport->bundles);
>  	}
>  
> +	if (!data->finished_handshake) {
> +		struct ref *refs = handshake(transport, 0, NULL, 0);
> +
> +		if (refs)
> +			free_refs(refs);
> +	}

This makes more sense without the extra assertions. Thanks!



* Re: [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists
  2022-12-05 17:50     ` [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-12-05 23:33       ` Victoria Dye
  2022-12-07 15:22         ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Victoria Dye @ 2022-12-05 23:33 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> Allow a bundle list to specify a relative URI for the bundles. This URI
> is based on where the client received the bundle list. For a list
> provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
> the base URI. Otherwise, the bundle list was provided from an HTTP URI
> not using the Git protocol, and that URI is the base URI. This allows
> easier distribution of bundle data.

Thanks, this clears up my confusion about the source of 'baseURI'.

> +	/**
> +	 * The baseURI of a bundle_list is the URI that provided the list.
> +	 *
> +	 * In the case of the 'bundle-uri' protocol v2 command, the base
> +	 * URI is the URI of the Git remote.
> +	 *
> +	 * Otherewise, the bundle list was downloaded over HTTP from some
> +	 * known URI.

s/Otherewise/Otherwise

Also, this sentence is a bit more vague than what was noted in the commit
message; it doesn't actually say what the base URI is set to in this
scenario. Feel free to ignore if you think it's overkill, but that could
probably be cleared up by adding another sentence after like "The base URI
is set to that known URI."

> +	 *
> +	 * The baseURI is used as the base for any relative URIs
> +	 * advertised by the bundle list at that location.
> +	 */
> +	char *baseURI;

...

> +	# TODO: We would prefer if parsing a bundle list would not cause
> +	# a die() and instead would give a warning and allow the rest of
> +	# a Git command to continue. This test_must_fail is necessary for
> +	# now until the interface for relative_url() allows for reporting
> +	# an error instead of die()ing.
> +	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
> +	grep "fatal: cannot strip one component off url" err

Thanks for adding this, I'm content to leave this as a TODO for now.



* Re: [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (10 preceding siblings ...)
  2022-12-05 17:50     ` [PATCH v3 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
@ 2022-12-05 23:42     ` Victoria Dye
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
  12 siblings, 0 replies; 87+ messages in thread
From: Victoria Dye @ 2022-12-05 23:42 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

Derrick Stolee via GitGitGadget wrote:
> Updates in v3
> =============
> 

Thanks for addressing all of my questions/comments from the past version(s)!
Overall, the series is easy to follow and cleanly integrates the new
functionality. In this version, I just had a few minor phrasing/typo nits &
some questions about 'GIT_TEST_BUNDLE_URI', but otherwise I'm happy with it. 



* Re: [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-05 17:50     ` [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
@ 2022-12-06 10:06       ` Ævar Arnfjörð Bjarmason
  2022-12-06 11:37         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-06 10:06 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee


On Mon, Dec 05 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The strbuf_parent_directory() method was added as a static method in
> contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
> config and starts maintenance, 2021-12-03) and then removed in
> 65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
> there is a need for a similar method in the bundle URI feature.
>
> Re-add the method, this time in strbuf.c, but with a new name:
> strbuf_strip_file_from_path(). The method requirements are slightly
> modified to allow a trailing slash, in which case nothing is done, which
> makes the name change valuable.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  strbuf.c |  6 ++++++
>  strbuf.h | 11 +++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index 0890b1405c5..c383f41a3c5 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1200,3 +1200,9 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>  	free(path2);
>  	return res;
>  }
> +
> +void strbuf_strip_file_from_path(struct strbuf *sb)
> +{
> +	char *path_sep = find_last_dir_sep(sb->buf);
> +	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
> +}
> diff --git a/strbuf.h b/strbuf.h
> index 76965a17d44..f6dbb9681ee 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -664,6 +664,17 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
>  int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>  			      const char *const *env);
>  
> +/*
> + * Remove the filename from the provided path string. If the path
> + * contains a trailing separator, then the path is considered a directory
> + * and nothing is modified.
> + *
> + * Examples:
> + * - "/path/to/file" -> "/path/to/"
> + * - "/path/to/dir/" -> "/path/to/dir/"
> + */
> +void strbuf_strip_file_from_path(struct strbuf *sb);
> +
>  void strbuf_add_lines(struct strbuf *sb,
>  		      const char *prefix,
>  		      const char *buf,

Re your reply in
https://lore.kernel.org/git/0980dcd4-30eb-4ef4-9369-279abe5ca5a2@github.com/
I still don't get how this is different from a 1-byte change to
strbuf_trim_trailing_dir_sep(), and if it isn't I think it's confusing
API design to have two very different ways to return the same data.

There you said "The difference is all about whether or not we start with
a slash _and_ no other slash appears in the path.".

But I can't find a case where there's any difference between the two. I
tried this ad-hoc test on top:
	
	diff --git a/help.c b/help.c
	index f1e090a4428..b0866b01439 100644
	--- a/help.c
	+++ b/help.c
	@@ -765,6 +765,16 @@ int cmd_version(int argc, const char **argv, const char *prefix)
	 			 "also print build options"),
	 		OPT_END()
	 	};
	+	struct strbuf sb1 = STRBUF_INIT;
	+	struct strbuf sb2 = STRBUF_INIT;
	+
	+	if (getenv("STR")) {
	+		strbuf_addstr(&sb1, getenv("STR"));
	+		strbuf_addstr(&sb2, getenv("STR"));
	+		strbuf_strip_file_from_path(&sb1);
	+		strbuf_trim_trailing_not_dir_sep(&sb2);
	+		fprintf(stderr, "%s: %s | %s\n", strcmp(sb1.buf, sb2.buf) ? "NEQ" : "EQ", sb1.buf, sb2.buf);
	+	}
	 
	 	argc = parse_options(argc, argv, prefix, options, usage, 0);
	 
	diff --git a/strbuf.c b/strbuf.c
	index c383f41a3c5..f75d94556fc 100644
	--- a/strbuf.c
	+++ b/strbuf.c
	@@ -114,13 +114,23 @@ void strbuf_rtrim(struct strbuf *sb)
	 	sb->buf[sb->len] = '\0';
	 }
	 
	-void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
	+static void strbuf_trim_trailing_dir_sep_1(struct strbuf *sb, int flip)
	 {
	-	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]))
	+	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]) - flip)
	 		sb->len--;
	 	sb->buf[sb->len] = '\0';
	 }
	 
	+void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
	+{
	+	strbuf_trim_trailing_dir_sep_1(sb, 0);
	+}
	+
	+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb)
	+{
	+	strbuf_trim_trailing_dir_sep_1(sb, 1);
	+}
	+
	 void strbuf_trim_trailing_newline(struct strbuf *sb)
	 {
	 	if (sb->len > 0 && sb->buf[sb->len - 1] == '\n') {
	diff --git a/strbuf.h b/strbuf.h
	index f6dbb9681ee..b936f45ffad 100644
	--- a/strbuf.h
	+++ b/strbuf.h
	@@ -189,6 +189,8 @@ void strbuf_ltrim(struct strbuf *sb);
	 
	 /* Strip trailing directory separators */
	 void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
	+/* Strip trailing not-directory separators */
	+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb);
	 
	 /* Strip trailing LF or CR/LF */
	 void strbuf_trim_trailing_newline(struct strbuf *sb);

Then:
	
	$ for str in a / b/ /c /d/ /e/ /f/g /h/i/ j/k l//m n/o/p //q/r/s/t; do STR=$str ./git version; done 2>&1 | grep :
	EQ:  | 
	EQ: / | /
	EQ: b/ | b/
	EQ: / | /
	EQ: /d/ | /d/
	EQ: /e/ | /e/
	EQ: /f/ | /f/
	EQ: /h/i/ | /h/i/
	EQ: j/ | j/
	EQ: l// | l//
	EQ: n/o/ | n/o/
	EQ: //q/r/s/ | //q/r/s/

I.e. for those inputs it's the same as the existing
strbuf_trim_trailing_dir_sep() with an inverted test. Is there some edge
case that I'm missing?
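
For anyone who wants to reproduce the comparison outside the Git tree,
a minimal standalone sketch might look like the following. It works on
plain C strings rather than struct strbuf and assumes '/' is the only
directory separator. The helper names strip_file_from_path(),
trim_trailing_not_sep(), and same_result() are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplification: only '/' counts as a directory separator here. */
static int is_sep(char c)
{
	return c == '/';
}

/*
 * Cut the string just after its last separator, or truncate it to ""
 * when there is no separator at all.
 */
void strip_file_from_path(char *path)
{
	char *last;

	assert(path);
	last = strrchr(path, '/');
	if (last)
		last[1] = '\0';
	else
		path[0] = '\0';
}

/* Drop trailing bytes for as long as they are not separators. */
void trim_trailing_not_sep(char *path)
{
	size_t len;

	assert(path);
	len = strlen(path);
	while (len > 0 && !is_sep(path[len - 1]))
		len--;
	path[len] = '\0';
}

/* Returns 1 if both helpers turn 'input' into the same string. */
int same_result(const char *input)
{
	char a[256], b[256];

	assert(input && strlen(input) < sizeof(a));
	strcpy(a, input);
	strcpy(b, input);
	strip_file_from_path(a);
	trim_trailing_not_sep(b);
	return !strcmp(a, b);
}
```

Calling same_result() on each of the inputs above is one way to check
for a diverging case without patching help.c.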


* Re: [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-06 10:06       ` Ævar Arnfjörð Bjarmason
@ 2022-12-06 11:37         ` Ævar Arnfjörð Bjarmason
  2022-12-07 14:44           ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-06 11:37 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee


On Tue, Dec 06 2022, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Dec 05 2022, Derrick Stolee via GitGitGadget wrote:
>
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The strbuf_parent_directory() method was added as a static method in
>> contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
>> config and starts maintenance, 2021-12-03) and then removed in
>> 65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
>> there is a need for a similar method in the bundle URI feature.
>>
>> Re-add the method, this time in strbuf.c, but with a new name:
>> strbuf_strip_file_from_path(). The method requirements are slightly
>> modified to allow a trailing slash, in which case nothing is done, which
>> makes the name change valuable.
>>
>> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>> ---
>>  strbuf.c |  6 ++++++
>>  strbuf.h | 11 +++++++++++
>>  2 files changed, 17 insertions(+)
>>
>> diff --git a/strbuf.c b/strbuf.c
>> index 0890b1405c5..c383f41a3c5 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -1200,3 +1200,9 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>>  	free(path2);
>>  	return res;
>>  }
>> +
>> +void strbuf_strip_file_from_path(struct strbuf *sb)
>> +{
>> +	char *path_sep = find_last_dir_sep(sb->buf);
>> +	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
>> +}
>> diff --git a/strbuf.h b/strbuf.h
>> index 76965a17d44..f6dbb9681ee 100644
>> --- a/strbuf.h
>> +++ b/strbuf.h
>> @@ -664,6 +664,17 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
>>  int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
>>  			      const char *const *env);
>>  
>> +/*
>> + * Remove the filename from the provided path string. If the path
>> + * contains a trailing separator, then the path is considered a directory
>> + * and nothing is modified.
>> + *
>> + * Examples:
>> + * - "/path/to/file" -> "/path/to/"
>> + * - "/path/to/dir/" -> "/path/to/dir/"
>> + */
>> +void strbuf_strip_file_from_path(struct strbuf *sb);
>> +
>>  void strbuf_add_lines(struct strbuf *sb,
>>  		      const char *prefix,
>>  		      const char *buf,
>
> Re your reply in
> https://lore.kernel.org/git/0980dcd4-30eb-4ef4-9369-279abe5ca5a2@github.com/
> I still don't get how this is different from a 1-byte change to
> strbuf_trim_trailing_dir_sep(), and if it isn't I think it's confusing
> API design to have two very different ways to return the same data.
>
> There you said "The difference is all about whether or not we start with
> a slash _and_ no other slash appears in the path.".
>
> But I can't find a case where there's any difference between the two. I
> tried this ad-hoc test on top:
> 	
> 	diff --git a/help.c b/help.c
> 	index f1e090a4428..b0866b01439 100644
> 	--- a/help.c
> 	+++ b/help.c
> 	@@ -765,6 +765,16 @@ int cmd_version(int argc, const char **argv, const char *prefix)
> 	 			 "also print build options"),
> 	 		OPT_END()
> 	 	};
> 	+	struct strbuf sb1 = STRBUF_INIT;
> 	+	struct strbuf sb2 = STRBUF_INIT;
> 	+
> 	+	if (getenv("STR")) {
> 	+		strbuf_addstr(&sb1, getenv("STR"));
> 	+		strbuf_addstr(&sb2, getenv("STR"));
> 	+		strbuf_strip_file_from_path(&sb1);
> 	+		strbuf_trim_trailing_not_dir_sep(&sb2);
> 	+		fprintf(stderr, "%s: %s | %s\n", strcmp(sb1.buf, sb2.buf) ? "NEQ" : "EQ", sb1.buf, sb2.buf);
> 	+	}
> 	 
> 	 	argc = parse_options(argc, argv, prefix, options, usage, 0);
> 	 
> 	diff --git a/strbuf.c b/strbuf.c
> 	index c383f41a3c5..f75d94556fc 100644
> 	--- a/strbuf.c
> 	+++ b/strbuf.c
> 	@@ -114,13 +114,23 @@ void strbuf_rtrim(struct strbuf *sb)
> 	 	sb->buf[sb->len] = '\0';
> 	 }
> 	 
> 	-void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
> 	+static void strbuf_trim_trailing_dir_sep_1(struct strbuf *sb, int flip)
> 	 {
> 	-	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]))
> 	+	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]) - flip)
> 	 		sb->len--;
> 	 	sb->buf[sb->len] = '\0';
> 	 }
> 	 
> 	+void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
> 	+{
> 	+	strbuf_trim_trailing_dir_sep_1(sb, 0);
> 	+}
> 	+
> 	+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb)
> 	+{
> 	+	strbuf_trim_trailing_dir_sep_1(sb, 1);
> 	+}
> 	+
> 	 void strbuf_trim_trailing_newline(struct strbuf *sb)
> 	 {
> 	 	if (sb->len > 0 && sb->buf[sb->len - 1] == '\n') {
> 	diff --git a/strbuf.h b/strbuf.h
> 	index f6dbb9681ee..b936f45ffad 100644
> 	--- a/strbuf.h
> 	+++ b/strbuf.h
> 	@@ -189,6 +189,8 @@ void strbuf_ltrim(struct strbuf *sb);
> 	 
> 	 /* Strip trailing directory separators */
> 	 void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
> 	+/* Strip trailing not-directory separators */
> 	+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb);
> 	 
> 	 /* Strip trailing LF or CR/LF */
> 	 void strbuf_trim_trailing_newline(struct strbuf *sb);
>
> Then:
> 	
> 	$ for str in a / b/ /c /d/ /e/ /f/g /h/i/ j/k l//m n/o/p //q/r/s/t; do STR=$str ./git version; done 2>&1 | grep :
> 	EQ:  | 
> 	EQ: / | /
> 	EQ: b/ | b/
> 	EQ: / | /
> 	EQ: /d/ | /d/
> 	EQ: /e/ | /e/
> 	EQ: /f/ | /f/
> 	EQ: /h/i/ | /h/i/
> 	EQ: j/ | j/
> 	EQ: l// | l//
> 	EQ: n/o/ | n/o/
> 	EQ: //q/r/s/ | //q/r/s/
>
> I.e. for those inputs it's the same as the existing
> strbuf_trim_trailing_dir_sep() with an inverted test. Is there some edge
> case that I'm missing?

FWIW the "overkill" change on top to do this via callbacks is the
below. Which I tested just to see how easy it was, and whether it would
fail your tests (it doesn't).

-- >8 --
Subject: [PATCH] strbuf: generalize "{,r,l}trim" to a callback interface

We've had all three variants of "trim" for isspace(), then since
c64a8d200f4 (worktree move: accept destination as directory,
2018-02-12) we've had a "is_dir_sep" variant.

A preceding change then added a "!is_dir_sep" variant. Let's
generalize this, and have all these functions that want to trim
characters matching some criteria be driven by the same logic.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 bundle-uri.c      |  7 +------
 git-compat-util.h |  5 +++++
 strbuf.c          | 44 ++++++++++++++++++++++++++++----------------
 strbuf.h          | 41 +++++++++++++++++++++++++++--------------
 4 files changed, 61 insertions(+), 36 deletions(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index c411b871bdd..7240dedcaee 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -195,13 +195,8 @@ int bundle_uri_parse_config_format(const char *uri,
 	if (!list->baseURI) {
 		struct strbuf baseURI = STRBUF_INIT;
 		strbuf_addstr(&baseURI, uri);
+		strbuf_trim_trailing_not_dir_sep(&baseURI);
 
-		/*
-		 * If the URI does not end with a trailing slash, then
-		 * remove the filename portion of the path. This is
-		 * important for relative URIs.
-		 */
-		strbuf_strip_file_from_path(&baseURI);
 		list->baseURI = strbuf_detach(&baseURI, NULL);
 	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
diff --git a/git-compat-util.h b/git-compat-util.h
index a76d0526f79..5bce9fa768c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -451,6 +451,11 @@ static inline int git_is_dir_sep(int c)
 #define is_dir_sep git_is_dir_sep
 #endif
 
+static inline int is_not_dir_sep(int c)
+{
+	return !is_dir_sep(c);
+}
+
 #ifndef offset_1st_component
 static inline int git_offset_1st_component(const char *path)
 {
diff --git a/strbuf.c b/strbuf.c
index c383f41a3c5..a5a1c01d539 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -101,24 +101,37 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
 		sb->buf[0] = '\0';
 }
 
-void strbuf_trim(struct strbuf *sb)
+void strbuf_trim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn)
 {
-	strbuf_rtrim(sb);
-	strbuf_ltrim(sb);
+	strbuf_rtrim_fn(sb, fn);
+	strbuf_ltrim_fn(sb, fn);
 }
 
-void strbuf_rtrim(struct strbuf *sb)
+void strbuf_rtrim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn)
 {
-	while (sb->len > 0 && isspace((unsigned char)sb->buf[sb->len - 1]))
+	while (sb->len > 0 && fn((unsigned char)sb->buf[sb->len - 1]))
 		sb->len--;
 	sb->buf[sb->len] = '\0';
 }
 
+void strbuf_trim(struct strbuf *sb)
+{
+	strbuf_trim_fn(sb, strbuf_ctype_isspace);
+}
+
+void strbuf_rtrim(struct strbuf *sb)
+{
+	strbuf_rtrim_fn(sb, strbuf_ctype_isspace);
+}
+
 void strbuf_trim_trailing_dir_sep(struct strbuf *sb)
 {
-	while (sb->len > 0 && is_dir_sep((unsigned char)sb->buf[sb->len - 1]))
-		sb->len--;
-	sb->buf[sb->len] = '\0';
+	strbuf_rtrim_fn(sb, is_dir_sep);
+}
+
+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb)
+{
+	strbuf_rtrim_fn(sb, is_not_dir_sep);
 }
 
 void strbuf_trim_trailing_newline(struct strbuf *sb)
@@ -130,10 +143,10 @@ void strbuf_trim_trailing_newline(struct strbuf *sb)
 	}
 }
 
-void strbuf_ltrim(struct strbuf *sb)
+void strbuf_ltrim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn)
 {
 	char *b = sb->buf;
-	while (sb->len > 0 && isspace(*b)) {
+	while (sb->len > 0 && fn(*b)) {
 		b++;
 		sb->len--;
 	}
@@ -141,6 +154,11 @@ void strbuf_ltrim(struct strbuf *sb)
 	sb->buf[sb->len] = '\0';
 }
 
+void strbuf_ltrim(struct strbuf *sb)
+{
+	strbuf_ltrim_fn(sb, strbuf_ctype_isspace);
+}
+
 int strbuf_reencode(struct strbuf *sb, const char *from, const char *to)
 {
 	char *out;
@@ -1200,9 +1218,3 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 	free(path2);
 	return res;
 }
-
-void strbuf_strip_file_from_path(struct strbuf *sb)
-{
-	char *path_sep = find_last_dir_sep(sb->buf);
-	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
-}
diff --git a/strbuf.h b/strbuf.h
index f6dbb9681ee..bb7aa38816f 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -180,15 +180,39 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
  */
 
 /**
- * Strip whitespace from the beginning (`ltrim`), end (`rtrim`), or both side
- * (`trim`) of a string.
+ * A callback function that acts like the macros defined in
+ * <ctype.h>. To be given to strbuf_{,r,l}trim_fn() below.
+ */
+typedef int (*strbuf_ctype_fn_t)(int c);
+static inline int strbuf_ctype_isspace(int c) { return isspace(c); }
+
+/**
+ * Strip characters matching the 'strbuf_ctype_fn_t' from the
+ * beginning (`ltrim`), end (`rtrim`) or both sides (`trim`) of a
+ * string.
+ */
+void strbuf_trim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn);
+void strbuf_rtrim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn);
+void strbuf_ltrim_fn(struct strbuf *sb, strbuf_ctype_fn_t fn);
+
+/**
+ * The common-case wrapper for strbuf_{,r,l}trim_fn() uses the
+ * strbuf_ctype_isspace() callback function.
  */
 void strbuf_trim(struct strbuf *sb);
 void strbuf_rtrim(struct strbuf *sb);
 void strbuf_ltrim(struct strbuf *sb);
 
-/* Strip trailing directory separators */
+/**
+ * Strip trailing directory separators. This is strbuf_rtrim_fn() with
+ * is_dir_sep() as the callback.
+ */
 void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
+/**
+ * Strip trailing not-directory separators. This is strbuf_rtrim_fn()
+ * with is_not_dir_sep() as the callback.
+ */
+void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb);
 
 /* Strip trailing LF or CR/LF */
 void strbuf_trim_trailing_newline(struct strbuf *sb);
@@ -664,17 +688,6 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
 int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			      const char *const *env);
 
-/*
- * Remove the filename from the provided path string. If the path
- * contains a trailing separator, then the path is considered a directory
- * and nothing is modified.
- *
- * Examples:
- * - "/path/to/file" -> "/path/to/"
- * - "/path/to/dir/" -> "/path/to/dir/"
- */
-void strbuf_strip_file_from_path(struct strbuf *sb);
-
 void strbuf_add_lines(struct strbuf *sb,
 		      const char *prefix,
 		      const char *buf,
-- 
2.39.0.rc1.1014.gc37e9814e18
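
The callback idea in the patch above can also be tried in isolation. The following is a minimal sketch under stated assumptions, not the patch itself: the sketch_* names are invented, plain char buffers replace struct strbuf, and '/' stands in for whatever is_dir_sep() accepts.

```c
#include <ctype.h>
#include <string.h>

/* Same shape as the <ctype.h> classifiers: int in, non-zero on match. */
typedef int (*sketch_ctype_fn)(int c);

static int sketch_isspace(int c) { return isspace(c); }
static int sketch_is_slash(int c) { return c == '/'; }
static int sketch_is_not_slash(int c) { return c != '/'; }

/* Trim matching bytes from the end of a NUL-terminated string in place. */
static void sketch_rtrim_fn(char *s, sketch_ctype_fn fn)
{
	size_t len = strlen(s);

	while (len > 0 && fn((unsigned char)s[len - 1]))
		len--;
	s[len] = '\0';
}

/* Trim matching bytes from the start by shifting the remainder down. */
static void sketch_ltrim_fn(char *s, sketch_ctype_fn fn)
{
	size_t skip = 0, len = strlen(s);

	while (skip < len && fn((unsigned char)s[skip]))
		skip++;
	/* The "+ 1" copies the terminating NUL along with the tail. */
	memmove(s, s + skip, len - skip + 1);
}
```

With this shape, the whitespace, "directory separator" and "not a directory separator" variants differ only in which predicate is passed in, which is the point the commit message makes.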


* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-05 17:50     ` [PATCH v3 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-12-07 12:57       ` Jeff King
  2022-12-07 15:27         ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Jeff King @ 2022-12-07 12:57 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Derrick Stolee

On Mon, Dec 05, 2022 at 05:50:38PM +0000, Derrick Stolee via GitGitGadget wrote:

> +int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
> +{
> +	int result;
> +	struct bundle_list global_list;
> +
> +	init_bundle_list(&global_list);
> +
> +	/* If a bundle is added to this global list, then it is required. */
> +	global_list.mode = BUNDLE_MODE_ALL;
> +
> +	if ((result = download_bundle_list(r, list, &global_list, 0)))
> +		goto cleanup;
> +
> +	result = unbundle_all_bundles(r, &global_list);
> +
> +cleanup:
> +	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
> +	clear_bundle_list(&global_list);
> +	return result;
> +}

The "uri" parameter in this function is unused. I'm not sure if that's
indicative of a bug or missing feature (e.g., could it be the base for a
relative url?), or if it's just a leftover from development.

If the latter, I'm happy to add it to my list of cleanups.

There are a couple other unused parameters in this series, too, but they
are all in virtual functions and must be kept. I'll add them to my list
of annotations.
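
To illustrate the "virtual function" case: a callback has to keep a parameter it does not use because the interface fixes its signature. The sketch below is hypothetical throughout. SKETCH_UNUSED is a simplified stand-in for an annotation macro and is not a copy of git's UNUSED definition; the callback type and helpers are made up.

```c
#include <stddef.h>

/* Hypothetical, simplified "unused parameter" annotation. */
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

/* A callback type whose signature is fixed by the interface. */
typedef int (*sketch_each_fn)(const char *item, void *data);

/* This implementation has no use for "data" but must still accept it. */
static int sketch_count_nonempty(const char *item, void *data SKETCH_UNUSED)
{
	return item && *item ? 1 : 0;
}

/* Calls "fn" on every item and sums the results. */
static int sketch_for_each(const char **items, size_t nr,
			   sketch_each_fn fn, void *data)
{
	int total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += fn(items[i], data);
	return total;
}
```

Annotating the parameter keeps an unused-parameter warning quiet for this callback while leaving the warning useful for functions like fetch_bundle_list(), where the parameter can simply go away.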

-Peff

* Re: [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-06 11:37         ` Ævar Arnfjörð Bjarmason
@ 2022-12-07 14:44           ` Derrick Stolee
  2022-12-08 12:52             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee @ 2022-12-07 14:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/6/22 6:37 AM, Ævar Arnfjörð Bjarmason wrote:

> FWIW the "overkill" change on top to do this via callbacks is the
> below. Which I tested just to see how easy it was, and whether it would
> fail your tests (it doesn't).
> 
> -- >8 --
> Subject: [PATCH] strbuf: generalize "{,r,l}trim" to a callback interface

I don't like this approach and think it distracts from the goal
of the series. If you want to update it afterwards, then by all
means go for it.

Thanks,
-Stolee

* Re: [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting
  2022-12-05 23:32       ` Victoria Dye
@ 2022-12-07 15:20         ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-07 15:20 UTC (permalink / raw)
  To: Victoria Dye,
	Ævar Arnfjörð Bjarmason via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/5/2022 6:32 PM, Victoria Dye wrote:
> Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
>> +	# Client did not issue bundle-uri command
>> +	! grep "> command=bundle-uri" log &&
>> +
>> +	GIT_TRACE_PACKET="$PWD/log" \
>> +	git \
>> +		-c transfer.bundleURI=true \
>> +		-c protocol.version=2 \
>> +		clone "$BUNDLE_URI_REPO_URI" cloned2 \
>> +		>actual 2>err &&
>
> If 'GIT_TEST_BUNDLE_URI' is set to '1' in a more global scope (by a CI job
> or user running the tests), then the '-c transfer.bundleURI' config isn't
> actually what's enabling the behavior. To make this more directly comparable
> to the case earlier in this test, could you add 'GIT_TEST_BUNDLE_URI=0' here
> as well?

You're right that GIT_TEST_BUNDLE_URI is not needed if we can set
transfer.bundleURI globally earlier in the test. It doesn't make much sense
to run the entire test suite with it on, since the server side does not
advertise bundles unless explicitly configured to do so.

Thanks,
-Stolee

* Re: [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists
  2022-12-05 23:33       ` Victoria Dye
@ 2022-12-07 15:22         ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-07 15:22 UTC (permalink / raw)
  To: Victoria Dye, Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/5/2022 6:33 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> Allow a bundle list to specify a relative URI for the bundles. This URI
>> is based on where the client received the bundle list. For a list
>> provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
>> the base URI. Otherwise, the bundle list was provided from an HTTP URI
>> not using the Git protocol, and that URI is the base URI. This allows
>> easier distribution of bundle data.
> 
> Thanks, this clears up my confusion about the source of 'baseURI'.
> 
>> +	/**
>> +	 * The baseURI of a bundle_list is the URI that provided the list.
>> +	 *
>> +	 * In the case of the 'bundle-uri' protocol v2 command, the base
>> +	 * URI is the URI of the Git remote.
>> +	 *
>> +	 * Otherewise, the bundle list was downloaded over HTTP from some
>> +	 * known URI.
> 
> s/Otherewise/Otherwise
> 
> Also, this sentence is a bit more vague than what was noted in the commit
> message; it doesn't actually say what the base URI is set to in this
> scenario. Feel free to ignore if you think it's overkill, but that could
> probably be cleared up by adding another sentence after like "The base URI
> is set to that known URI."

Thanks for both of these suggestions.
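
For a concrete picture of "the base URI is set to that known URI", here is a deliberately naive sketch of resolving a relative bundle location against the URI that served the list. It is an assumption-laden illustration, not what bundle-uri.c does: sketch_resolve_bundle_uri() is a made-up name, the example.com URLs are placeholders, "../" segments and absolute paths are ignored, and a base without any path component is not handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes the resolved location into "out" (capacity "outsz").
 * Returns 0 on success, -1 if the result does not fit.
 */
static int sketch_resolve_bundle_uri(const char *base, const char *uri,
				     char *out, size_t outsz)
{
	const char *last_slash;
	size_t dir_len;
	int n;

	/* A URI with a scheme is already absolute; copy it through. */
	if (strstr(uri, "://")) {
		n = snprintf(out, outsz, "%s", uri);
		return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
	}

	/* Keep the base up to and including its last '/'. */
	last_slash = strrchr(base, '/');
	dir_len = last_slash ? (size_t)(last_slash - base) + 1 : 0;

	n = snprintf(out, outsz, "%.*s%s", (int)dir_len, base, uri);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Under these assumptions a list fetched from https://example.com/bundles/list that names "bundle-1.bundle" resolves to https://example.com/bundles/bundle-1.bundle, and a base that already ends in a slash is used unchanged.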

-Stolee

* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-07 12:57       ` Jeff King
@ 2022-12-07 15:27         ` Derrick Stolee
  2022-12-07 15:54           ` Derrick Stolee
  2022-12-08  6:36           ` Jeff King
  0 siblings, 2 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-07 15:27 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/7/2022 7:57 AM, Jeff King wrote:
> On Mon, Dec 05, 2022 at 05:50:38PM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> +int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
>> +{
>> +	int result;
>> +	struct bundle_list global_list;
>> +
>> +	init_bundle_list(&global_list);
>> +
>> +	/* If a bundle is added to this global list, then it is required. */
>> +	global_list.mode = BUNDLE_MODE_ALL;
>> +
>> +	if ((result = download_bundle_list(r, list, &global_list, 0)))
>> +		goto cleanup;
>> +
>> +	result = unbundle_all_bundles(r, &global_list);
>> +
>> +cleanup:
>> +	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
>> +	clear_bundle_list(&global_list);
>> +	return result;
>> +}
> 
> The "uri" parameter in this function is unused. I'm not sure if that's
> indicative of a bug or missing feature (e.g., could it be the base for a
> relative url?), or if it's just a leftover from development.

Thanks for your careful eye. This 'uri' is indeed not needed. I think it
was initially there for relative URIs, but the given 'list' is expected
to have that value initialized. I'll make it clear in the doc comment.
 
> If the latter, I'm happy to add it to my list of cleanups.
> 
> There are a couple other unused parameters in this series, too, but they
> are all in virtual functions and must be kept. I'll add them to my list
> of annotations.

Your UNUSED annotations exist in my tree, so I'll try my best to update
them in the next version.

Thanks,
-Stolee

* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-07 15:27         ` Derrick Stolee
@ 2022-12-07 15:54           ` Derrick Stolee
  2022-12-08  6:40             ` Jeff King
  2022-12-08  6:36           ` Jeff King
  1 sibling, 1 reply; 87+ messages in thread
From: Derrick Stolee @ 2022-12-07 15:54 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng

On 12/7/2022 10:27 AM, Derrick Stolee wrote:
> On 12/7/2022 7:57 AM, Jeff King wrote:

>> There are a couple other unused parameters in this series, too, but they
>> are all in virtual functions and must be kept. I'll add them to my list
>> of annotations.
> 
> Your UNUSED annotations exist in my tree, so I'll try my best to update
> them in the next version.

One of these showed an example where I should have been using
repo_config_...() instead of git_config_...(). Thanks!

-Stolee

* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-07 15:27         ` Derrick Stolee
  2022-12-07 15:54           ` Derrick Stolee
@ 2022-12-08  6:36           ` Jeff King
  2022-12-08 14:58             ` Derrick Stolee
  1 sibling, 1 reply; 87+ messages in thread
From: Jeff King @ 2022-12-08  6:36 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, gitster, me, newren, avarab,
	mjcheetham, steadmon, chooglen, jonathantanmy, dyroneteng

On Wed, Dec 07, 2022 at 10:27:06AM -0500, Derrick Stolee wrote:

> > The "uri" parameter in this function is unused. I'm not sure if that's
> > indicative of a bug or missing feature (e.g., could it be the base for a
> > relative url?), or if it's just a leftover from development.
> 
> Thanks for your careful eye. This 'uri' is indeed not needed. I think it
> was initially there for relative URIs, but the given 'list' is expected
> to have that value initialized. I'll make it clear in the doc comment.

That makes sense. I've queued a patch locally to remove it (since
locally I build with -Wunused-parameter), which will eventually make
its way to the list.

> > If the latter, I'm happy to add it to my list of cleanups.
> > 
> > There are a couple other unused parameters in this series, too, but they
> > are all in virtual functions and must be kept. I'll add them to my list
> > of annotations.
> 
> Your UNUSED annotations exist in my tree, so I'll try my best to update
> them in the next version.

Sounds good (and again, I've queued something locally, but if you beat
me to it, it's easy to drop mine).

Note that your series hit 'next' (which is how I noticed it), so there
usually would not be a "next version". Though we will rewind
post-release, so there may still be an opportunity (I didn't follow the
topic closely enough to know if you might want to re-roll for other
reasons).

-Peff

* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-07 15:54           ` Derrick Stolee
@ 2022-12-08  6:40             ` Jeff King
  0 siblings, 0 replies; 87+ messages in thread
From: Jeff King @ 2022-12-08  6:40 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, gitster, me, newren, avarab,
	mjcheetham, steadmon, chooglen, jonathantanmy, dyroneteng

On Wed, Dec 07, 2022 at 10:54:31AM -0500, Derrick Stolee wrote:

> On 12/7/2022 10:27 AM, Derrick Stolee wrote:
> > On 12/7/2022 7:57 AM, Jeff King wrote:
> 
> >> There are a couple other unused parameters in this series, too, but they
> >> are all in virtual functions and must be kept. I'll add them to my list
> >> of annotations.
> > 
> > Your UNUSED annotations exist in my tree, so I'll try my best to update
> > them in the next version.
> 
> One of these showed an example where I should have been using
> repo_config_...() instead of git_config_...(). Thanks!

Oh good. It is always satisfying when the warning finds a potential bug,
as it sometimes feels a bit like annotation make-work. ;)

In this instance we're in server-side v2 protocol code, which is already
very global-heavy in its world-view. So I don't think it's a real bug
here, but just a nice-to-have.

I have seen this "oops, we don't really use our repository parameter"
issue in a few places. And while I do think it's best to use it if you
have it, I suspect it's the tip of the iceberg in terms of functions
using the_repository. In the long run, I think we'll really smoke those
out from the bottom up, as more functions insist on taking a repository
parameter (and then their callers will have to switch or face the
embarrassment of passing the_repository themselves, and so on).
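
A toy version of the distinction, in case it helps. Nothing here is git code: struct sketch_repo, sketch_the_repo and both helpers are hypothetical stand-ins for a repository handle, the global default, and a config lookup.

```c
/* Hypothetical miniature of a repository handle with one config value. */
struct sketch_repo {
	const char *name;
	int advertise_bundle_uris;
};

/* Hypothetical process-wide default, playing the role of a global repo. */
static struct sketch_repo sketch_the_repo = { "global", 0 };

/* Global-reading variant: ignores whichever repository the caller has. */
static int sketch_advertise_global(void)
{
	return sketch_the_repo.advertise_bundle_uris;
}

/* Parameter-reading variant: answers for the repository it was handed. */
static int sketch_advertise_for(const struct sketch_repo *r)
{
	return r->advertise_bundle_uris;
}
```

A function that receives a repository but calls the global-reading variant would trip an unused-parameter warning, and the two variants only disagree once a second repository with different settings is in play.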

All of which is to say that yes, that is a fine change to make. But I
don't consider it at all urgent in this instance.

-Peff

* Re: [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-07 14:44           ` Derrick Stolee
@ 2022-12-08 12:52             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-08 12:52 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, gitster, me, newren,
	mjcheetham, steadmon, chooglen, jonathantanmy, dyroneteng


On Wed, Dec 07 2022, Derrick Stolee wrote:

> On 12/6/22 6:37 AM, Ævar Arnfjörð Bjarmason wrote:
>
>> FWIW the "overkill" change on top to do this via callbacks is the
>> below. Which I tested just to see how easy it was, and whether it would
>> fail your tests (it doesn't).
>> 
>> -- >8 --
>> Subject: [PATCH] strbuf: generalize "{,r,l}trim" to a callback interface
>
> I don't like this approach and think it distracts from the goal
> of the series. If you want to update it afterwards, then by all
> means go for it.

Yes, I think it shouldn't be part of this series at all.

I.e. the part in [1] that you're replying to is just something I poked
at because you nerd-sniped me into poking at it.

I think it's probably not worth it, and I don't think I like the
API[2]. What I do think is worth including in this series one way or
another is [3].

You're proposing a new addition to the strbuf API. I think it's relevant
feedback that you seem to be re-inventing a close relative (literally a
1-byte difference!) of a function that's there already.

Or I'm just wrong, but then what input does your
strbuf_strip_file_from_path() handle differently than the
strbuf_trim_trailing_not_dir_sep() in [3]?

Making them sibling functions would make the API more discoverable. The
comment you're adding would also improve the existing code. I.e. we
could have this end-state in strbuf.h:

	/**
	 * Strip trailing directory separators, or not-directory separators.
	 *
	 * The "dir_sep" variant portably trims redundant slash(es) from the
	 * end, while the "not_dir_sep" gets you to the base directory, should
	 * the path refer to a file:
	 *
	 * |---------------+---------------+-------------------|
	 * | In            | out (dir_sep) | out (not_dir_sep) |
	 * |---------------+---------------+-------------------|
	 * | /path/to/file | /path/to/file | /path/to/         |
	 * | /path/to/dir/ | /path/to/dir  | /path/to/dir/     |
	 * |---------------+---------------+-------------------|
	 */
	void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
	void strbuf_trim_trailing_not_dir_sep(struct strbuf *sb);

Or maybe you still think it's not worth it, I also think that's
fine. I'd really appreciate knowing if it's a "yeah maybe they're the
same, but I haven't checked", or if it's "I think you missed a case, but
I haven't explained it to you".

Otherwise if I do follow-up I'd probably have to start by brute-force
testing the two to satisfy my own paranoia :)

Thanks.

1. https://lore.kernel.org/git/221206.86wn74bw35.gmgdl@evledraar.gmail.com/
2. Although once I started poking at it I found a lot of cases in-tree
   where we hardcoded that exact loop (or the equivalent strbuf_setlen()
   variant), which could be replaced by "trim all cases of this
   character from the end of a strbuf", or "trim all cases of this
   character match ..." function.
3. https://lore.kernel.org/git/221206.86a640dda3.gmgdl@evledraar.gmail.com/

* Re: [PATCH v3 10/11] bundle-uri: download bundles from an advertised list
  2022-12-08  6:36           ` Jeff King
@ 2022-12-08 14:58             ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2022-12-08 14:58 UTC (permalink / raw)
  To: Jeff King
  Cc: Derrick Stolee via GitGitGadget, git, gitster, me, newren, avarab,
	mjcheetham, steadmon, chooglen, jonathantanmy, dyroneteng

On 12/8/2022 1:36 AM, Jeff King wrote:
> On Wed, Dec 07, 2022 at 10:27:06AM -0500, Derrick Stolee wrote:
> 
>>> The "uri" parameter in this function is unused. I'm not sure if that's
>>> indicative of a bug or missing feature (e.g., could it be the base for a
>>> relative url?), or if it's just a leftover from development.
>>
>> Thanks for your careful eye. This 'uri' is indeed not needed. I think it
>> was initially there for relative URIs, but the given 'list' is expected
>> to have that value initialized. I'll make it clear in the doc comment.
> 
> That makes sense. I've queued a patch locally to remove it (since
> locally I build with -Wunused-parameter), which will eventually make
> its way to the list.
> 
>>> If the latter, I'm happy to add it to my list of cleanups.
>>>
>>> There are a couple other unused parameters in this series, too, but they
>>> are all in virtual functions and must be kept. I'll add them to my list
>>> of annotations.
>>
>> Your UNUSED annotations exist in my tree, so I'll try my best to update
>> them in the next version.
> 
> Sounds good (and again, I've queued something locally, but if you beat
> me to it, it's easy to drop mine).
> 
> Note that your series hit 'next' (which is how I noticed it), so there
> usually would not be a "next version". Though we will rewind
> post-release, so there may still be an opportunity (I didn't follow the
> topic closely enough to know if you might want to re-roll for other
> reasons).

I noticed that after reading this round of review, so I'll be preparing
some fixes on top. I noticed some UNUSED that would be necessary from
earlier parts of the bundle URI work, and you've probably already queued
those changes.

Since I no longer plan to re-roll this series, I'd be happy to review
your queued annotations, and I'll focus on the other fixups.

Thanks,
-Stolee

* [PATCH v4 00/11] Bundle URIs IV: advertise over protocol v2
  2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
                       ` (11 preceding siblings ...)
  2022-12-05 23:42     ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Victoria Dye
@ 2022-12-22 15:14     ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
                         ` (11 more replies)
  12 siblings, 12 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee

This is based on the recent master batch that included ds/bundle-uri-....

Now that git clone --bundle-uri can download a bundle list from a plaintext
file in config format, we can use the same set of key-value pairs to
advertise a bundle list over protocol v2. At the end of this series:

 1. A server can advertise bundles when uploadPack.advertiseBundleURIs is
    enabled. The bundle list comes from the server's local config,
    specifically the bundle.* namespace.
 2. A client can notice a server's bundle-uri advertisement and request the
    bundle list if transfer.bundleURI is enabled. The bundles are downloaded
    as if the list was advertised from the --bundle-uri option.

Many patches in this series were adapted from Ævar's v2 RFC [1]. He is
retained as author and I added myself as co-author only if the modifications
were significant.

[1]
https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

 * Patches 1-7 are mostly taken from [1], again with mostly minor updates.
   The one major difference is the packet line format being a single
   key=value format instead of a sequence of pairs. (In v3, these commits
   are significantly reorganized from [1].)

 * Patches 8-11 finish off the ability for the client to notice the
   capability, request the values, and download bundles before continuing
   with the rest of the download.
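
To make the single key=value line format concrete, here is a minimal shell
sketch of how one such line could be split. The parse_pair helper is
hypothetical and is not part of this series; the real client parses these
lines in C. It splits at the first "=" only, so a value may itself contain
"=" characters.

```shell
# Split one "<key>=<value>" line at the first "=".
# Prints "<key><TAB><value>" on success; returns non-zero for a line
# that has no "=" or an empty key, so the caller can discard it.
parse_pair () {
	line=$1
	case "$line" in
	*=*)
		key=${line%%=*}		# everything before the first "="
		value=${line#*=}	# everything after the first "="
		;;
	*)
		return 1
		;;
	esac
	test -n "$key" || return 1
	printf '%s\t%s\n' "$key" "$value"
}

parse_pair "bundle.version=1"
parse_pair "bundle.only.uri=https://example.com/a.bundle?x=1"
parse_pair "garbage" || echo "skipped malformed line" >&2
```

Returning a failure status for a malformed line, rather than aborting, lets a
caller drop that line and continue with the rest of the list.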

One thing that is not handled here but could be handled in a future change
is to disconnect from the origin Git server while downloading the bundle
URIs, then reconnecting afterwards. This does not make any difference for
HTTPS, but SSH may benefit from the reduced connection time. The git clone
--bundle-uri option did not suffer from this because the bundles are
downloaded before the server connection begins.

After this series, there is one more before the original scope of the plan
is complete: using creation tokens as a heuristic. See [2] for the RFC
version of those patches.

[2] https://github.com/derrickstolee/git/pull/22


Updates in v4
=============

This version includes squashed-in versions of the fixups that were
previously known as ds/bundle-uri-4-fixup.

 * Some unused parameters are now marked with UNUSED, since we are
   introducing those parameters for the first time. In one case an unused
   parameter should have been used in repo_config_...() instead of
   git_config_...().
 * The GIT_TEST_BUNDLE_URI environment variable is removed in favor of the
   transfer.bundleURI config option in all cases.
 * A stale commit message is fixed to no longer refer to a rename that was
   split into a different commit as part of v3.
 * The documentation comment for fetch_bundle_list() explicitly defines a
   non-zero return value as an error.


Updates in v3
=============

Most of these updates are due to Victoria's very thorough review. Thanks!

 * What was patch 2 was split to be better understood.
 * The new patch 2 is only the new test script infrastructure for testing
   whether or not the server provides the bundle-uri capability. It is
   extended with other more complicated examples in later patches. The name
   was rewritten from lib-t5730-*.sh to lib-bundle-uri-protocol.sh and the
   variable names are renamed with the BUNDLE_URI_ prefix.
 * The new patch 3 performs the basic client interaction with the
   'bundle-uri' command, while still not being fully wired up on the server
   side. The tests do check that the client requests the bundle-uri command
   after seeing it in the server's capabilities. One important difference
   from earlier is that the check for server_supports_v2() was moved into
   the get_bundle_uri() method (underneath the vtable) because we need to
   check the handshake before calling that method. It makes most sense to
   put the handshake call there, so do it from the start.
 * Patch 4 carefully tests how the transfer.bundleURI config blocks the
   client-side request of the bundle-uri command. Later tests will use the
   GIT_TEST_BUNDLE_URI environment variable instead.
 * The new Patch 5 renames got_remote_heads to finished_handshake in 'struct
   git_transport_data' and that's it. That new value is then used in patch 6
   to indicate if we need to request the handshake in the bundle URI logic.
 * Patch 6 creates the ls-remote helper in 'test-tool bundle-uri' as before,
   but now only makes use of the finished_handshake member instead of
   creating a new one. The test helper represents an example consumer of
   transport_get_remote_bundle_uri() without first doing the server-side
   handshake, which motivates several of the placements of code within that
   method and get_bundle_uri() earlier in the series. The "quiet" option is
   also removed to simplify the test helper and to always communicate the
   inner errors to the user.
 * Patch 7 adds the server-side listing of bundle.* config values. The test
   scripts around these config values have been cleaned up since the
   previous version.
 * Patch 8 has another iteration of strbuf_strip_file_from_path() taking the
   feedback from Victoria and Ævar.
 * Patch 9 adds the relative path logic. The definition of the base path is
   clarified in the commit message and comments. An additional test shows
   what happens if the server advertises too many parent paths
   (unfortunately, a die(), and this is marked for cleanup later).
 * Patch 10 is identical to the old patch 8.
 * Patch 11 completes the work by having the client download the bundles
   provided by the server list. It fixes an if/else that should have been an
   if/else-if. A new test checks that the --bundle-uri=X option overrides
   the server advertisement.


Updates in v2
=============

 * Commit messages now refer to protocol v2 "commands" not "verbs".
 * Several edits were made to gitprotocol-v2.txt thanks to Victoria's
   thorough review.
 * strbuf_parent_directory() is renamed strbuf_strip_file_from_path() to
   make it more clear how it behaves when ending with a slash.

Thanks,

 * Stolee

Derrick Stolee (6):
  transport: rename got_remote_heads
  bundle-uri: serve bundle.* keys from config
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: allow relative URLs in bundle lists
  bundle-uri: download bundles from an advertised list
  clone: unbundle the advertised bundles

Ævar Arnfjörð Bjarmason (5):
  protocol v2: add server-side "bundle-uri" skeleton
  t: create test harness for 'bundle-uri' command
  clone: request the 'bundle-uri' command when available
  bundle-uri client: add boolean transfer.bundleURI setting
  bundle-uri client: add helper for testing server

 Documentation/config/transfer.txt      |   6 +
 Documentation/gitprotocol-v2.txt       | 201 +++++++++++++++++++++++
 builtin/clone.c                        |  21 +++
 bundle-uri.c                           |  87 +++++++++-
 bundle-uri.h                           |  35 ++++
 connect.c                              |  44 +++++
 remote.h                               |   5 +
 serve.c                                |   6 +
 strbuf.c                               |   6 +
 strbuf.h                               |  11 ++
 t/helper/test-bundle-uri.c             |  48 ++++++
 t/lib-bundle-uri-protocol.sh           | 216 +++++++++++++++++++++++++
 t/t5601-clone.sh                       |  59 +++++++
 t/t5701-git-serve.sh                   |  40 ++++-
 t/t5730-protocol-v2-bundle-uri-file.sh |  17 ++
 t/t5731-protocol-v2-bundle-uri-git.sh  |  17 ++
 t/t5732-protocol-v2-bundle-uri-http.sh |  17 ++
 t/t5750-bundle-uri-parse.sh            |  82 ++++++++++
 t/t9119-git-svn-info.sh                |   2 +-
 t/test-lib-functions.sh                |   7 +
 transport-helper.c                     |  13 ++
 transport-internal.h                   |   7 +
 transport.c                            |  87 ++++++++--
 transport.h                            |  19 +++
 24 files changed, 1041 insertions(+), 12 deletions(-)
 create mode 100644 t/lib-bundle-uri-protocol.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1400%2Fderrickstolee%2Fbundle-redo%2Fadvertise-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1400/derrickstolee/bundle-redo/advertise-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1400

Range-diff vs v3:

  1:  beae335b855 !  1:  5ba91813de3 protocol v2: add server-side "bundle-uri" skeleton
     @@ bundle-uri.c: cleanup:
      + * API for serve.c.
      + */
      +
     -+int bundle_uri_advertise(struct repository *r, struct strbuf *value)
     ++int bundle_uri_advertise(struct repository *r, struct strbuf *value UNUSED)
      +{
      +	static int advertise_bundle_uri = -1;
      +
     @@ bundle-uri.c: cleanup:
      +		goto cached;
      +
      +	advertise_bundle_uri = 0;
     -+	git_config_get_maybe_bool("uploadpack.advertisebundleuris", &advertise_bundle_uri);
     ++	repo_config_get_maybe_bool(r, "uploadpack.advertisebundleuris", &advertise_bundle_uri);
      +
      +cached:
      +	return advertise_bundle_uri;
  2:  fcdfef2012a =  2:  3267a5b3b37 t: create test harness for 'bundle-uri' command
  3:  a0188ae39c6 =  3:  5bf6fe55771 clone: request the 'bundle-uri' command when available
  4:  e46118e60f7 !  4:  876dd3f221f bundle-uri client: add boolean transfer.bundleURI setting
     @@ Commit message
          can set it to 'true' to enable checking for bundle URIs from the origin
          Git server using protocol v2.
      
     -    To enable this setting by default in the correct tests, add a
     -    GIT_TEST_BUNDLE_URI environment variable.
     -
          Co-authored-by: Derrick Stolee <derrickstolee@github.com>
          Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "connect with $BUNDLE_URI_PROT
      +	test_when_finished "rm -rf log cloned cloned2" &&
       
       	GIT_TRACE_PACKET="$PWD/log" \
     -+	GIT_TEST_BUNDLE_URI=0 \
       	git \
     ++		-c transfer.bundleURI=false \
       		-c protocol.version=2 \
       		clone "$BUNDLE_URI_REPO_URI" cloned \
     + 		>actual 2>err &&
      @@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
       	# Server advertised bundle-uri capability
       	grep "< bundle-uri" log &&
     @@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport)
       
      +	/*
      +	 * Don't request bundle-uri from the server unless configured to
     -+	 * do so by GIT_TEST_BUNDLE_URI=1 or transfer.bundleURI=true.
     ++	 * do so by the transfer.bundleURI=true config option.
      +	 */
     -+	if (!git_env_bool("GIT_TEST_BUNDLE_URI", 0) &&
     -+	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
     ++	if (git_config_get_bool("transfer.bundleuri", &value) || !value)
      +		return 0;
      +
       	if (!vtable->get_bundle_uri)
  5:  b009b4f58ea =  5:  8f5a483c329 transport: rename got_remote_heads
  6:  46a58e83caf !  6:  13e4c82e338 bundle-uri client: add helper for testing server
     @@ Commit message
      
          In the "git clone" case we'll have already done the handshake(),
          but not here. Add an extra case to check for this handshake in
     -    get_bundle_uri() for ease of use for future callers. Rename the existing
     -    'got_remote_heads' to 'finished_handshake' to make it more clear what
     -    that bit represents.
     +    get_bundle_uri() for ease of use for future callers.
      
          Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOC
       	grep "> command=bundle-uri" log
       '
      +
     ++# The remaining tests will all assume transfer.bundleURI=true
     ++#
     ++# This test can be removed when transfer.bundleURI is enabled by default.
     ++test_expect_success 'enable transfer.bundleURI for remaining tests' '
     ++	git config --global transfer.bundleURI true
     ++'
     ++
      +test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
      +	test_config -C "$BUNDLE_URI_PARENT" \
      +		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
  7:  acc5a8f57f9 !  7:  c9b7d8779e4 bundle-uri: serve bundle.* keys from config
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "test bundle-uri with $BUNDLE_
      +		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
       	EOF
       
     -+	GIT_TEST_BUNDLE_URI=1 \
       	test-tool bundle-uri \
     - 		ls-remote \
     - 		"$BUNDLE_URI_REPO_URI" \
      @@ t/lib-bundle-uri-protocol.sh: test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
       	[bundle]
       		version = 1
       		mode = all
      +	[bundle "only"]
      +		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
     - 	EOF
     - 
     -+	GIT_TEST_BUNDLE_URI=1 \
     ++	EOF
     ++
      +	test-tool bundle-uri \
      +		ls-remote \
      +		"$BUNDLE_URI_REPO_URI" \
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "test bundle-uri with $BUNDLE_
      +		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl
      +	[bundle "bundle3"]
      +		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl
     -+	EOF
     -+
     -+	GIT_TEST_BUNDLE_URI=1 \
     + 	EOF
     + 
       	test-tool bundle-uri \
     - 		ls-remote \
     - 		"$BUNDLE_URI_REPO_URI" \
  8:  1eec3426aee =  8:  d13bd6cd95d strbuf: introduce strbuf_strip_file_from_path()
  9:  48731438d6a !  9:  a188b38399d bundle-uri: allow relative URLs in bundle lists
     @@ bundle-uri.h: struct bundle_list {
      +	 * In the case of the 'bundle-uri' protocol v2 command, the base
      +	 * URI is the URI of the Git remote.
      +	 *
     -+	 * Otherewise, the bundle list was downloaded over HTTP from some
     -+	 * known URI.
     ++	 * Otherwise, the bundle list was downloaded over HTTP from some
     ++	 * known URI. 'baseURI' is set to that value.
      +	 *
      +	 * The baseURI is used as the base for any relative URIs
      +	 * advertised by the bundle list at that location.
     @@ t/t5750-bundle-uri-parse.sh: test_expect_success 'parse config format: just URIs
      
       ## transport.c ##
      @@ transport.c: int transport_get_remote_bundle_uri(struct transport *transport)
     - 	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
     + 	if (git_config_get_bool("transfer.bundleuri", &value) || !value)
       		return 0;
       
      +	if (!transport->bundles->baseURI)
 10:  69bf154bec6 ! 10:  72ca6f4254f bundle-uri: download bundles from an advertised list
     @@ bundle-uri.c: cleanup:
       	return result;
       }
       
     -+int fetch_bundle_list(struct repository *r, const char *uri, struct bundle_list *list)
     ++int fetch_bundle_list(struct repository *r, struct bundle_list *list)
      +{
      +	int result;
      +	struct bundle_list global_list;
     @@ bundle-uri.h: int bundle_uri_parse_config_format(const char *uri,
      + * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
      + * bundles according to the bundle strategy of that list.
      + *
     -+ * Returns non-zero if no bundle information is found at the given 'uri'.
     ++ * It is expected that the given 'list' is initialized, including its
     ++ * 'baseURI' value.
     ++ *
     ++ * Returns non-zero if there was an error trying to download the list
     ++ * or any of its advertised bundles.
      + */
      +int fetch_bundle_list(struct repository *r,
     -+		      const char *uri,
      +		      struct bundle_list *list);
      +
       /**
 11:  7e1819162b6 ! 11:  46ab2b05b15 clone: unbundle the advertised bundles
     @@ Commit message
          when the client has chosen to enable the feature.
      
          Teach Git to download and unbundle the data advertised by those bundles
     -    during 'git clone'.
     +    during 'git clone'. This takes place between the ref advertisement and
     +    the object data download, and stateful connections will linger while
     +    the client downloads bundles. In the future, we should consider closing
     +    the remote connection during this process.
      
          Also, since the --bundle-uri option exists, we do not want to mix the
          advertised bundles with the user-specified bundles.
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      +			if (repo_init(the_repository, git_dir, work_tree))
      +				warning(_("failed to initialize the repo, skipping bundle URI"));
      +			else if (fetch_bundle_list(the_repository,
     -+						   remote->url[0],
      +						   transport->bundles))
      +				warning(_("failed to fetch advertised bundles"));
      +		} else {
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "connect with $BUNDLE_URI_PROT
      +	test_when_finished "rm -rf log* cloned*" &&
       
       	GIT_TRACE_PACKET="$PWD/log" \
     - 	GIT_TEST_BUNDLE_URI=0 \
     + 	git \
      @@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
       	grep "< bundle-uri" log &&
       
     @@ t/lib-bundle-uri-protocol.sh: test_expect_success "clone with $BUNDLE_URI_PROTOC
      +	! grep "> command=bundle-uri" log3
       '
       
     - test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
     + # The remaining tests will all assume transfer.bundleURI=true
      
       ## t/t5601-clone.sh ##
      @@ t/t5601-clone.sh: test_expect_success 'reject cloning shallow repository using HTTP' '
     @@ t/t5601-clone.sh: test_expect_success 'reject cloning shallow repository using H
      +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
      +		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
      +
     -+	GIT_TEST_BUNDLE_URI=1 \
      +	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
     -+		git -c protocol.version=2 clone \
     ++		git -c protocol.version=2 \
     ++		    -c transfer.bundleURI=true clone \
      +		$HTTPD_URL/smart/repo2.git repo2 &&
      +	cat >pattern <<-EOF &&
      +	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
     @@ t/t5601-clone.sh: test_expect_success 'reject cloning shallow repository using H
      +	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
      +		bundle.new.uri "$HTTPD_URL/new.bundle" &&
      +
     -+	GIT_TEST_BUNDLE_URI=1 \
      +	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
     -+		git -c protocol.version=2 clone \
     ++		git -c protocol.version=2 \
     ++		    -c transfer.bundleURI=true clone \
      +		$HTTPD_URL/smart/repo3.git repo3 &&
      +
      +	# We should fetch _both_ bundles

-- 
gitgitgadget


* [PATCH v4 01/11] protocol v2: add server-side "bundle-uri" skeleton
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
                         ` (10 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?=
 <avarab@gmail.com>

Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.advertiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users to
automatically discover bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulleted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 Documentation/gitprotocol-v2.txt | 201 +++++++++++++++++++++++++++++++
 bundle-uri.c                     |  36 ++++++
 bundle-uri.h                     |   7 ++
 serve.c                          |   6 +
 t/t5701-git-serve.sh             |  40 +++++-
 5 files changed, 289 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 59bf41cefb9..10bd2d40cec 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -578,6 +578,207 @@ and associated requested information, each separated by a single space.
 
 	obj-info = obj-id SP obj-size
 
+bundle-uri
+~~~~~~~~~~
+
+If the 'bundle-uri' capability is advertised, the server supports the
+`bundle-uri` command.
+
+The capability is currently advertised with no value (i.e. not
+"bundle-uri=somevalue"); a value may be added in the future for
+supporting command-wide extensions. Clients MUST ignore any unknown
+capability values and proceed with the `bundle-uri` dialog they
+support.
+
+The 'bundle-uri' command is intended to be issued before `fetch` to
+get URIs to bundle files (see linkgit:git-bundle[1]) to "seed" and
+inform the subsequent `fetch` command.
+
+The client CAN issue `bundle-uri` before or after any other valid
+command. To be useful to clients it's expected that it'll be issued
+after an `ls-refs` and before `fetch`, but CAN be issued at any time
+in the dialog.
+
+DISCUSSION of bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The intent of the feature is to optimize for server resource consumption
+in the common case by changing the common case of fetching a very
+large PACK during linkgit:git-clone[1] into a smaller incremental
+fetch.
+
+It also allows servers to achieve better caching in combination with
+an `uploadpack.packObjectsHook` (see linkgit:git-config[1]).
+
+New clones or fetches then become a more predictable and common
+negotiation against the tips of recently produced *.bundle file(s).
+Servers might even pre-generate the results of such negotiations for
+the `uploadpack.packObjectsHook` as new pushes come in.
+
+One way that servers could take advantage of these bundles is that the
+server would anticipate that fresh clones will download a known bundle,
+followed by catching up to the current state of the repository using ref
+tips found in that bundle (or bundles).
+
+PROTOCOL for bundle-uri
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A `bundle-uri` request takes no arguments, and as noted above does not
+currently advertise a capability value. Both may be added in the
+future.
+
+When the client issues a `command=bundle-uri` request, the response is a
+list of key-value pairs provided as packet lines with value
+`<key>=<value>`. Each `<key>` should be interpreted as a config key from
+the `bundle.*` namespace to construct a list of bundles. These keys are
+grouped by a `bundle.<id>.` subsection, where each key corresponding to a
+given `<id>` contributes attributes to the bundle defined by that `<id>`.
+See linkgit:git-config[1] for the specific details of these keys and how
+the Git client will interpret their values.
+
+Clients MUST parse the line according to the above format; lines that do
+not conform to the format SHOULD be discarded. The user MAY be warned in
+such a case.
+
+bundle-uri CLIENT AND SERVER EXPECTATIONS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+URI CONTENTS::
+The content at the advertised URIs MUST be one of two types.
++
+The advertised URI may contain a bundle file that `git bundle verify`
+would accept. I.e. they MUST contain one or more reference tips for
+use by the client, MUST indicate prerequisites (if any) with standard
+"-" prefixes, and MUST indicate their "object-format", if
+applicable.
++
+The advertised URI may alternatively contain a plaintext file that `git
+config --list` would accept (with the `--file` option). The key-value
+pairs in this list are in the `bundle.*` namespace (see
+linkgit:git-config[1]).
+
+bundle-uri CLIENT ERROR RECOVERY::
+A client MUST above all gracefully degrade on errors, whether that
+error is because of bad/missing data in the bundle URI(s), because
+that client is too dumb to e.g. understand and fully parse out bundle
+headers and their prerequisite relationships, or something else.
++
+Server operators should feel confident in turning on "bundle-uri" and
+not worry if e.g. their CDN goes down that clones or fetches will run
+into hard failures. Even if the server bundle(s) are
+incomplete, or bad in some way, the client should still end up with a
+functioning repository, just as if it had chosen not to use this
+protocol extension.
++
+All subsequent discussion on client and server interaction MUST keep
+this in mind.
+
+bundle-uri SERVER TO CLIENT::
+The ordering of the returned bundle uris is not significant. Clients
+MUST parse their headers to discover their contained OIDS and
+prerequisites. A client MUST consider the content of the bundle(s)
+themselves and their header as the ultimate source of truth.
++
+A server MAY even return bundle(s) that don't have any direct
+relationship to the repository being cloned (either through accident,
+or intentional "clever" configuration), and expect a client to sort
+out what data they'd like from the bundle(s), if any.
+
+bundle-uri CLIENT TO SERVER::
+The client SHOULD provide reference tips found in the bundle header(s)
+as 'have' lines in any subsequent `fetch` request. A client MAY also
+ignore the bundle(s) entirely if doing so is deemed worse for some
+reason, e.g. if the bundles can't be downloaded, it doesn't like the
+tips it finds etc.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE NO FURTHER NEGOTIATION::
+If after issuing `bundle-uri` and `ls-refs`, and getting the header(s)
+of the bundle(s) the client finds that the ref tips it wants can be
+retrieved entirely from advertised bundle(s), the client MAY disconnect
+from the Git server. The results of such a 'clone' or 'fetch' should be
+indistinguishable from the state attained without using bundle-uri.
+
+EARLY CLIENT DISCONNECTIONS AND ERROR RECOVERY::
+A client MAY perform an early disconnect while still downloading the
+bundle(s) (having streamed and parsed their headers). In such a case
+the client MUST gracefully recover from any errors related to
+finishing the download and validation of the bundle(s).
++
+I.e. a client might need to re-connect and issue a 'fetch' command,
+and possibly fall back to not making use of 'bundle-uri' at all.
++
+This "MAY" behavior is specified as such (and not a "SHOULD") on the
+assumption that a server advertising bundle uris is more likely than
+not to be serving up a relatively large repository, and to be pointing
+to URIs that have a good chance of being in working order. A client
+MAY e.g. look at the payload size of the bundles as a heuristic to see
+if an early disconnect is worth it, should falling back on a full
+"fetch" dialog be necessary.
+
+WHEN ADVERTISED BUNDLE(S) REQUIRE FURTHER NEGOTIATION::
+A client SHOULD commence a negotiation of a PACK from the server via
+the "fetch" command using the OID tips found in advertised bundles,
+even if it's still in the process of downloading those bundle(s).
++
+This allows for aggressive early disconnects from any interactive
+server dialog. The client blindly trusts that the advertised OID tips
+are relevant, and issues them as 'have' lines; it then requests any
+tips it would like (usually from the "ls-refs" advertisement) via
+'want' lines. The server will then compute a (hopefully small) PACK
+with the expected difference between the tips from the bundle(s) and
+the data requested.
++
+The only connection the client then needs to keep active is to the
+concurrently downloading static bundle(s), when those and the
+incremental PACK are retrieved they should be inflated and
+validated. Any errors at this point should be gracefully recovered
+from, see above.
+
+bundle-uri PROTOCOL FEATURES
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The client constructs a bundle list from the `<key>=<value>` pairs
+provided by the server. These pairs are part of the `bundle.*` namespace
+as documented in linkgit:git-config[1]. In this section, we discuss some
+of these keys and describe the actions the client will do in response to
+this information.
+
+In particular, the `bundle.version` key specifies an integer value. The
+only accepted value at the moment is `1`, but if the client sees an
+unexpected value here then the client MUST ignore the bundle list.
+
+As long as `bundle.version` is understood, all other unknown keys MAY be
+ignored by the client. The server will guarantee compatibility with older
+clients, though newer clients may be better able to use the extra keys to
+minimize downloads.
+
+Any backwards-incompatible addition of pre-URI key-value will be
+guarded by a new `bundle.version` value or values in 'bundle-uri'
+capability advertisement itself, and/or by new future `bundle-uri`
+request arguments.
+
+Some example key-value pairs that are not currently implemented but could
+be implemented in the future include:
+
+ * Add a "hash=<val>" or "size=<bytes>" to advertise the expected hash or
+   size of the bundle file.
+
+ * Advertise that one or more bundle files are the same (to e.g. have
+   clients round-robin or otherwise choose one of N possible files).
+
+ * A "oid=<OID>" shortcut and "prerequisite=<OID>" shortcut. For
+   expressing the common case of a bundle with one tip and no
+   prerequisites, or one tip and one prerequisite.
++
+This would allow for optimizing the common case of servers who'd like
+to provide one "big bundle" containing only their "main" branch,
+and/or incremental updates thereof.
++
+A client receiving such a response MAY assume that they can skip
+retrieving the header from a bundle at the indicated URI, and thus
+save themselves and the server(s) the request(s) needed to inspect the
+headers of that bundle or bundles.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/bundle-uri.c b/bundle-uri.c
index 79a914f961b..28d8966005e 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -563,6 +563,42 @@ cleanup:
 	return result;
 }
 
+/**
+ * API for serve.c.
+ */
+
+int bundle_uri_advertise(struct repository *r, struct strbuf *value UNUSED)
+{
+	static int advertise_bundle_uri = -1;
+
+	if (advertise_bundle_uri != -1)
+		goto cached;
+
+	advertise_bundle_uri = 0;
+	repo_config_get_maybe_bool(r, "uploadpack.advertisebundleuris", &advertise_bundle_uri);
+
+cached:
+	return advertise_bundle_uri;
+}
+
+int bundle_uri_command(struct repository *r,
+		       struct packet_reader *request)
+{
+	struct packet_writer writer;
+	packet_writer_init(&writer, 1);
+
+	while (packet_reader_read(request) == PACKET_READ_NORMAL)
+		die(_("bundle-uri: unexpected argument: '%s'"), request->line);
+	if (request->status != PACKET_READ_FLUSH)
+		die(_("bundle-uri: expected flush after arguments"));
+
+	/* TODO: Implement the communication */
+
+	packet_writer_flush(&writer);
+
+	return 0;
+}
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index 4dbc269823c..357111ecce8 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -4,6 +4,7 @@
 #include "hashmap.h"
 #include "strbuf.h"
 
+struct packet_reader;
 struct repository;
 struct string_list;
 
@@ -92,6 +93,12 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * API for serve.c.
+ */
+int bundle_uri_advertise(struct repository *r, struct strbuf *value);
+int bundle_uri_command(struct repository *r, struct packet_reader *request);
+
 /**
  * General API for {transport,connect}.c etc.
  */
diff --git a/serve.c b/serve.c
index 733347f602a..cbf4a143cfe 100644
--- a/serve.c
+++ b/serve.c
@@ -7,6 +7,7 @@
 #include "protocol-caps.h"
 #include "serve.h"
 #include "upload-pack.h"
+#include "bundle-uri.h"
 
 static int advertise_sid = -1;
 static int client_hash_algo = GIT_HASH_SHA1;
@@ -135,6 +136,11 @@ static struct protocol_capability capabilities[] = {
 		.advertise = always_advertise,
 		.command = cap_object_info,
 	},
+	{
+		.name = "bundle-uri",
+		.advertise = bundle_uri_advertise,
+		.command = bundle_uri_command,
+	},
 };
 
 void protocol_v2_advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 1896f671cb3..f21e5e9d33d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -13,7 +13,7 @@ test_expect_success 'test capability advertisement' '
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
-	cat >expect <<-EOF &&
+	cat >expect.base <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs=unborn
@@ -21,8 +21,11 @@ test_expect_success 'test capability advertisement' '
 	server-option
 	object-format=$(test_oid algo)
 	object-info
+	EOF
+	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -342,4 +345,39 @@ test_expect_success 'basics of object-info' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with uploadpack.advertiseBundleURIs' '
+	test_config uploadpack.advertiseBundleURIs true &&
+
+	cat >expect.extra <<-EOF &&
+	bundle-uri
+	EOF
+	cat expect.base \
+	    expect.extra \
+	    expect.trailer >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'basics of bundle-uri: dies if not enabled' '
+	test-tool pkt-line pack >in <<-EOF &&
+	command=bundle-uri
+	0000
+	EOF
+
+	cat >err.expect <<-\EOF &&
+	fatal: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	cat >expect <<-\EOF &&
+	ERR serve: invalid command '"'"'bundle-uri'"'"'
+	EOF
+
+	test_must_fail test-tool serve-v2 --stateless-rpc <in >out 2>err.actual &&
+	test_cmp err.expect err.actual &&
+	test_must_be_empty out
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v4 02/11] t: create test harness for 'bundle-uri' command
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-22 15:14       ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
                         ` (9 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The previous change allowed for a Git server to advertise the
'bundle-uri' command as a capability based on the
uploadPack.advertiseBundleURIs config option. Create a set of tests that
check that this capability is advertised using 'git ls-remote'.

In order to test this functionality across three protocols (file, git,
and http), create lib-bundle-uri-protocol.sh to generalize the tests,
allowing the other test scripts to set an environment variable and
otherwise inherit the setup and tests from this script.

The tests currently only test that the 'bundle-uri' command is
advertised or not. Other actions will be tested as the Git client learns
to request the 'bundle-uri' command and parse its response.

To help with URI escaping, specifically for file paths with a space in
them, extract a 'sed' invocation from t9119-git-svn-info.sh into a
helper function for use here, too.
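
As a rough illustration of what changes when the invocation moves into
the helper: the new helper adds the 'g' flag, so every space is escaped
rather than only the first one on each line. The URI below is
hypothetical, and escape_first_space is only an illustrative name for
the old t9119 invocation, not a function that exists in the tree.

```shell
# Old invocation from t9119: without 'g', only the first space on each
# line is replaced.
escape_first_space () {
	sed 's/ /%20/'
}

# Helper as added to t/test-lib-functions.sh by this patch.
test_uri_escape () {
	sed 's/ /%20/g'
}

# Hypothetical URI containing two spaces.
uri="file:///tmp/trash directory/my repo"

echo "$uri" | escape_first_space	# file:///tmp/trash%20directory/my repo
echo "$uri" | test_uri_escape		# file:///tmp/trash%20directory/my%20repo
```

Only spaces are handled; other characters that would need escaping in
a URI pass through unchanged, which is why the helper describes itself
as a "poor man's" escape.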

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 t/lib-bundle-uri-protocol.sh           | 85 ++++++++++++++++++++++++++
 t/t5730-protocol-v2-bundle-uri-file.sh | 17 ++++++
 t/t5731-protocol-v2-bundle-uri-git.sh  | 17 ++++++
 t/t5732-protocol-v2-bundle-uri-http.sh | 17 ++++++
 t/t9119-git-svn-info.sh                |  2 +-
 t/test-lib-functions.sh                |  7 +++
 6 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 t/lib-bundle-uri-protocol.sh
 create mode 100755 t/t5730-protocol-v2-bundle-uri-file.sh
 create mode 100755 t/t5731-protocol-v2-bundle-uri-git.sh
 create mode 100755 t/t5732-protocol-v2-bundle-uri-http.sh

diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
new file mode 100644
index 00000000000..2da22a39cb8
--- /dev/null
+++ b/t/lib-bundle-uri-protocol.sh
@@ -0,0 +1,85 @@
+# Set up and run tests of the 'bundle-uri' command in protocol v2
+#
+# The test that includes this script should set BUNDLE_URI_PROTOCOL
+# to one of "file", "git", or "http".
+
+BUNDLE_URI_PARENT=
+BUNDLE_URI_REPO_URI=
+BUNDLE_URI_BUNDLE_URI=
+case "$BUNDLE_URI_PROTOCOL" in
+file)
+	BUNDLE_URI_PARENT=file_parent
+	BUNDLE_URI_REPO_URI="file://$PWD/file_parent"
+	BUNDLE_URI_BUNDLE_URI="$BUNDLE_URI_REPO_URI/fake.bdl"
+	test_set_prereq BUNDLE_URI_FILE
+	;;
+git)
+	. "$TEST_DIRECTORY"/lib-git-daemon.sh
+	start_git_daemon --export-all --enable=receive-pack
+	BUNDLE_URI_PARENT="$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent"
+	BUNDLE_URI_REPO_URI="$GIT_DAEMON_URL/parent"
+	BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq BUNDLE_URI_GIT
+	;;
+http)
+	. "$TEST_DIRECTORY"/lib-httpd.sh
+	start_httpd
+	BUNDLE_URI_PARENT="$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
+	BUNDLE_URI_REPO_URI="$HTTPD_URL/smart/http_parent"
+	BUNDLE_URI_BUNDLE_URI="https://example.com/fake.bdl"
+	test_set_prereq BUNDLE_URI_HTTP
+	;;
+*)
+	BUG "Need to pass valid BUNDLE_URI_PROTOCOL (was \"$BUNDLE_URI_PROTOCOL\")"
+	;;
+esac
+
+test_expect_success "setup protocol v2 $BUNDLE_URI_PROTOCOL:// tests" '
+	git init "$BUNDLE_URI_PARENT" &&
+	test_commit -C "$BUNDLE_URI_PARENT" one &&
+	git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs true
+'
+
+case "$BUNDLE_URI_PROTOCOL" in
+http)
+	test_expect_success "setup config for $BUNDLE_URI_PROTOCOL:// tests" '
+		git -C "$BUNDLE_URI_PARENT" config http.receivepack true
+	'
+	;;
+*)
+	;;
+esac
+BUNDLE_URI_BUNDLE_URI_ESCAPED=$(echo "$BUNDLE_URI_BUNDLE_URI" | test_uri_escape)
+
+test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: no bundle-uri" '
+	test_when_finished "rm -f log" &&
+	test_when_finished "git -C \"$BUNDLE_URI_PARENT\" config uploadpack.advertiseBundleURIs true" &&
+	git -C "$BUNDLE_URI_PARENT" config uploadpack.advertiseBundleURIs false &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$BUNDLE_URI_REPO_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	! grep bundle-uri log
+'
+
+test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: have bundle-uri" '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		ls-remote --symref "$BUNDLE_URI_REPO_URI" \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log
+'
diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
new file mode 100755
index 00000000000..37bdb725bca
--- /dev/null
+++ b/t/t5730-protocol-v2-bundle-uri-file.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'file://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+BUNDLE_URI_PROTOCOL=file
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t5731-protocol-v2-bundle-uri-git.sh b/t/t5731-protocol-v2-bundle-uri-git.sh
new file mode 100755
index 00000000000..8add1b37abc
--- /dev/null
+++ b/t/t5731-protocol-v2-bundle-uri-git.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'git://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+BUNDLE_URI_PROTOCOL=git
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t5732-protocol-v2-bundle-uri-http.sh b/t/t5732-protocol-v2-bundle-uri-http.sh
new file mode 100755
index 00000000000..129daa02269
--- /dev/null
+++ b/t/t5732-protocol-v2-bundle-uri-http.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description="Test bundle-uri with protocol v2 and 'http://' transport"
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'http://' transport
+#
+BUNDLE_URI_PROTOCOL=http
+. "$TEST_DIRECTORY"/lib-bundle-uri-protocol.sh
+
+test_done
diff --git a/t/t9119-git-svn-info.sh b/t/t9119-git-svn-info.sh
index 8201c3e808a..088d1c57a88 100755
--- a/t/t9119-git-svn-info.sh
+++ b/t/t9119-git-svn-info.sh
@@ -28,7 +28,7 @@ test_cmp_info () {
 	rm -f tmp.expect tmp.actual
 }
 
-quoted_svnrepo="$(echo $svnrepo | sed 's/ /%20/')"
+quoted_svnrepo="$(echo $svnrepo | test_uri_escape)"
 
 test_expect_success 'setup repository and import' '
 	mkdir info &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 29d914a12ba..5f6966a404b 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1755,6 +1755,13 @@ test_path_is_hidden () {
 	return 1
 }
 
+# Poor man's URI escaping. Good enough for the test suite whose trash
+# directory has a space in it. See 93c3fcbe4d4 (git-svn: attempt to
+# mimic SVN 1.7 URL canonicalization, 2012-07-28) for prior art.
+test_uri_escape() {
+	sed 's/ /%20/g'
+}
+
 # Check that the given command was invoked as part of the
 # trace2-format trace on stdin.
 #
-- 
gitgitgadget



* [PATCH v4 03/11] clone: request the 'bundle-uri' command when available
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-22 15:14       ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
                         ` (8 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.

If the server says it supports 'bundle-uri', teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.
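
For readers less familiar with the pkt-line framing, here is a minimal
sketch of the bytes that make up the request itself. It leaves out the
capability lines (such as the agent) that the client re-sends alongside
the command, and the function names pkt_line and bundle_uri_request are
made up for this illustration.

```shell
# Frame one text line as a pkt-line: four hex digits of length, then the
# payload and a trailing LF. The length counts the prefix itself (4) and
# the LF (1), hence the "+ 5". Assumes an ASCII payload.
pkt_line () {
	printf '%04x%s\n' $((${#1} + 5)) "$1"
}

# The command line, then a delim-pkt and a flush-pkt, mirroring the
# packet_write_fmt()/packet_delim()/packet_flush() calls in
# get_remote_bundle_uri(). No arguments are sent.
bundle_uri_request () {
	pkt_line "command=bundle-uri" &&
	printf '0001' &&	# delim-pkt
	printf '0000'		# flush-pkt
}

bundle_uri_request
```

The command line comes out as "0017command=bundle-uri" followed by LF:
18 payload bytes, one LF, and four bytes of length prefix make 23, which
is 0x17.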

At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.

The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c              |  6 +++++
 connect.c                    | 44 +++++++++++++++++++++++++++++++
 remote.h                     |  5 ++++
 t/lib-bundle-uri-protocol.sh | 19 ++++++++++++++
 transport-helper.c           | 13 +++++++++
 transport-internal.h         |  7 +++++
 transport.c                  | 51 ++++++++++++++++++++++++++++++++++++
 transport.h                  | 19 ++++++++++++++
 8 files changed, 164 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 547d6464b3c..39364c25b15 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1266,6 +1266,12 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
+	/*
+	 * Populate transport->got_remote_bundle_uri and
+	 * transport->bundles. We might get nothing.
+	 */
+	transport_get_remote_bundle_uri(transport);
+
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 
diff --git a/connect.c b/connect.c
index 5ea53deda23..624a10f18ee 100644
--- a/connect.c
+++ b/connect.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "protocol.h"
 #include "alias.h"
+#include "bundle-uri.h"
 
 static char *server_capabilities_v1;
 static struct strvec server_capabilities_v2 = STRVEC_INIT;
@@ -491,6 +492,49 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	}
 }
 
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc)
+{
+	int line_nr = 1;
+
+	/* Assert bundle-uri support */
+	server_supports_v2("bundle-uri", 1);
+
+	/* (Re-)send capabilities */
+	send_capabilities(fd_out, reader);
+
+	/* Send command */
+	packet_write_fmt(fd_out, "command=bundle-uri\n");
+	packet_delim(fd_out);
+
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *line = reader->line;
+		line_nr++;
+
+		if (!bundle_uri_parse_line(bundles, line))
+			continue;
+
+		return error(_("error on bundle-uri response line %d: %s"),
+			     line_nr, line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		return error(_("expected flush after bundle-uri listing"));
+
+	/*
+	 * Might die(), but obscure enough that that's OK, e.g. in
+	 * serve.c we'll call BUG() on its equivalent (the
+	 * PACKET_READ_RESPONSE_END check).
+	 */
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after bundle-uri listing"));
+
+	return 0;
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     struct transport_ls_refs_options *transport_options,
diff --git a/remote.h b/remote.h
index 1c4621b414b..1ebbe42792e 100644
--- a/remote.h
+++ b/remote.h
@@ -234,6 +234,11 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
+/* Used for protocol v2 in order to retrieve bundle URIs from a remote */
+struct bundle_list;
+int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
+			  struct bundle_list *bundles, int stateless_rpc);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
 /*
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 2da22a39cb8..d44c6e10f9e 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -83,3 +83,22 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 	# Server advertised bundle-uri capability
 	grep "< bundle-uri" log
 '
+
+test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
+	test_when_finished "rm -rf log cloned" &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c protocol.version=2 \
+		clone "$BUNDLE_URI_REPO_URI" cloned \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log &&
+
+	# Client issued bundle-uri command
+	grep "> command=bundle-uri" log
+'
diff --git a/transport-helper.c b/transport-helper.c
index e95267a4ab5..3ea7c2bb5ad 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1267,9 +1267,22 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
 	return ret;
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	get_helper(transport);
+
+	if (process_connect(transport, 0)) {
+		do_take_over(transport);
+		return transport->vtable->get_bundle_uri(transport);
+	}
+
+	return -1;
+}
+
 static struct transport_vtable vtable = {
 	.set_option	= set_helper_option,
 	.get_refs_list	= get_refs_list,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs,
 	.push_refs	= push_refs,
 	.connect	= connect_helper,
diff --git a/transport-internal.h b/transport-internal.h
index c4ca0b733ac..90ea749e5cf 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -26,6 +26,13 @@ struct transport_vtable {
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
 				     struct transport_ls_refs_options *transport_options);
 
+	/**
+	 * Populates the remote side's bundle-uri under protocol v2,
+	 * if the "bundle-uri" capability was advertised. Returns 0 if
+	 * OK, negative values on error.
+	 */
+	int (*get_bundle_uri)(struct transport *transport);
+
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
diff --git a/transport.c b/transport.c
index e7b97194c10..b6f279e92cb 100644
--- a/transport.c
+++ b/transport.c
@@ -22,6 +22,7 @@
 #include "protocol.h"
 #include "object-store.h"
 #include "color.h"
+#include "bundle-uri.h"
 
 static int transport_use_color = -1;
 static char transport_colors[][COLOR_MAXLEN] = {
@@ -359,6 +360,32 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	return handshake(transport, for_push, options, 1);
 }
 
+static int get_bundle_uri(struct transport *transport)
+{
+	struct git_transport_data *data = transport->data;
+	struct packet_reader reader;
+	int stateless_rpc = transport->stateless_rpc;
+
+	if (!transport->bundles) {
+		CALLOC_ARRAY(transport->bundles, 1);
+		init_bundle_list(transport->bundles);
+	}
+
+	/*
+	 * "Support" protocol v0 and v2 without bundle-uri support by
+	 * silently degrading to a NOOP.
+	 */
+	if (!server_supports_v2("bundle-uri", 0))
+		return 0;
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	return get_remote_bundle_uri(data->fd[1], &reader,
+				     transport->bundles, stateless_rpc);
+}
+
 static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
@@ -902,6 +929,7 @@ static int disconnect_git(struct transport *transport)
 
 static struct transport_vtable taken_over_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.disconnect	= disconnect_git
@@ -1054,6 +1082,7 @@ static struct transport_vtable bundle_vtable = {
 
 static struct transport_vtable builtin_smart_vtable = {
 	.get_refs_list	= get_refs_via_connect,
+	.get_bundle_uri = get_bundle_uri,
 	.fetch_refs	= fetch_refs_via_pack,
 	.push_refs	= git_transport_push,
 	.connect	= connect_git,
@@ -1068,6 +1097,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	ret->progress = isatty(2);
 	string_list_init_dup(&ret->pack_lockfiles);
 
+	CALLOC_ARRAY(ret->bundles, 1);
+	init_bundle_list(ret->bundles);
+
 	if (!remote)
 		BUG("No remote provided to transport_get()");
 
@@ -1482,6 +1514,23 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
+int transport_get_remote_bundle_uri(struct transport *transport)
+{
+	const struct transport_vtable *vtable = transport->vtable;
+
+	/* Check config only once. */
+	if (transport->got_remote_bundle_uri)
+		return 0;
+	transport->got_remote_bundle_uri = 1;
+
+	if (!vtable->get_bundle_uri)
+		return error(_("bundle-uri operation not supported by protocol"));
+
+	if (vtable->get_bundle_uri(transport) < 0)
+		return error(_("could not retrieve server-advertised bundle-uri list"));
+	return 0;
+}
+
 void transport_unlock_pack(struct transport *transport, unsigned int flags)
 {
 	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
@@ -1512,6 +1561,8 @@ int transport_disconnect(struct transport *transport)
 		ret = transport->vtable->disconnect(transport);
 	if (transport->got_remote_refs)
 		free_refs((void *)transport->remote_refs);
+	clear_bundle_list(transport->bundles);
+	free(transport->bundles);
 	free(transport);
 	return ret;
 }
diff --git a/transport.h b/transport.h
index b5bf7b3e704..85150f504fb 100644
--- a/transport.h
+++ b/transport.h
@@ -62,6 +62,7 @@ enum transport_family {
 	TRANSPORT_FAMILY_IPV6
 };
 
+struct bundle_list;
 struct transport {
 	const struct transport_vtable *vtable;
 
@@ -76,6 +77,18 @@ struct transport {
 	 */
 	unsigned got_remote_refs : 1;
 
+	/**
+	 * Indicates whether we already called the vtable's get_bundle_uri();
+	 * set by transport.c::transport_get_remote_bundle_uri().
+	 */
+	unsigned got_remote_bundle_uri : 1;
+
+	/*
+	 * The results of "command=bundle-uri", if both sides support
+	 * the "bundle-uri" capability.
+	 */
+	struct bundle_list *bundles;
+
 	/*
 	 * Transports that call take-over destroys the data specific to
 	 * the transport type while doing so, and cannot be reused.
@@ -281,6 +294,12 @@ void transport_ls_refs_options_release(struct transport_ls_refs_options *opts);
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    struct transport_ls_refs_options *transport_options);
 
+/**
+ * Retrieve bundle URI(s) from a remote. Populates "struct
+ * transport"'s "bundles" and "got_remote_bundle_uri".
+ */
+int transport_get_remote_bundle_uri(struct transport *transport);
+
 /*
  * Fetch the hash algorithm used by a remote.
  *
-- 
gitgitgadget



* [PATCH v4 04/11] bundle-uri client: add boolean transfer.bundleURI setting
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-22 15:14       ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The yet-to-be-introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.
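
A minimal sketch of opting in on both sides, for anyone who wants to
try this locally. The scratch directory and the repository names
server.git and client are made up; the two config keys are the ones
named in this series. The server repository here is empty and has no
bundle.* config, so there is nothing to download and the clone proceeds
as usual (Git warns that the cloned repository is empty).

```shell
# Hypothetical scratch location; nothing here touches an existing
# repository.
tmp=$(mktemp -d) &&
git init --quiet --bare "$tmp/server.git" &&

# Server side: opt in to advertising the 'bundle-uri' capability.
git -C "$tmp/server.git" config uploadpack.advertiseBundleURIs true &&

# Client side: opt in for this one invocation instead of writing the
# setting to a config file.
git -c protocol.version=2 -c transfer.bundleURI=true \
	clone --quiet "file://$tmp/server.git" "$tmp/client"
```

Setting GIT_TRACE_PACKET for the clone, as the tests in
t/lib-bundle-uri-protocol.sh do, is one way to check whether the
'bundle-uri' command was actually issued.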

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 Documentation/config/transfer.txt |  6 ++++++
 t/lib-bundle-uri-protocol.sh      | 19 ++++++++++++++++++-
 transport.c                       |  8 ++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index 264812cca4d..c3ac767d1e4 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -115,3 +115,9 @@ transfer.unpackLimit::
 transfer.advertiseSID::
 	Boolean. When true, client and server processes will advertise their
 	unique session IDs to their remote counterpart. Defaults to false.
+
+transfer.bundleURI::
+	When `true`, local `git clone` commands will request bundle
+	information from the remote server (if advertised) and download
+	bundles before continuing the clone through the Git protocol.
+	Defaults to `false`.
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index d44c6e10f9e..75ea8c4418f 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -85,10 +85,11 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 '
 
 test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
-	test_when_finished "rm -rf log cloned" &&
+	test_when_finished "rm -rf log cloned cloned2" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
 	git \
+		-c transfer.bundleURI=false \
 		-c protocol.version=2 \
 		clone "$BUNDLE_URI_REPO_URI" cloned \
 		>actual 2>err &&
@@ -99,6 +100,22 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	# Server advertised bundle-uri capability
 	grep "< bundle-uri" log &&
 
+	# Client did not issue bundle-uri command
+	! grep "> command=bundle-uri" log &&
+
+	GIT_TRACE_PACKET="$PWD/log" \
+	git \
+		-c transfer.bundleURI=true \
+		-c protocol.version=2 \
+		clone "$BUNDLE_URI_REPO_URI" cloned2 \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log &&
+
 	# Client issued bundle-uri command
 	grep "> command=bundle-uri" log
 '
diff --git a/transport.c b/transport.c
index b6f279e92cb..b4cf2c0252e 100644
--- a/transport.c
+++ b/transport.c
@@ -1516,6 +1516,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 
 int transport_get_remote_bundle_uri(struct transport *transport)
 {
+	int value = 0;
 	const struct transport_vtable *vtable = transport->vtable;
 
 	/* Check config only once. */
@@ -1523,6 +1524,13 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 		return 0;
 	transport->got_remote_bundle_uri = 1;
 
+	/*
+	 * Don't request bundle-uri from the server unless configured to
+	 * do so by the transfer.bundleURI=true config option.
+	 */
+	if (git_config_get_bool("transfer.bundleuri", &value) || !value)
+		return 0;
+
 	if (!vtable->get_bundle_uri)
 		return error(_("bundle-uri operation not supported by protocol"));
 
-- 
gitgitgadget



* [PATCH v4 05/11] transport: rename got_remote_heads
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The 'got_remote_heads' member of 'struct git_transport_data' was used
historically to indicate that the initial server connection was made and
the ref advertisement was returned. With protocol v2, that initial
handshake does not necessarily include the ref advertisement, so this
member is not an accurate name. Thankfully, all uses of the member are
only checking to see if the handshake should take place, not whether or
not some local data has the ref advertisement.

Rename the member to 'finished_handshake' to represent the proper state.
Note that the variable is only set to 1 during the handshake() method.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 transport.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index b4cf2c0252e..757ad552bf3 100644
--- a/transport.c
+++ b/transport.c
@@ -198,7 +198,7 @@ struct git_transport_data {
 	struct git_transport_options options;
 	struct child_process *conn;
 	int fd[2];
-	unsigned got_remote_heads : 1;
+	unsigned finished_handshake : 1;
 	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
@@ -345,7 +345,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
 	}
-	data->got_remote_heads = 1;
+	data->finished_handshake = 1;
 	transport->hash_algo = reader.hash_algo;
 
 	if (reader.line_peeked)
@@ -421,7 +421,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.negotiation_tips = data->options.negotiation_tips;
 	args.reject_shallow_remote = transport->smart_options->reject_shallow;
 
-	if (!data->got_remote_heads) {
+	if (!data->finished_handshake) {
 		int i;
 		int must_list_refs = 0;
 		for (i = 0; i < nr_heads; i++) {
@@ -461,7 +461,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 			  to_fetch, nr_heads, &data->shallow,
 			  &transport->pack_lockfiles, data->version);
 
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 	data->options.self_contained_and_connected =
 		args.self_contained_and_connected;
 	data->options.connectivity_checked = args.connectivity_checked;
@@ -846,7 +846,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	if (transport_color_config() < 0)
 		return -1;
 
-	if (!data->got_remote_heads)
+	if (!data->finished_handshake)
 		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
@@ -894,7 +894,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		ret = finish_connect(data->conn);
 	data->conn = NULL;
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 
 	return ret;
 }
@@ -914,7 +914,7 @@ static int disconnect_git(struct transport *transport)
 {
 	struct git_transport_data *data = transport->data;
 	if (data->conn) {
-		if (data->got_remote_heads && !transport->stateless_rpc)
+		if (data->finished_handshake && !transport->stateless_rpc)
 			packet_flush(data->fd[1]);
 		close(data->fd[0]);
 		if (data->fd[1] >= 0)
@@ -949,7 +949,7 @@ void transport_take_over(struct transport *transport,
 	data->conn = child;
 	data->fd[0] = data->conn->out;
 	data->fd[1] = data->conn->in;
-	data->got_remote_heads = 0;
+	data->finished_handshake = 0;
 	transport->data = data;
 
 	transport->vtable = &taken_over_vtable;
@@ -1150,7 +1150,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		ret->smart_options = &(data->options);
 
 		data->conn = NULL;
-		data->got_remote_heads = 0;
+		data->finished_handshake = 0;
 	} else {
 		/* Unknown protocol in URL. Pass to external handler. */
 		int len = external_specification_len(url);
-- 
gitgitgadget



* [PATCH v4 06/11] bundle-uri client: add helper for testing server
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Ævar Arnfjörð Bjarmason via GitGitGadget
  2022-12-30 16:31         ` Jeff King
  2022-12-22 15:14       ` [PATCH v4 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
                         ` (5 subsequent siblings)
  11 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server and for
exercising the parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.
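
To illustrate the "handshake on demand" idea described above, here is a
minimal standalone sketch. It is not the transport.c code: the struct
conn_state type and the do_handshake() and get_bundle_list() names are
hypothetical stand-ins, and the counter exists only so the behavior can
be observed.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-connection transport data. */
struct conn_state {
	unsigned finished_handshake : 1;
	int handshakes_performed;	/* bookkeeping for this example only */
};

/* Stand-in for the real handshake; it only records that it ran. */
static void do_handshake(struct conn_state *state)
{
	state->handshakes_performed++;
	state->finished_handshake = 1;
}

/*
 * Callers do not need to know whether the handshake already happened:
 * it is performed here on first use and skipped afterwards.
 */
static int get_bundle_list(struct conn_state *state)
{
	assert(state);
	if (!state->finished_handshake)
		do_handshake(state);
	/* ... the bundle-uri request would be issued here ... */
	return 0;
}
```

Calling get_bundle_list() twice on a zero-initialized struct conn_state
performs the handshake only once, which is the property the extra check
is meant to provide for callers such as the test helper.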

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 t/helper/test-bundle-uri.c   | 46 ++++++++++++++++++++++++++++++++++++
 t/lib-bundle-uri-protocol.sh | 46 ++++++++++++++++++++++++++++++++++++
 transport.c                  |  7 ++++++
 3 files changed, 99 insertions(+)

diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 25afd393428..f8159187014 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -3,6 +3,10 @@
 #include "bundle-uri.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "transport.h"
+#include "ref-filter.h"
+#include "remote.h"
+#include "refs.h"
 
 enum input_mode {
 	KEY_VALUE_PAIRS,
@@ -68,6 +72,46 @@ usage:
 	usage_with_options(usage, options);
 }
 
+static int cmd_ls_remote(int argc, const char **argv)
+{
+	const char *uploadpack = NULL;
+	struct string_list server_options = STRING_LIST_INIT_DUP;
+	const char *dest;
+	struct remote *remote;
+	struct transport *transport;
+	int status = 0;
+
+	dest = argc > 1 ? argv[1] : NULL;
+
+	remote = remote_get(dest);
+	if (!remote) {
+		if (dest)
+			die(_("bad repository '%s'"), dest);
+		die(_("no remote configured to get bundle URIs from"));
+	}
+	if (!remote->url_nr)
+		die(_("remote '%s' has no configured URL"), dest);
+
+	transport = transport_get(remote, NULL);
+	if (uploadpack)
+		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
+	if (server_options.nr)
+		transport->server_options = &server_options;
+
+	if (transport_get_remote_bundle_uri(transport) < 0) {
+		error(_("could not get the bundle-uri list"));
+		status = 1;
+		goto cleanup;
+	}
+
+	print_bundle_list(stdout, transport->bundles);
+
+cleanup:
+	if (transport_disconnect(transport))
+		return 1;
+	return status;
+}
+
 int cmd__bundle_uri(int argc, const char **argv)
 {
 	const char *usage[] = {
@@ -88,6 +132,8 @@ int cmd__bundle_uri(int argc, const char **argv)
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
 	if (!strcmp(argv[1], "parse-config"))
 		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
+	if (!strcmp(argv[1], "ls-remote"))
+		return cmd_ls_remote(argc - 1, argv + 1);
 	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
 
 usage:
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 75ea8c4418f..5620e230387 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -119,3 +119,49 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	# Client issued bundle-uri command
 	grep "> command=bundle-uri" log
 '
+
+# The remaining tests will all assume transfer.bundleURI=true
+#
+# This test can be removed when transfer.bundleURI is enabled by default.
+test_expect_success 'enable transfer.bundleURI for remaining tests' '
+	git config --global transfer.bundleURI true
+'
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 and extra data" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.only.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED" &&
+
+	# Extra data should be ignored
+	test_config -C "$BUNDLE_URI_PARENT" bundle.only.extra bogus &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
diff --git a/transport.c b/transport.c
index 757ad552bf3..0f35114a13e 100644
--- a/transport.c
+++ b/transport.c
@@ -371,6 +371,13 @@ static int get_bundle_uri(struct transport *transport)
 		init_bundle_list(transport->bundles);
 	}
 
+	if (!data->finished_handshake) {
+		struct ref *refs = handshake(transport, 0, NULL, 0);
+
+		if (refs)
+			free_refs(refs);
+	}
+
 	/*
 	 * "Support" protocol v0 and v2 without bundle-uri support by
 	 * silently degrading to a NOOP.
-- 
gitgitgadget



* [PATCH v4 07/11] bundle-uri: serve bundle.* keys from config
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.
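
As a rough illustration of what a key=value packet line looks like on
the wire, the sketch below filters on the "bundle." prefix and formats
one pkt-line into a buffer. It assumes the usual pkt-line framing (four
hex digits giving the total length, including those four bytes) and
omits any trailing newline. The bundle_key_to_pkt_line() name and its
buffer interface are hypothetical; the patch itself uses
packet_write_fmt().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Largest total pkt-line length (4 length bytes + 65516 data bytes). */
#define MAX_PKT_LINE 65520

/*
 * Format "key=value" as one pkt-line into out. Returns the number of
 * bytes written (excluding the NUL), 0 if the key is outside the
 * "bundle." namespace, or -1 if the line does not fit.
 */
static int bundle_key_to_pkt_line(const char *key, const char *value,
				  char *out, size_t outlen)
{
	size_t total;

	assert(key && value && out);
	if (strncmp(key, "bundle.", 7))
		return 0;

	/* 4 length digits + key + '=' + value */
	total = 4 + strlen(key) + 1 + strlen(value);
	if (total > MAX_PKT_LINE || total + 1 > outlen)
		return -1;

	snprintf(out, outlen, "%04zx%s=%s", total, key, value);
	return (int)total;
}
```

With key "bundle.version" and value "1" this yields the 20-byte line
"0014bundle.version=1", while a key such as "core.bare" is skipped.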

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c                 | 16 +++++++++++++++-
 t/lib-bundle-uri-protocol.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 28d8966005e..26ff4b062d7 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -581,6 +581,16 @@ cached:
 	return advertise_bundle_uri;
 }
 
+static int config_to_packet_line(const char *key, const char *value, void *data)
+{
+	struct packet_reader *writer = data;
+
+	if (!strncmp(key, "bundle.", 7))
+		packet_write_fmt(writer->fd, "%s=%s", key, value);
+
+	return 0;
+}
+
 int bundle_uri_command(struct repository *r,
 		       struct packet_reader *request)
 {
@@ -592,7 +602,11 @@ int bundle_uri_command(struct repository *r,
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("bundle-uri: expected flush after arguments"));
 
-	/* TODO: Implement the communication */
+	/*
+	 * Write all "bundle.*" config values to the client as key=value
+	 * packet lines.
+	 */
+	git_config(config_to_packet_line, &writer);
 
 	packet_writer_flush(&writer);
 
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 5620e230387..3022ea4a95b 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -136,6 +136,8 @@ test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
 	EOF
 
 	test-tool bundle-uri \
@@ -157,6 +159,36 @@ test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol
 	[bundle]
 		version = 1
 		mode = all
+	[bundle "only"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED
+	EOF
+
+	test-tool bundle-uri \
+		ls-remote \
+		"$BUNDLE_URI_REPO_URI" \
+		>actual &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success "test bundle-uri with $BUNDLE_URI_PROTOCOL:// using protocol v2 with list" '
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle1.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl" &&
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle2.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl" &&
+	test_config -C "$BUNDLE_URI_PARENT" \
+		bundle.bundle3.uri "$BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl" &&
+
+	# All data about bundle URIs
+	cat >expect <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "bundle1"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-1.bdl
+	[bundle "bundle2"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-2.bdl
+	[bundle "bundle3"]
+		uri = $BUNDLE_URI_BUNDLE_URI_ESCAPED-3.bdl
 	EOF
 
 	test-tool bundle-uri \
-- 
gitgitgadget



* [PATCH v4 08/11] strbuf: introduce strbuf_strip_file_from_path()
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The strbuf_parent_directory() method was added as a static method in
contrib/scalar by d0feac4e8c0 (scalar: 'register' sets recommended
config and starts maintenance, 2021-12-03) and then removed in
65f6a9eb0b9 (scalar: constrain enlistment search, 2022-08-18), but now
there is a need for a similar method in the bundle URI feature.

Re-add the method, this time in strbuf.c, but with a new name:
strbuf_strip_file_from_path(). The method requirements are slightly
modified to allow a trailing slash, in which case nothing is done, which
makes the name change valuable.
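
For readers without the strbuf API at hand, the behavior can be
sketched on a plain NUL-terminated buffer. This is an illustration
only: strip_file_from_path() is a hypothetical name, and it treats only
'/' as a separator, whereas find_last_dir_sep() may accept other
separators on some platforms.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative stand-in for strbuf_strip_file_from_path() that works
 * in place on a C string instead of a struct strbuf.
 */
static void strip_file_from_path(char *path)
{
	char *sep;

	assert(path);
	sep = strrchr(path, '/');
	if (sep)
		sep[1] = '\0';	/* keep up to and including the separator */
	else
		path[0] = '\0';	/* no separator: nothing but a filename */
}
```

A path that already ends in a separator is left untouched because the
last separator is also the last character.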

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 strbuf.c |  6 ++++++
 strbuf.h | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 0890b1405c5..c383f41a3c5 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1200,3 +1200,9 @@ int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 	free(path2);
 	return res;
 }
+
+void strbuf_strip_file_from_path(struct strbuf *sb)
+{
+	char *path_sep = find_last_dir_sep(sb->buf);
+	strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0);
+}
diff --git a/strbuf.h b/strbuf.h
index 76965a17d44..f6dbb9681ee 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -664,6 +664,17 @@ int launch_sequence_editor(const char *path, struct strbuf *buffer,
 int strbuf_edit_interactively(struct strbuf *buffer, const char *path,
 			      const char *const *env);
 
+/*
+ * Remove the filename from the provided path string. If the path
+ * contains a trailing separator, then the path is considered a directory
+ * and nothing is modified.
+ *
+ * Examples:
+ * - "/path/to/file" -> "/path/to/"
+ * - "/path/to/dir/" -> "/path/to/dir/"
+ */
+void strbuf_strip_file_from_path(struct strbuf *sb);
+
 void strbuf_add_lines(struct strbuf *sb,
 		      const char *prefix,
 		      const char *buf,
-- 
gitgitgadget



* [PATCH v4 09/11] bundle-uri: allow relative URLs in bundle lists
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (7 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

Bundle providers may want to distribute their bundle data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.
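
The general idea of resolving a relative 'uri' value against the
location of the bundle list can be sketched as follows. This is a
simplified, hypothetical resolver (resolve_relative_uri() is not a Git
function) and the results of Git's relative_url() may differ from it.
The sketch assumes '/' separators, a base shorter than 1024 bytes, and
only handles leading "../" components.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Resolve rel against the directory part of base and write the result
 * to out. Each leading "../" in rel strips one path component from the
 * base directory. Returns 0 on success, or -1 if the result does not
 * fit or there is no component left to strip.
 */
static int resolve_relative_uri(const char *base, const char *rel,
				char *out, size_t outlen)
{
	char dir[1024];
	char *scheme, *root, *sep;

	assert(base && rel && out);
	if (strlen(base) >= sizeof(dir))
		return -1;
	strcpy(dir, base);

	/* The first '/' after any "scheme://host" marks the path root. */
	scheme = strstr(dir, "://");
	root = strchr(scheme ? scheme + 3 : dir, '/');
	if (!root)
		return -1;

	/* Drop the filename: keep up to and including the last '/'. */
	sep = strrchr(dir, '/');
	sep[1] = '\0';

	while (!strncmp(rel, "../", 3)) {
		char *end = dir + strlen(dir) - 1;	/* trailing '/' */

		if (end == root)
			return -1;	/* nothing left to strip */
		*end = '\0';
		sep = strrchr(dir, '/');
		sep[1] = '\0';
		rel += 3;
	}

	if ((size_t)snprintf(out, outlen, "%s%s", dir, rel) >= outlen)
		return -1;
	return 0;
}
```

With a list served from https://example.com/bundles/list, the value
"bundle.bdl" resolves next to the list, "../other/b.bdl" resolves one
directory up, and "../../x" is rejected because it would climb above
the path root.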

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c                | 16 +++++++-
 bundle-uri.h                | 14 +++++++
 t/helper/test-bundle-uri.c  |  2 +
 t/t5750-bundle-uri-parse.sh | 82 +++++++++++++++++++++++++++++++++++++
 transport.c                 |  3 ++
 5 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index 26ff4b062d7..69929d363cc 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -7,6 +7,7 @@
 #include "hashmap.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "remote.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -49,6 +50,7 @@ void clear_bundle_list(struct bundle_list *list)
 
 	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
 	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+	free(list->baseURI);
 }
 
 int for_all_bundles_in_list(struct bundle_list *list,
@@ -163,7 +165,7 @@ static int bundle_list_update(const char *key, const char *value,
 	if (!strcmp(subkey, "uri")) {
 		if (bundle->uri)
 			return -1;
-		bundle->uri = xstrdup(value);
+		bundle->uri = relative_url(list->baseURI, value, NULL);
 		return 0;
 	}
 
@@ -190,6 +192,18 @@ int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
+	if (!list->baseURI) {
+		struct strbuf baseURI = STRBUF_INIT;
+		strbuf_addstr(&baseURI, uri);
+
+		/*
+		 * If the URI does not end with a trailing slash, then
+		 * remove the filename portion of the path. This is
+		 * important for relative URIs.
+		 */
+		strbuf_strip_file_from_path(&baseURI);
+		list->baseURI = strbuf_detach(&baseURI, NULL);
+	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
diff --git a/bundle-uri.h b/bundle-uri.h
index 357111ecce8..c505444bc75 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -61,6 +61,20 @@ struct bundle_list {
 	int version;
 	enum bundle_list_mode mode;
 	struct hashmap bundles;
+
+	/**
+	 * The baseURI of a bundle_list is the URI that provided the list.
+	 *
+	 * In the case of the 'bundle-uri' protocol v2 command, the base
+	 * URI is the URI of the Git remote.
+	 *
+	 * Otherwise, the bundle list was downloaded over HTTP from some
+	 * known URI. 'baseURI' is set to that value.
+	 *
+	 * The baseURI is used as the base for any relative URIs
+	 * advertised by the bundle list at that location.
+	 */
+	char *baseURI;
 };
 
 void init_bundle_list(struct bundle_list *list);
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index f8159187014..5df5bc3b89e 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 	init_bundle_list(&list);
 
+	list.baseURI = xstrdup("<uri>");
+
 	switch (mode) {
 	case KEY_VALUE_PAIRS:
 		if (argc != 1)
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index c2fe3f9c5a5..7b4f930e532 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -30,6 +30,58 @@ test_expect_success 'bundle_uri_parse_line() just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'bundle_uri_parse_line(): relative URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'bundle_uri_parse_line(): relative URIs and parent paths' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=../../bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/../bundle.bdl
+	EOF
+
+	# TODO: We would prefer if parsing a bundle list would not cause
+	# a die() and instead would give a warning and allow the rest of
+	# a Git command to continue. This test_must_fail is necessary for
+	# now until the interface for relative_url() allows for reporting
+	# an error instead of die()ing.
+	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
+	grep "fatal: cannot strip one component off url" err
+'
+
 test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
 	cat >in <<-\EOF &&
 	=bogus-value
@@ -136,6 +188,36 @@ test_expect_success 'parse config format: just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: relative URIs' '
+	cat >in <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = bundle.bdl
+	[bundle "two"]
+		uri = ../bundle.bdl
+	[bundle "three"]
+		uri = sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'parse config format edge cases: empty key or value' '
 	cat >in1 <<-\EOF &&
 	= bogus-value
diff --git a/transport.c b/transport.c
index 0f35114a13e..241f8a6ba2d 100644
--- a/transport.c
+++ b/transport.c
@@ -1538,6 +1538,9 @@ int transport_get_remote_bundle_uri(struct transport *transport)
 	if (git_config_get_bool("transfer.bundleuri", &value) || !value)
 		return 0;
 
+	if (!transport->bundles->baseURI)
+		transport->bundles->baseURI = xstrdup(transport->url);
+
 	if (!vtable->get_bundle_uri)
 		return error(_("bundle-uri operation not supported by protocol"));
 
-- 
gitgitgadget



* [PATCH v4 10/11] bundle-uri: download bundles from an advertised list
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (8 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-22 15:14       ` [PATCH v4 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
  2022-12-25 11:35       ` [PATCH v4 00/11] Bundle URIs IV: advertise over protocol v2 Junio C Hamano
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The logic in fetch_bundle_uri() is useful for the --bundle-uri option of
'git clone', but is not helpful when the clone operation discovers a
list of URIs from the bundle-uri protocol v2 command. To actually
download and unbundle the advertised bundles, we need a different
mechanism.

Create the new fetch_bundle_list() method which is very similar to
fetch_bundle_uri() except that it relies on download_bundle_list()
instead of fetch_bundle_uri_internal(). The download_bundle_list()
method will recursively call fetch_bundle_uri_internal() if any of the
advertised URIs serve a bundle list instead of a bundle. This will also
follow the bundle.mode setting from the input list: "any" will
download only one such URI while "all" will download data from all of
the URIs.
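
The difference between the two modes can be sketched with a small
standalone loop. Everything here is hypothetical (the enum list_mode
type, download_by_mode(), and the fake_download() callback that fails
for any URI not starting with "ok"); it only models which URIs are
attempted, not how download_bundle_list() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum list_mode { MODE_ALL, MODE_ANY };

/* Hypothetical callback type: returns 0 if the URI was downloaded. */
typedef int (*download_fn)(const char *uri, void *ctx);

/*
 * Attempt downloads according to mode and return how many succeeded.
 *  - MODE_ANY: stop as soon as one URI has been downloaded.
 *  - MODE_ALL: attempt every URI in the list.
 */
static size_t download_by_mode(enum list_mode mode, const char **uris,
			       size_t nr, download_fn download, void *ctx)
{
	size_t i, count = 0;

	assert(download);
	for (i = 0; i < nr; i++) {
		if (mode == MODE_ANY && count)
			break;
		if (!download(uris[i], ctx))
			count++;
	}
	return count;
}

/* Example callback: counts attempts in ctx, succeeds for "ok..." URIs. */
static int fake_download(const char *uri, void *ctx)
{
	int *attempts = ctx;

	(*attempts)++;
	return strncmp(uri, "ok", 2) ? -1 : 0;
}
```

For the list { "bad-1", "ok-2", "ok-3" }, the "any" mode stops after
two attempts with one success, while the "all" mode makes three
attempts and reports two successes.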

In an identical way to fetch_bundle_uri(), the bundles are unbundled
after all of the bundle lists have been expanded and all necessary URIs
have been downloaded.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c | 21 +++++++++++++++++++++
 bundle-uri.h | 14 ++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index 69929d363cc..36268dda172 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -577,6 +577,27 @@ cleanup:
 	return result;
 }
 
+int fetch_bundle_list(struct repository *r, struct bundle_list *list)
+{
+	int result;
+	struct bundle_list global_list;
+
+	init_bundle_list(&global_list);
+
+	/* If a bundle is added to this global list, then it is required. */
+	global_list.mode = BUNDLE_MODE_ALL;
+
+	if ((result = download_bundle_list(r, list, &global_list, 0)))
+		goto cleanup;
+
+	result = unbundle_all_bundles(r, &global_list);
+
+cleanup:
+	for_all_bundles_in_list(&global_list, unlink_bundle, NULL);
+	clear_bundle_list(&global_list);
+	return result;
+}
+
 /**
  * API for serve.c.
  */
diff --git a/bundle-uri.h b/bundle-uri.h
index c505444bc75..d5e89f1671c 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -107,6 +107,20 @@ int bundle_uri_parse_config_format(const char *uri,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
+/**
+ * Given a bundle list that was already advertised (likely by the
+ * bundle-uri protocol v2 verb) at the given uri, fetch and unbundle the
+ * bundles according to the bundle strategy of that list.
+ *
+ * It is expected that the given 'list' is initialized, including its
+ * 'baseURI' value.
+ *
+ * Returns non-zero if there was an error trying to download the list
+ * or any of its advertised bundles.
+ */
+int fetch_bundle_list(struct repository *r,
+		      struct bundle_list *list);
+
 /**
  * API for serve.c.
  */
-- 
gitgitgadget



* [PATCH v4 11/11] clone: unbundle the advertised bundles
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (9 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
@ 2022-12-22 15:14       ` Derrick Stolee via GitGitGadget
  2022-12-25 11:35       ` [PATCH v4 00/11] Bundle URIs IV: advertise over protocol v2 Junio C Hamano
  11 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-12-22 15:14 UTC (permalink / raw)
  To: git
  Cc: gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'. This takes place between the ref advertisement and
the object data download, and stateful connections will linger while
the client downloads bundles. In the future, we should consider closing
the remote connection during this process.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c              | 25 ++++++++++++---
 t/lib-bundle-uri-protocol.sh | 21 +++++++++++--
 t/t5601-clone.sh             | 59 ++++++++++++++++++++++++++++++++++++
 3 files changed, 98 insertions(+), 7 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 39364c25b15..527839662b0 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1266,11 +1266,26 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
 
-	/*
-	 * Populate transport->got_remote_bundle_uri and
-	 * transport->bundle_uri. We might get nothing.
-	 */
-	transport_get_remote_bundle_uri(transport);
+	if (!bundle_uri) {
+		/*
+		 * Populate transport->got_remote_bundle_uri and
+		 * transport->bundle_uri. We might get nothing.
+		 */
+		transport_get_remote_bundle_uri(transport);
+
+		if (transport->bundles &&
+		    hashmap_get_size(&transport->bundles->bundles)) {
+			/* At this point, we need the_repository to match the cloned repo. */
+			if (repo_init(the_repository, git_dir, work_tree))
+				warning(_("failed to initialize the repo, skipping bundle URI"));
+			else if (fetch_bundle_list(the_repository,
+						   transport->bundles))
+				warning(_("failed to fetch advertised bundles"));
+		} else {
+			clear_bundle_list(transport->bundles);
+			FREE_AND_NULL(transport->bundles);
+		}
+	}
 
 	if (mapped_refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/t/lib-bundle-uri-protocol.sh b/t/lib-bundle-uri-protocol.sh
index 3022ea4a95b..a4a1af8d029 100644
--- a/t/lib-bundle-uri-protocol.sh
+++ b/t/lib-bundle-uri-protocol.sh
@@ -85,7 +85,7 @@ test_expect_success "connect with $BUNDLE_URI_PROTOCOL:// using protocol v2: hav
 '
 
 test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: request bundle-uris" '
-	test_when_finished "rm -rf log cloned cloned2" &&
+	test_when_finished "rm -rf log* cloned*" &&
 
 	GIT_TRACE_PACKET="$PWD/log" \
 	git \
@@ -117,7 +117,24 @@ test_expect_success "clone with $BUNDLE_URI_PROTOCOL:// using protocol v2: reque
 	grep "< bundle-uri" log &&
 
 	# Client issued bundle-uri command
-	grep "> command=bundle-uri" log
+	grep "> command=bundle-uri" log &&
+
+	GIT_TRACE_PACKET="$PWD/log3" \
+	git \
+		-c transfer.bundleURI=true \
+		-c protocol.version=2 \
+		clone --bundle-uri="$BUNDLE_URI_BUNDLE_URI" \
+		"$BUNDLE_URI_REPO_URI" cloned3 \
+		>actual 2>err &&
+
+	# Server responded using protocol v2
+	grep "< version 2" log3 &&
+
+	# Server advertised bundle-uri capability
+	grep "< bundle-uri" log3 &&
+
+	# Client did not issue bundle-uri command (--bundle-uri override)
+	! grep "> command=bundle-uri" log3
 '
 
 # The remaining tests will all assume transfer.bundleURI=true
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 45f0803ed4d..00d4fae5136 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -795,6 +795,65 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
 	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
 '
 
+test_expect_success 'auto-discover bundle URI from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo2 "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 \
+		    -c transfer.bundleURI=true clone \
+		$HTTPD_URL/smart/repo2.git repo2 &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
+test_expect_success 'auto-discover multiple bundles from HTTP clone' '
+	test_when_finished rm -rf trace.txt repo3 "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	test_commit -C src new &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/new.bundle" HEAD~1..HEAD &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.mode all &&
+
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.everything.uri "$HTTPD_URL/everything.bundle" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo3.git" config \
+		bundle.new.uri "$HTTPD_URL/new.bundle" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 \
+		    -c transfer.bundleURI=true clone \
+		$HTTPD_URL/smart/repo3.git repo3 &&
+
+	# We should fetch _both_ bundles
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/everything.bundle"\]
+	EOF
+	grep -f pattern trace.txt &&
+	cat >pattern <<-EOF &&
+	"event":"child_start".*"argv":\["git-remote-https","$HTTPD_URL/new.bundle"\]
+	EOF
+	grep -f pattern trace.txt
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
gitgitgadget


* Re: [PATCH v4 00/11] Bundle URIs IV: advertise over protocol v2
  2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
                         ` (10 preceding siblings ...)
  2022-12-22 15:14       ` [PATCH v4 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
@ 2022-12-25 11:35       ` Junio C Hamano
  11 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-12-25 11:35 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This version includes squashed-in versions of the fixups that were
> previously known as ds/bundle-uri-4-fixup.
>
>  * Some unused parameters are now marked with UNUSED, since we are
>    introducing those parameters for the first time. In one case an unused
>    parameter should have been used in repo_config_...() instead of
>    git_config_...().
>  * The GIT_TEST_BUNDLE_URI environment variable is removed in favor of the
>    transfer.bundleURI config option in all cases.
>  * A stale commit message is fixed to no longer refer to a rename that was
>    split into a different commit as part of v3.
>  * The documentation comment for fetch_bundle_list() explicitly defines a
>    non-zero return value as an error.

I guess that sprinkling the fixes from "4-fixup" series into the
original commits will resolve many comments by Ævar about the
logical ordering of changes in the "fixup" series, mostly by making
them all moot points?  Thanks for working on this, and let's start
merging it down to 'next'.


* Re: [PATCH v4 06/11] bundle-uri client: add helper for testing server
  2022-12-22 15:14       ` [PATCH v4 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
@ 2022-12-30 16:31         ` Jeff King
  2023-01-05 19:09           ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Jeff King @ 2022-12-30 16:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye, Derrick Stolee

On Thu, Dec 22, 2022 at 03:14:12PM +0000, Ævar Arnfjörð Bjarmason via GitGitGadget wrote:

> +static int cmd_ls_remote(int argc, const char **argv)
> +{
> +	const char *uploadpack = NULL;
> +	struct string_list server_options = STRING_LIST_INIT_DUP;

These two variables are initialized to NULL and empty respectively, and
then not accessed until here:

> +	transport = transport_get(remote, NULL);
> +	if (uploadpack)
> +		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
> +	if (server_options.nr)
> +		transport->server_options = &server_options;

where neither conditional will trigger, since they will still be NULL
and empty.

Is this function missing some argv parsing that would affect these? Or
if it's not needed, would we want to remove them, like:

diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 5df5bc3b89..b18e760310 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -76,8 +76,6 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 static int cmd_ls_remote(int argc, const char **argv)
 {
-	const char *uploadpack = NULL;
-	struct string_list server_options = STRING_LIST_INIT_DUP;
 	const char *dest;
 	struct remote *remote;
 	struct transport *transport;
@@ -95,11 +93,6 @@ static int cmd_ls_remote(int argc, const char **argv)
 		die(_("remote '%s' has no configured URL"), dest);
 
 	transport = transport_get(remote, NULL);
-	if (uploadpack)
-		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
-	if (server_options.nr)
-		transport->server_options = &server_options;
-
 	if (transport_get_remote_bundle_uri(transport) < 0) {
 		error(_("could not get the bundle-uri list"));
 		status = 1;

Not a huge deal, but I noticed that Coverity complained about the
uploadpack one because this hit 'next' (the server_options one I found
manually, but it was kind of obvious when looking at the other).

-Peff

* Re: [PATCH v4 06/11] bundle-uri client: add helper for testing server
  2022-12-30 16:31         ` Jeff King
@ 2023-01-05 19:09           ` Derrick Stolee
  2023-01-06  8:48             ` [PATCH] test-bundle-uri: drop unused variables Jeff King
  0 siblings, 1 reply; 87+ messages in thread
From: Derrick Stolee @ 2023-01-05 19:09 UTC (permalink / raw)
  To: Jeff King,
	Ævar Arnfjörð Bjarmason via GitGitGadget
  Cc: git, gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye

On 12/30/22 11:31 AM, Jeff King wrote:
> On Thu, Dec 22, 2022 at 03:14:12PM +0000, Ævar Arnfjörð Bjarmason via GitGitGadget wrote:
> 
>> +static int cmd_ls_remote(int argc, const char **argv)
>> +{
>> +	const char *uploadpack = NULL;
>> +	struct string_list server_options = STRING_LIST_INIT_DUP;
> 
> These two variables are initialized to NULL and empty respectively, and
> then not accessed until here:
> 
>> +	transport = transport_get(remote, NULL);
>> +	if (uploadpack)
>> +		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
>> +	if (server_options.nr)
>> +		transport->server_options = &server_options;
> 
> where neither conditional will trigger, since they will still be NULL
> and empty.
> 
> Is this function missing some argv parsing that would affect these? Or
> if it's not needed, would we want to remove them, like:

...

> Not a huge deal, but I noticed that Coverity complained about the
> uploadpack one because this hit 'next' (the server_options one I found
> manually, but it was kind of obvious when looking at the other).

Yes, removing these lines would be fine. Perhaps there were
uses for these in an earlier version that were dropped. But
we can remove them now and then add them back when they
actually connect to functionality.

Thanks,
-Stolee

* [PATCH] test-bundle-uri: drop unused variables
  2023-01-05 19:09           ` Derrick Stolee
@ 2023-01-06  8:48             ` Jeff King
  2023-01-06 14:13               ` Derrick Stolee
  0 siblings, 1 reply; 87+ messages in thread
From: Jeff King @ 2023-01-06  8:48 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Ævar Arnfjörð Bjarmason via GitGitGadget, git,
	gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye

On Thu, Jan 05, 2023 at 02:09:34PM -0500, Derrick Stolee wrote:

> > Not a huge deal, but I noticed that Coverity complained about the
> > uploadpack one because this hit 'next' (the server_options one I found
> > manually, but it was kind of obvious when looking at the other).
> 
> Yes, removing these lines would be fine. Perhaps there were
> uses for these in an earlier version that were dropped. But
> we can remove them now and then add them back when they
> actually connect to functionality.

Thanks for confirming. Here's a patch that can go on top of
ds/bundle-uri-4 (or just on master, as that topic has now graduated).

-- >8 --
Subject: [PATCH] test-bundle-uri: drop unused variables

Commit 70b9c10373 (bundle-uri client: add helper for testing server,
2022-12-22) added a cmd_ls_remote() function which contains "uploadpack"
and "server_options" variables. Neither of these variables is ever
modified after being initialized, so the code to handle non-NULL and
non-empty values is impossible to reach.

While in theory we might add command-line parsing to set these, let's
drop the dead code for now in the name of cleanliness. It's easy enough
to add it back later if need be.

Noticed by Coverity.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/helper/test-bundle-uri.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 5df5bc3b89..b18e760310 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -76,8 +76,6 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 static int cmd_ls_remote(int argc, const char **argv)
 {
-	const char *uploadpack = NULL;
-	struct string_list server_options = STRING_LIST_INIT_DUP;
 	const char *dest;
 	struct remote *remote;
 	struct transport *transport;
@@ -95,11 +93,6 @@ static int cmd_ls_remote(int argc, const char **argv)
 		die(_("remote '%s' has no configured URL"), dest);
 
 	transport = transport_get(remote, NULL);
-	if (uploadpack)
-		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
-	if (server_options.nr)
-		transport->server_options = &server_options;
-
 	if (transport_get_remote_bundle_uri(transport) < 0) {
 		error(_("could not get the bundle-uri list"));
 		status = 1;
-- 
2.39.0.463.g3774f23bc9


* Re: [PATCH] test-bundle-uri: drop unused variables
  2023-01-06  8:48             ` [PATCH] test-bundle-uri: drop unused variables Jeff King
@ 2023-01-06 14:13               ` Derrick Stolee
  0 siblings, 0 replies; 87+ messages in thread
From: Derrick Stolee @ 2023-01-06 14:13 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason via GitGitGadget, git,
	gitster, me, newren, avarab, mjcheetham, steadmon, chooglen,
	jonathantanmy, dyroneteng, Victoria Dye

On 1/6/2023 3:48 AM, Jeff King wrote:
> On Thu, Jan 05, 2023 at 02:09:34PM -0500, Derrick Stolee wrote:
> 
>>> Not a huge deal, but I noticed that Coverity complained about the
>>> uploadpack one because this hit 'next' (the server_options one I found
>>> manually, but it was kind of obvious when looking at the other).
>>
>> Yes, removing these lines would be fine. Perhaps there were
>> uses for these in an earlier version that were dropped. But
>> we can remove them now and then add them back when they
>> actually connect to functionality.
> 
> Thanks for confirming. Here's a patch that can go on top of
> ds/bundle-uri-4 (or just on master, as that topic has now graduated).
> 
> -- >8 --
> Subject: [PATCH] test-bundle-uri: drop unused variables

Thanks for putting together a formal patch. LGTM.

-Stolee

Thread overview: 87+ messages
2022-11-01  1:07 [PATCH 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
2022-11-01  1:07 ` [PATCH 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-08 17:08   ` SZEDER Gábor
2022-11-11  1:59   ` Victoria Dye
2022-11-16 14:08     ` Derrick Stolee
2022-11-01  1:07 ` [PATCH 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-01  1:07 ` [PATCH 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-01  1:07 ` [PATCH 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
2022-11-01  1:07 ` [PATCH 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-01  1:07 ` [PATCH 6/9] strbuf: reintroduce strbuf_parent_directory() Derrick Stolee via GitGitGadget
2022-11-03  9:28   ` Phillip Wood
2022-11-03  9:49   ` Ævar Arnfjörð Bjarmason
2022-11-01  1:07 ` [PATCH 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
2022-11-01  1:07 ` [PATCH 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
2022-11-01  1:07 ` [PATCH 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
2022-11-16 19:51 ` [PATCH v2 0/9] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
2022-11-16 19:51   ` [PATCH v2 1/9] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-16 19:51   ` [PATCH v2 2/9] bundle-uri client: add minimal NOOP client Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-29  0:57     ` Victoria Dye
2022-12-02 15:00       ` Derrick Stolee
2022-11-16 19:51   ` [PATCH v2 3/9] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-29  0:59     ` Victoria Dye
2022-12-02 15:28       ` Derrick Stolee
2022-11-16 19:51   ` [PATCH v2 4/9] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
2022-11-29  1:00     ` Victoria Dye
2022-11-16 19:51   ` [PATCH v2 5/9] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
2022-11-29  1:03     ` Victoria Dye
2022-12-02 15:38       ` Derrick Stolee
2022-11-16 19:51   ` [PATCH v2 6/9] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
2022-11-29  1:03     ` Victoria Dye
2022-12-02 15:40       ` Derrick Stolee
2022-12-02 18:32     ` Ævar Arnfjörð Bjarmason
2022-12-05 15:11       ` Derrick Stolee
2022-11-16 19:51   ` [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
2022-11-29  1:25     ` Victoria Dye
2022-12-02 16:03       ` Derrick Stolee
2022-11-16 19:51   ` [PATCH v2 8/9] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
2022-11-29  1:51     ` Victoria Dye
2022-11-16 19:51   ` [PATCH v2 9/9] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
2022-11-29  1:59     ` Victoria Dye
2022-12-02 16:16       ` Derrick Stolee
2022-12-05 17:50   ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Derrick Stolee via GitGitGadget
2022-12-05 17:50     ` [PATCH v3 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-05 23:31       ` Victoria Dye
2022-12-05 17:50     ` [PATCH v3 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-05 17:50     ` [PATCH v3 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-05 17:50     ` [PATCH v3 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-05 23:32       ` Victoria Dye
2022-12-07 15:20         ` Derrick Stolee
2022-12-05 17:50     ` [PATCH v3 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
2022-12-05 17:50     ` [PATCH v3 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-05 23:32       ` Victoria Dye
2022-12-05 17:50     ` [PATCH v3 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
2022-12-05 17:50     ` [PATCH v3 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
2022-12-06 10:06       ` Ævar Arnfjörð Bjarmason
2022-12-06 11:37         ` Ævar Arnfjörð Bjarmason
2022-12-07 14:44           ` Derrick Stolee
2022-12-08 12:52             ` Ævar Arnfjörð Bjarmason
2022-12-05 17:50     ` [PATCH v3 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
2022-12-05 23:33       ` Victoria Dye
2022-12-07 15:22         ` Derrick Stolee
2022-12-05 17:50     ` [PATCH v3 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
2022-12-07 12:57       ` Jeff King
2022-12-07 15:27         ` Derrick Stolee
2022-12-07 15:54           ` Derrick Stolee
2022-12-08  6:40             ` Jeff King
2022-12-08  6:36           ` Jeff King
2022-12-08 14:58             ` Derrick Stolee
2022-12-05 17:50     ` [PATCH v3 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
2022-12-05 23:42     ` [PATCH v3 00/11] Bundle URIs IV: advertise over protocol v2 Victoria Dye
2022-12-22 15:14     ` [PATCH v4 " Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 01/11] protocol v2: add server-side "bundle-uri" skeleton Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 02/11] t: create test harness for 'bundle-uri' command Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 03/11] clone: request the 'bundle-uri' command when available Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 04/11] bundle-uri client: add boolean transfer.bundleURI setting Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 05/11] transport: rename got_remote_heads Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 06/11] bundle-uri client: add helper for testing server Ævar Arnfjörð Bjarmason via GitGitGadget
2022-12-30 16:31         ` Jeff King
2023-01-05 19:09           ` Derrick Stolee
2023-01-06  8:48             ` [PATCH] test-bundle-uri: drop unused variables Jeff King
2023-01-06 14:13               ` Derrick Stolee
2022-12-22 15:14       ` [PATCH v4 07/11] bundle-uri: serve bundle.* keys from config Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 08/11] strbuf: introduce strbuf_strip_file_from_path() Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 09/11] bundle-uri: allow relative URLs in bundle lists Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 10/11] bundle-uri: download bundles from an advertised list Derrick Stolee via GitGitGadget
2022-12-22 15:14       ` [PATCH v4 11/11] clone: unbundle the advertised bundles Derrick Stolee via GitGitGadget
2022-12-25 11:35       ` [PATCH v4 00/11] Bundle URIs IV: advertise over protocol v2 Junio C Hamano
