From: Josh Steadmon <steadmon@google.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
newren@gmail.com, avarab@gmail.com, dyroneteng@gmail.com,
Johannes.Schindelin@gmx.de, szeder.dev@gmail.com,
mjcheetham@outlook.com, Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 0/5] Bundle URIs II: git clone --bundle-uri
Date: Wed, 27 Jul 2022 15:11:44 -0700 [thread overview]
Message-ID: <YuG4ICBFNbLPR9Iv@google.com> (raw)
In-Reply-To: <pull.1300.git.1658781277.gitgitgadget@gmail.com>
On 2022.07.25 20:34, Derrick Stolee via GitGitGadget wrote:
> This is the second series building the bundle URI feature as discussed in
> the previous series that added the design document [1]. This series does not
> modify the design document, so the patches are independent and can be
> applied to the latest 'master'.
>
> [1]
> https://lore.kernel.org/git/pull.1248.v3.git.1658757188.gitgitgadget@gmail.com
>
> This series brings in just enough logic that we can bootstrap clones from a
> single bundle using git clone --bundle-uri=<X>.
>
> * Patch 1 adds a 'get' capability to 'git remote-https' which allows
> downloading the contents of a URI to a local file.
> * Patch 2 creates basic file-copy logic within a new bundle-uri.c file. It
> is not used until patch 3.
> * Patch 3 creates the git clone --bundle-uri=<X> option, allowing Git to
> bootstrap a clone from a bundle, but get the remaining objects from the
> origin URL. (As of this patch, it only accepts a filename.)
> * Patch 4 extends the git clone --bundle-uri=<X> option to allow file://
> and https:// URIs.
> * Patch 5 is a CLI helper to avoid using --bundle-uri and --depth at the
> same time in git clone.
>
> As outlined in [1], the next steps after this are:
>
> 1. Allow parsing a bundle list as a config file at the given URI. The
> key-value format is unified with the protocol v2 verb (coming in (3)).
> [2]
> 2. Implement the protocol v2 verb, re-using the bundle list logic from (2).
> Use this to auto-discover bundle URIs during 'git clone' (behind a
> config option). [3]
> 3. Implement the 'creationToken' heuristic, allowing incremental 'git
> fetch' commands to download a bundle list from a configured URI, and
> only download bundles that are new based on the creation token values.
> [4]
>
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
>
> [2] https://github.com/derrickstolee/git/pull/20
>
> [3] https://github.com/derrickstolee/git/pull/21
>
> [4] https://github.com/derrickstolee/git/pull/22
>
> Note: this series includes some code pulled out of the first series [1], and
> in particular the git fetch --bundle-uri=<X> option is removed. The
> intention was to replace that with git bundle fetch <X>, but that conflicts
> with the work to refactor how subcommands are parsed. The git bundle fetch
> subcommand could be added later for maximum flexibility on the client side,
> but we can also move forward without it.
>
> Thanks, -Stolee
>
> P.S. Here is the range-diff compared to v2 of the previous bundle URI
> series:
>
> 1: d444042dc4dcc < -: ------------- docs: document bundle URI standard
> 2: 0a2cf60437f78 ! 1: 40808e92afb7b remote-curl: add 'get' capability
> @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
>
> +static void parse_get(struct strbuf *buf)
> +{
> -+ struct http_get_options opts = { 0 };
> + struct strbuf url = STRBUF_INIT;
> + struct strbuf path = STRBUF_INIT;
> + const char *p, *space;
> @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
> + strbuf_add(&url, p, space - p);
> + strbuf_addstr(&path, space + 1);
> +
> -+ if (http_get_file(url.buf, path.buf, &opts))
> ++ if (http_get_file(url.buf, path.buf, NULL))
> + die(_("failed to download file at URL '%s'"), url.buf);
> +
> + strbuf_release(&url);
> @@ t/t5557-http-get.sh (new)
> + get $url file1
> + EOF
> +
> -+ test_must_fail git remote-http $url $url <input 2>err &&
> ++ test_must_fail git remote-http $url <input 2>err &&
> + test_path_is_missing file1 &&
> + grep "failed to download file at URL" err &&
> + rm file1.temp
> @@ t/t5557-http-get.sh (new)
> +
> + EOF
> +
> -+ GIT_TRACE2_PERF=1 git remote-http $url $url <input &&
> ++ GIT_TRACE2_PERF=1 git remote-http $url <input &&
> + test_cmp "$HTTPD_DOCUMENT_ROOT_PATH/exists.txt" file2
> +'
> +
> +test_done
> -
> - ## transport-helper.c ##
> -@@ transport-helper.c: struct helper_data {
> - check_connectivity : 1,
> - no_disconnect_req : 1,
> - no_private_update : 1,
> -- object_format : 1;
> -+ object_format : 1,
> -+ get : 1;
> -
> - /*
> - * As an optimization, the transport code may invoke fetch before
> -@@ transport-helper.c: static struct child_process *get_helper(struct transport *transport)
> - data->no_private_update = 1;
> - } else if (starts_with(capname, "object-format")) {
> - data->object_format = 1;
> -+ } else if (!strcmp(capname, "get")) {
> -+ data->get = 1;
> - } else if (mandatory) {
> - die(_("unknown mandatory capability %s; this remote "
> - "helper probably needs newer version of Git"),
> 3: abec47564fd9c ! 2: 7d3159f0d9a29 bundle-uri: create basic file-copy logic
> @@ Commit message
> file, not a bundle list. Bundle lists will be implemented in a future
> change.
>
> + Note that the discovery of a temporary filename is slightly racy because
> + the odb_mkstemp() relies on the temporary file not existing. With the
> + current implementation being limited to file copies, we could replace
> + the copy_file() with copy_fd(). The tricky part comes in future changes
> + that send the filename to 'git remote-https' and its 'get' capability.
> + At that point, we need the file descriptor closed _and_ the file
> + unlinked. If we were to keep the file descriptor open for the sake of
> + normal file copies, then we would pollute the rest of the code for
> + little benefit. This is especially the case because we expect that most
> + bundle URI use will be based on HTTPS instead of file copies.
> +
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>
> ## Makefile ##
> @@ bundle-uri.c (new)
> +#include "refs.h"
> +#include "run-command.h"
> +
> -+static void find_temp_filename(struct strbuf *name)
> ++static int find_temp_filename(struct strbuf *name)
> +{
> + int fd;
> + /*
> @@ bundle-uri.c (new)
> + * racy, but unlikely to collide.
> + */
> + fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
> -+ if (fd < 0)
> -+ die(_("failed to create temporary file"));
> ++ if (fd < 0) {
> ++ warning(_("failed to create temporary file"));
> ++ return -1;
> ++ }
> ++
> + close(fd);
> + unlink(name->buf);
> ++ return 0;
> +}
> +
> -+static int copy_uri_to_file(const char *uri, const char *file)
> ++static int copy_uri_to_file(const char *file, const char *uri)
> +{
> -+ /* Copy as a file */
> -+ return copy_file(file, uri, 0444);
> ++ /* File-based URIs only for now. */
> ++ return copy_file(file, uri, 0);
> +}
> +
> +static int unbundle_from_file(struct repository *r, const char *file)
> @@ bundle-uri.c (new)
> + int result = 0;
> + int bundle_fd;
> + struct bundle_header header = BUNDLE_HEADER_INIT;
> -+ struct strvec extra_index_pack_args = STRVEC_INIT;
> + struct string_list_item *refname;
> + struct strbuf bundle_ref = STRBUF_INIT;
> + size_t bundle_prefix_len;
> @@ bundle-uri.c (new)
> + if ((bundle_fd = read_bundle_header(file, &header)) < 0)
> + return 1;
> +
> -+ if ((result = unbundle(r, &header, bundle_fd, &extra_index_pack_args)))
> ++ if ((result = unbundle(r, &header, bundle_fd, NULL)))
> + return 1;
> +
> + /*
> @@ bundle-uri.c (new)
> + int result = 0;
> + struct strbuf filename = STRBUF_INIT;
> +
> -+ find_temp_filename(&filename);
> -+ if ((result = copy_uri_to_file(uri, filename.buf)))
> ++ if ((result = find_temp_filename(&filename)))
> ++ goto cleanup;
> ++
> ++ if ((result = copy_uri_to_file(filename.buf, uri))) {
> ++ warning(_("failed to download bundle from URI '%s'"), uri);
> + goto cleanup;
> ++ }
> +
> -+ if ((result = !is_bundle(filename.buf, 0)))
> ++ if ((result = !is_bundle(filename.buf, 0))) {
> ++ warning(_("file at URI '%s' is not a bundle"), uri);
> + goto cleanup;
> ++ }
> +
> -+ if ((result = unbundle_from_file(r, filename.buf)))
> ++ if ((result = unbundle_from_file(r, filename.buf))) {
> ++ warning(_("failed to unbundle bundle from URI '%s'"), uri);
> + goto cleanup;
> ++ }
> +
> +cleanup:
> + unlink(filename.buf);
> 4: f6255ec518857 < -: ------------- fetch: add --bundle-uri option
> -: ------------- > 3: 29e645a54ba7f clone: add --bundle-uri option
> 5: bfbd11b48bf1b ! 4: f6bc3177332e8 bundle-uri: add support for http(s):// and file://
> @@ Metadata
> ## Commit message ##
> bundle-uri: add support for http(s):// and file://
>
> - The previous change created the 'git fetch --bundle-uri=<uri>' option.
> + The previous change created the 'git clone --bundle-uri=<uri>' option.
> Currently, <uri> must be a filename.
>
> Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
> @@ Commit message
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>
> ## bundle-uri.c ##
> -@@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
> - unlink(name->buf);
> +@@ bundle-uri.c: static int find_temp_filename(struct strbuf *name)
> + return 0;
> }
>
> -+static int download_https_uri_to_file(const char *uri, const char *file)
> -+{
> +-static int copy_uri_to_file(const char *file, const char *uri)
> ++static int download_https_uri_to_file(const char *file, const char *uri)
> + {
> +- /* File-based URIs only for now. */
> +- return copy_file(file, uri, 0);
> + int result = 0;
> + struct child_process cp = CHILD_PROCESS_INIT;
> + FILE *child_in = NULL, *child_out = NULL;
> @@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
> + return result;
> +}
> +
> - static int copy_uri_to_file(const char *uri, const char *file)
> - {
> ++static int copy_uri_to_file(const char *filename, const char *uri)
> ++{
> + const char *out;
> ++
> + if (skip_prefix(uri, "https:", &out) ||
> + skip_prefix(uri, "http:", &out))
> -+ return download_https_uri_to_file(uri, file);
> ++ return download_https_uri_to_file(filename, uri);
> +
> + if (!skip_prefix(uri, "file://", &out))
> + out = uri;
> +
> - /* Copy as a file */
> -- return copy_file(file, uri, 0444);
> -+ return !!copy_file(file, out, 0);
> ++ /* Copy as a file */
> ++ return copy_file(filename, out, 0);
> }
>
> static int unbundle_from_file(struct repository *r, const char *file)
>
> - ## t/t5558-fetch-bundle-uri.sh ##
> -@@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
> + ## t/t5558-clone-bundle-uri.sh ##
> +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
> test_cmp expect actual
> '
>
> -+test_expect_success 'fetch file:// bundle' '
> -+ git init fetch-file &&
> -+ git -C fetch-file fetch --bundle-uri="file://$(pwd)/fetch-from/B.bundle" &&
> -+ git -C fetch-file rev-parse refs/bundles/topic >actual &&
> -+ git -C fetch-from rev-parse topic >expect &&
> ++test_expect_success 'clone with file:// bundle' '
> ++ git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
> ++ clone-from clone-file &&
> ++ git -C clone-file rev-parse refs/bundles/topic >actual &&
> ++ git -C clone-from rev-parse topic >expect &&
> + test_cmp expect actual
> +'
> +
> @@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
> +start_httpd
> +
> +test_expect_success 'fail to fetch from non-existent HTTP URL' '
> -+ test_must_fail git fetch --bundle-uri="$HTTPD_URL/does-not-exist" 2>err &&
> ++ test_when_finished rm -rf test &&
> ++ git clone --bundle-uri="$HTTPD_URL/does-not-exist" . test 2>err &&
> + grep "failed to download bundle from URI" err
> +'
> +
> +test_expect_success 'fail to fetch from non-bundle HTTP URL' '
> ++ test_when_finished rm -rf test &&
> + echo bogus >"$HTTPD_DOCUMENT_ROOT_PATH/bogus" &&
> -+ test_must_fail git fetch --bundle-uri="$HTTPD_URL/bogus" 2>err &&
> ++ git clone --bundle-uri="$HTTPD_URL/bogus" . test 2>err &&
> + grep "is not a bundle" err
> +'
> +
> -+test_expect_success 'fetch HTTP bundle' '
> -+ cp fetch-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
> -+ git init fetch-http &&
> -+ git -C fetch-http fetch --bundle-uri="$HTTPD_URL/B.bundle" &&
> -+ git -C fetch-http rev-parse refs/bundles/topic >actual &&
> -+ git -C fetch-from rev-parse topic >expect &&
> -+ test_cmp expect actual
> ++test_expect_success 'clone HTTP bundle' '
> ++ cp clone-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
> ++
> ++ git clone --no-local --mirror clone-from \
> ++ "$HTTPD_DOCUMENT_ROOT_PATH/fetch.git" &&
> ++
> ++ git clone --bundle-uri="$HTTPD_URL/B.bundle" \
> ++ "$HTTPD_URL/smart/fetch.git" clone-http &&
> ++ git -C clone-http rev-parse refs/bundles/topic >actual &&
> ++ git -C clone-from rev-parse topic >expect &&
> ++ test_cmp expect actual &&
> ++
> ++ test_config -C clone-http log.excludedecoration refs/bundle/
> +'
> +
> +# Do not add tests here unless they use the HTTP server, as they will
> 6: a217e9a0640b4 < -: ------------- fetch: add 'refs/bundle/' to log.excludeDecoration
> -: ------------- > 5: e823b168ab725 clone: --bundle-uri cannot be combined with --depth
>
>
> Derrick Stolee (5):
> remote-curl: add 'get' capability
> bundle-uri: create basic file-copy logic
> clone: add --bundle-uri option
> bundle-uri: add support for http(s):// and file://
> clone: --bundle-uri cannot be combined with --depth
>
> Documentation/git-clone.txt | 7 ++
> Documentation/gitremote-helpers.txt | 9 ++
> Makefile | 1 +
> builtin/clone.c | 18 +++
> bundle-uri.c | 168 ++++++++++++++++++++++++++++
> bundle-uri.h | 14 +++
> remote-curl.c | 32 ++++++
> t/t5557-http-get.sh | 37 ++++++
> t/t5558-clone-bundle-uri.sh | 81 ++++++++++++++
> t/t5606-clone-options.sh | 8 ++
> 10 files changed, 375 insertions(+)
> create mode 100644 bundle-uri.c
> create mode 100644 bundle-uri.h
> create mode 100755 t/t5557-http-get.sh
> create mode 100755 t/t5558-clone-bundle-uri.sh
>
>
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1300%2Fderrickstolee%2Fbundle-redo%2Fclone-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1300/derrickstolee/bundle-redo/clone-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1300
> --
> gitgitgadget
Looks good to me, with just one small fix and a couple of optional
nitpicks. Thaks for the series!
Reviewed-by: Josh Steadmon <steadmon@google.com>
next prev parent reply other threads:[~2022-07-27 22:15 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-25 20:34 [PATCH 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee via GitGitGadget
2022-07-25 20:34 ` [PATCH 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-07-27 22:08 ` Josh Steadmon
2022-07-27 23:00 ` Ævar Arnfjörð Bjarmason
2022-08-01 13:55 ` Derrick Stolee
2022-08-01 14:39 ` Ævar Arnfjörð Bjarmason
2022-07-25 20:34 ` [PATCH 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-07-27 22:09 ` Josh Steadmon
2022-08-01 13:58 ` Derrick Stolee
2022-07-25 20:34 ` [PATCH 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-07-25 20:34 ` [PATCH 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-07-27 22:09 ` Josh Steadmon
2022-08-01 14:00 ` Derrick Stolee
2022-07-25 20:34 ` [PATCH 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget
2022-07-27 22:11 ` Josh Steadmon [this message]
2022-08-01 14:00 ` [PATCH 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee
2022-08-02 12:29 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2022-08-02 12:29 ` [PATCH v2 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-08-02 12:29 ` [PATCH v2 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-08-02 12:29 ` [PATCH v2 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-08-02 12:29 ` [PATCH v2 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-08-02 21:32 ` Junio C Hamano
2022-08-04 15:34 ` Derrick Stolee
2022-08-04 18:19 ` Junio C Hamano
2022-08-02 12:29 ` [PATCH v2 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget
2022-08-09 13:11 ` [PATCH v3 0/5] Bundle URIs II: git clone --bundle-uri Derrick Stolee via GitGitGadget
2022-08-09 13:11 ` [PATCH v3 1/5] remote-curl: add 'get' capability Derrick Stolee via GitGitGadget
2022-08-09 13:11 ` [PATCH v3 2/5] bundle-uri: create basic file-copy logic Derrick Stolee via GitGitGadget
2022-08-09 13:11 ` [PATCH v3 3/5] clone: add --bundle-uri option Derrick Stolee via GitGitGadget
2022-08-22 21:24 ` Junio C Hamano
2022-08-23 14:05 ` Derrick Stolee
2022-08-24 15:46 ` Junio C Hamano
2022-08-09 13:11 ` [PATCH v3 4/5] bundle-uri: add support for http(s):// and file:// Derrick Stolee via GitGitGadget
2022-08-29 4:58 ` Teng Long
2022-08-30 13:33 ` Derrick Stolee
2022-08-09 13:11 ` [PATCH v3 5/5] clone: --bundle-uri cannot be combined with --depth Derrick Stolee via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YuG4ICBFNbLPR9Iv@google.com \
--to=steadmon@google.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=avarab@gmail.com \
--cc=derrickstolee@github.com \
--cc=dyroneteng@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=mjcheetham@outlook.com \
--cc=newren@gmail.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).