From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <derrickstolee@github.com>,
Jonathan Tan <jonathantanmy@google.com>,
Jonathan Nieder <jrnieder@gmail.com>,
Albert Cui <albertqcui@gmail.com>,
"Robin H . Johnson" <robbat2@gentoo.org>,
Teng Long <dyroneteng@gmail.com>
Subject: [RFC PATCH v2 18/36] bundle: implement 'fetch' command for direct bundles
Date: Mon, 18 Apr 2022 19:23:35 +0200
Message-ID: <RFC-patch-v2-18.36-ff9a7afaccd-20220418T165545Z-avarab@gmail.com>
In-Reply-To: <RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@gmail.com>
From: Derrick Stolee <derrickstolee@github.com>
The 'git bundle fetch <uri>' command will be used to download one or
more bundles from a specified '<uri>'. The implementation being added
here focuses only on downloading a file from '<uri>' and unbundling it
if it is a valid bundle file.
If it is not a bundle file, we currently die(). A later change will
instead attempt to interpret it as a table of contents that may list
multiple bundles, along with other metadata for each bundle.
This is why cmd_bundle_fetch() is organized into three carefully
commented steps, including a "stack" that can currently hold only one
bundle. A later change will update the while loop to push onto the
stack when necessary.
RFC-TODO: Add documentation to Documentation/git-bundle.txt
RFC-TODO: Add direct tests of 'git bundle fetch' when the URI is a
bundle file.
RFC-TODO: Split out the docs and subcommand boilerplate into its own
commit.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
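The stack handling described for Step 3 of cmd_bundle_fetch() is only
stubbed out in this patch: the stack holds a single bundle and the
'next_id' branch is empty. The following is a minimal, self-contained C
sketch of how such a loop could push a dependent bundle and later pop
it. It is an illustration under stated assumptions, not code from this
series: 'struct toy_bundle', 'toy_prereqs_ok()' and
'toy_process_stack()' are hypothetical names, the prerequisite check is
modeled as a single flag, and a direct 'dep' pointer stands in for the
'next_id' lookup in a table of contents.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct remote_bundle_info. */
struct toy_bundle {
	const char *id;
	int prereqs_present;           /* 1 if prerequisites are already in the object store */
	struct toy_bundle *dep;        /* bundle supplying the prerequisites, or NULL */
	struct toy_bundle *stack_next; /* next entry below this one on the stack */
	int on_stack;                  /* guards against cyclic dependency chains */
	int unbundled;                 /* set once the bundle has been "unbundled" */
};

/* A bundle can be applied if its prerequisites exist or its dependency was applied. */
static int toy_prereqs_ok(const struct toy_bundle *b)
{
	return b->prereqs_present || (b->dep && b->dep->unbundled);
}

/*
 * Process the stack starting at 'stack'. Returns the number of bundles
 * unbundled, or -1 if a bundle has missing prerequisites and no usable
 * dependent bundle (the real command would die() at that point).
 */
static int toy_process_stack(struct toy_bundle *stack)
{
	int count = 0;

	if (stack)
		stack->on_stack = 1;

	while (stack) {
		struct toy_bundle *top = stack;

		if (toy_prereqs_ok(top)) {
			/* Unbundle and pop. */
			top->unbundled = 1;
			top->on_stack = 0;
			count++;
			stack = top->stack_next;
			top->stack_next = NULL;
		} else if (top->dep && !top->dep->on_stack) {
			/* Push the dependent bundle and retry 'top' afterwards. */
			top->dep->on_stack = 1;
			top->dep->stack_next = top;
			stack = top->dep;
		} else {
			return -1;
		}
	}
	return count;
}
```

For a chain tip -> mid -> base in which only 'base' has its
prerequisites present, toy_process_stack(&tip) pushes 'mid' and then
'base', and unbundles them in the order base, mid, tip.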
Documentation/git-bundle.txt | 1 +
builtin/bundle.c | 261 +++++++++++++++++++++++++++++++++++
2 files changed, 262 insertions(+)
diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
index 7685b570455..bf5cd90391c 100644
--- a/Documentation/git-bundle.txt
+++ b/Documentation/git-bundle.txt
@@ -12,6 +12,7 @@ SYNOPSIS
'git bundle' create [-q | --quiet | --progress | --all-progress] [--all-progress-implied]
[--version=<version>] <file> <git-rev-list-args>
'git bundle' verify [-q | --quiet] <file>
+'git bundle' fetch [--filter=<spec>] <uri>
'git bundle' list-heads <file> [<refname>...]
'git bundle' unbundle [--progress] <file> [<refname>...]
diff --git a/builtin/bundle.c b/builtin/bundle.c
index 2adad545a2e..6b6107d83cf 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -3,6 +3,10 @@
#include "parse-options.h"
#include "cache.h"
#include "bundle.h"
+#include "run-command.h"
+#include "hashmap.h"
+#include "object-store.h"
+#include "refs.h"
/*
* Basic handler for bundle files to connect repositories via sneakernet.
@@ -14,6 +18,7 @@
static const char * const builtin_bundle_usage[] = {
N_("git bundle create [<options>] <file> <git-rev-list args>"),
N_("git bundle verify [<options>] <file>"),
+ N_("git bundle fetch [<options>] <uri>"),
N_("git bundle list-heads <file> [<refname>...]"),
N_("git bundle unbundle <file> [<refname>...]"),
NULL
@@ -29,6 +34,11 @@ static const char * const builtin_bundle_verify_usage[] = {
NULL
};
+static const char * const builtin_bundle_fetch_usage[] = {
+ N_("git bundle fetch [--filter=<spec>] <uri>"),
+ NULL
+};
+
static const char * const builtin_bundle_list_heads_usage[] = {
N_("git bundle list-heads <file> [<refname>...]"),
NULL
@@ -132,6 +142,255 @@ static int cmd_bundle_verify(int argc, const char **argv, const char *prefix) {
return ret;
}
+/**
+ * The remote_bundle_info struct contains the necessary data for
+ * the list of bundles advertised by a table of contents. If the
+ * bundle URI instead contains a single bundle, then this struct
+ * can represent a single bundle without a 'uri' but with a
+ * tempfile storing its current location on disk.
+ */
+struct remote_bundle_info {
+ struct hashmap_entry ent;
+
+ /**
+ * The 'id' is a name given to the bundle for reference
+ * by other bundle infos.
+ */
+ char *id;
+
+ /**
+ * The 'uri' is the location of the remote bundle so
+ * it can be downloaded on-demand. This will be NULL
+ * if there was no table of contents.
+ */
+ char *uri;
+
+ /**
+ * The 'next_id' string, if non-NULL, contains the 'id'
+ * for a bundle that contains the prerequisites for this
+ * bundle. Used by table of contents to allow fetching
+ * a portion of a repository incrementally.
+ */
+ char *next_id;
+
+ /**
+ * A table of contents can include a timestamp for the
+ * bundle as a heuristic for describing a list of bundles
+ * in order of recency.
+ */
+ timestamp_t timestamp;
+
+ /**
+ * If the bundle has been downloaded, then 'file' is a
+ * filename storing its contents. Otherwise, 'file' is
+ * an empty string.
+ */
+ struct strbuf file;
+
+ /**
+ * The 'stack_next' pointer allows this struct to form
+ * a stack.
+ */
+ struct remote_bundle_info *stack_next;
+};
+
+static void download_uri_to_file(const char *uri, const char *file)
+{
+ struct child_process cp = CHILD_PROCESS_INIT;
+ FILE *child_in;
+
+ strvec_pushl(&cp.args, "git-remote-https", "origin", uri, NULL);
+ cp.in = -1;
+ cp.out = -1;
+
+ if (start_command(&cp))
+ die(_("failed to start remote helper"));
+
+ child_in = fdopen(cp.in, "w");
+ if (!child_in)
+ die(_("cannot write to child process"));
+
+ fprintf(child_in, "get %s %s\n\n", uri, file);
+ fclose(child_in);
+
+ if (finish_command(&cp))
+ die(_("remote helper failed"));
+}
+
+static void find_temp_filename(struct strbuf *name)
+{
+ int fd;
+ /*
+ * Find a temporary filename that is available. This is briefly
+ * racy, but unlikely to collide.
+ */
+ fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
+ if (fd < 0)
+ die(_("failed to create temporary file"));
+ close(fd);
+ unlink(name->buf);
+}
+
+static void unbundle_fetched_bundle(struct remote_bundle_info *info)
+{
+ struct child_process cp = CHILD_PROCESS_INIT;
+ FILE *f;
+ struct strbuf line = STRBUF_INIT;
+ struct strbuf bundle_ref = STRBUF_INIT;
+ size_t bundle_prefix_len;
+
+ strvec_pushl(&cp.args, "bundle", "unbundle",
+ info->file.buf, NULL);
+ cp.git_cmd = 1;
+ cp.out = -1;
+
+ if (start_command(&cp))
+ die(_("failed to start 'unbundle' process"));
+
+ strbuf_addstr(&bundle_ref, "refs/bundles/");
+ bundle_prefix_len = bundle_ref.len;
+
+ f = fdopen(cp.out, "r");
+ while (strbuf_getline(&line, f) != EOF) {
+ struct object_id oid, old_oid;
+ const char *refname, *branch_name, *end;
+ char *space;
+ int has_old;
+
+ strbuf_trim_trailing_newline(&line);
+
+ space = strchr(line.buf, ' ');
+
+ if (!space)
+ continue;
+
+ refname = space + 1;
+ *space = '\0';
+ parse_oid_hex(line.buf, &oid, &end);
+
+ if (!skip_prefix(refname, "refs/heads/", &branch_name))
+ continue;
+
+ strbuf_setlen(&bundle_ref, bundle_prefix_len);
+ strbuf_addstr(&bundle_ref, branch_name);
+
+ has_old = !read_ref(bundle_ref.buf, &old_oid);
+
+ update_ref("bundle fetch", bundle_ref.buf, &oid,
+ has_old ? &old_oid : NULL,
+ REF_SKIP_OID_VERIFICATION,
+ UPDATE_REFS_MSG_ON_ERR);
+ }
+
+ if (finish_command(&cp))
+ die(_("failed to unbundle bundle from '%s'"), info->uri);
+
+ unlink_or_warn(info->file.buf);
+}
+
+static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
+{
+ int ret = 0;
+ int progress = isatty(2);
+ char *bundle_uri;
+ struct remote_bundle_info first_file = {
+ .file = STRBUF_INIT,
+ };
+ struct remote_bundle_info *stack = NULL;
+
+ struct option options[] = {
+ OPT_BOOL(0, "progress", &progress,
+ N_("show progress meter")),
+ OPT_END()
+ };
+
+ argc = parse_options_cmd_bundle(argc, argv, prefix,
+ builtin_bundle_fetch_usage, options, &bundle_uri);
+
+ if (!startup_info->have_repository)
+ die(_("'fetch' requires a repository"));
+
+ /*
+ * Step 1: determine protocol for uri, and download contents to
+ * a temporary location.
+ */
+ first_file.uri = bundle_uri;
+ find_temp_filename(&first_file.file);
+ download_uri_to_file(bundle_uri, first_file.file.buf);
+
+ /*
+ * Step 2: Check if the file is a bundle (if so, add it to the
+ * stack and move to step 3).
+ */
+
+ if (is_bundle(first_file.file.buf, 1)) {
+ /* The simple case: only one file, no stack to worry about. */
+ stack = &first_file;
+ } else {
+ /* TODO: Expect and parse a table of contents. */
+ die(_("unexpected data at bundle URI"));
+ }
+
+ /*
+ * Step 3: For each bundle in the stack:
+ * i. If not downloaded to a temporary file, download it.
+ * ii. Once downloaded, check that its prerequisites are in
+ * the object database. If not, then push its dependent
+ * bundle onto the stack. (Fail if no such bundle exists.)
+ * iii. If all prerequisites are present, then unbundle the
+ * temporary file and pop the bundle from the stack.
+ */
+ while (stack) {
+ int valid = 1;
+ int bundle_fd;
+ struct string_list_item *prereq;
+ struct bundle_header header = BUNDLE_HEADER_INIT;
+
+ if (!stack->file.len) {
+ find_temp_filename(&stack->file);
+ download_uri_to_file(stack->uri, stack->file.buf);
+ if (!is_bundle(stack->file.buf, 1))
+ die(_("file downloaded from '%s' is not a bundle"), stack->uri);
+ }
+
+ bundle_header_init(&header);
+ bundle_fd = read_bundle_header(stack->file.buf, &header);
+ if (bundle_fd < 0)
+ die(_("failed to read bundle from '%s'"), stack->uri);
+
+ for_each_string_list_item(prereq, &header.prerequisites) {
+ struct object_info info = OBJECT_INFO_INIT;
+ struct object_id *oid = prereq->util;
+
+ if (oid_object_info_extended(the_repository, oid, &info,
+ OBJECT_INFO_QUICK)) {
+ valid = 0;
+ break;
+ }
+ }
+
+ close(bundle_fd);
+ bundle_header_release(&header);
+
+ if (valid) {
+ unbundle_fetched_bundle(stack);
+ } else if (stack->next_id) {
+ /*
+ * Load the next bundle from the hashtable and
+ * push it onto the stack.
+ */
+ } else {
+ die(_("bundle from '%s' has missing prerequisites and no dependent bundle"),
+ stack->uri);
+ }
+
+ stack = stack->stack_next;
+ }
+
+ free(bundle_uri);
+ return ret;
+}
+
static int cmd_bundle_list_heads(int argc, const char **argv, const char *prefix) {
struct bundle_header header = BUNDLE_HEADER_INIT;
int bundle_fd = -1;
@@ -212,6 +471,8 @@ int cmd_bundle(int argc, const char **argv, const char *prefix)
result = cmd_bundle_create(argc, argv, prefix);
else if (!strcmp(argv[0], "verify"))
result = cmd_bundle_verify(argc, argv, prefix);
+ else if (!strcmp(argv[0], "fetch"))
+ result = cmd_bundle_fetch(argc, argv, prefix);
else if (!strcmp(argv[0], "list-heads"))
result = cmd_bundle_list_heads(argc, argv, prefix);
else if (!strcmp(argv[0], "unbundle"))
--
2.36.0.rc2.902.g60576bbc845