git@vger.kernel.org mailing list mirror (one of many)
From: Josh Steadmon <steadmon@google.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
	newren@gmail.com, avarab@gmail.com, dyroneteng@gmail.com,
	Johannes.Schindelin@gmx.de, szeder.dev@gmail.com,
	mjcheetham@outlook.com, Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 2/5] bundle-uri: create basic file-copy logic
Date: Wed, 27 Jul 2022 15:09:11 -0700
Message-ID: <YuG3h4AcZjyILvdK@google.com>
In-Reply-To: <7d3159f0d9a29a04b5856982f787f4eddfb91b53.1658781277.git.gitgitgadget@gmail.com>

On 2022.07.25 20:34, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> Before implementing a way to fetch bundles into a repository, create the
> basic logic. Assume that the URI is actually a file path. Future logic
> will make this more careful to other protocols.
> 
> For now, we also only succeed if the content at the URI is a bundle
> file, not a bundle list. Bundle lists will be implemented in a future
> change.
> 
> Note that the discovery of a temporary filename is slightly racy because
> the odb_mkstemp() relies on the temporary file not existing. With the
> current implementation being limited to file copies, we could replace
> the copy_file() with copy_fd(). The tricky part comes in future changes
> that send the filename to 'git remote-https' and its 'get' capability.

> At that point, we need the file descriptor closed _and_ the file
> unlinked.

Ahh, it looks like this was the point I missed in my previous review.
IIUC, we need the file unlinked because http_get_file() will eventually
call finalize_object_file() to move a tempfile to the final object name,
and that will fail if we have an empty file already in place.
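
To illustrate the failure mode, here is a minimal POSIX sketch, not Git's
code. It assumes the finalize step boils down to a link(2) from a temporary
name to the final name; reserve_name(), finalize_by_link() and demo() are
hypothetical helpers made up for this example. link(2) fails with EEXIST
when the destination already exists, which is why the empty placeholder has
to be unlinked first.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file from a mkstemp(3) template
 * and close it. If 'keep' is zero the file is also unlinked, so only the
 * (briefly racy) name is reserved. Returns 0 on success, -1 on error.
 */
static int reserve_name(char *template_buf, int keep)
{
	int fd = mkstemp(template_buf);
	if (fd < 0)
		return -1;
	close(fd);
	if (!keep && unlink(template_buf) < 0)
		return -1;
	return 0;
}

/*
 * Hypothetical stand-in for a "finalize" step: hard-link 'tmp' to 'dest'
 * and remove 'tmp'. Returns 0 on success, or the errno value from link(2).
 */
static int finalize_by_link(const char *tmp, const char *dest)
{
	if (link(tmp, dest) < 0)
		return errno;
	unlink(tmp);
	return 0;
}

/*
 * Reserve 'dest' (left on disk or unlinked), create a separate download
 * tempfile, then try to finalize it into 'dest'. Returns the result of
 * finalize_by_link(), or -1 if the setup itself failed.
 */
static int demo(int keep_placeholder)
{
	char dest[] = "/tmp/bundle_dest_XXXXXX";
	char tmp[] = "/tmp/bundle_tmp_XXXXXX";
	int ret;

	if (reserve_name(dest, keep_placeholder) < 0)
		return -1;
	if (reserve_name(tmp, 1) < 0) {
		unlink(dest);
		return -1;
	}
	ret = finalize_by_link(tmp, dest);
	unlink(tmp);	/* harmless if finalize already removed it */
	unlink(dest);
	return ret;
}
```

Calling demo(1) leaves the empty placeholder in place and returns EEXIST,
while demo(0) unlinks it first and returns 0.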


> If we were to keep the file descriptor open for the sake of
> normal file copies, then we would pollute the rest of the code for
> little benefit. This is especially the case because we expect that most
> bundle URI use will be based on HTTPS instead of file copies.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  Makefile     |   1 +
>  bundle-uri.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  bundle-uri.h |  14 +++++++
>  3 files changed, 119 insertions(+)
>  create mode 100644 bundle-uri.c
>  create mode 100644 bundle-uri.h
> 
> diff --git a/Makefile b/Makefile
> index 1624471badc..7d5f48069ea 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -906,6 +906,7 @@ LIB_OBJS += blob.o
>  LIB_OBJS += bloom.o
>  LIB_OBJS += branch.o
>  LIB_OBJS += bulk-checkin.o
> +LIB_OBJS += bundle-uri.o
>  LIB_OBJS += bundle.o
>  LIB_OBJS += cache-tree.o
>  LIB_OBJS += cbtree.o
> diff --git a/bundle-uri.c b/bundle-uri.c
> new file mode 100644
> index 00000000000..b35babc36aa
> --- /dev/null
> +++ b/bundle-uri.c
> @@ -0,0 +1,104 @@
> +#include "cache.h"
> +#include "bundle-uri.h"
> +#include "bundle.h"
> +#include "object-store.h"
> +#include "refs.h"
> +#include "run-command.h"
> +
> +static int find_temp_filename(struct strbuf *name)
> +{
> +	int fd;
> +	/*
> +	 * Find a temporary filename that is available. This is briefly
> +	 * racy, but unlikely to collide.
> +	 */
> +	fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
> +	if (fd < 0) {
> +		warning(_("failed to create temporary file"));
> +		return -1;
> +	}
> +
> +	close(fd);
> +	unlink(name->buf);
> +	return 0;
> +}
> +
> +static int copy_uri_to_file(const char *file, const char *uri)
> +{
> +	/* File-based URIs only for now. */
> +	return copy_file(file, uri, 0);
> +}
> +
> +static int unbundle_from_file(struct repository *r, const char *file)
> +{
> +	int result = 0;
> +	int bundle_fd;
> +	struct bundle_header header = BUNDLE_HEADER_INIT;
> +	struct string_list_item *refname;
> +	struct strbuf bundle_ref = STRBUF_INIT;
> +	size_t bundle_prefix_len;
> +
> +	if ((bundle_fd = read_bundle_header(file, &header)) < 0)
> +		return 1;
> +
> +	if ((result = unbundle(r, &header, bundle_fd, NULL)))
> +		return 1;
> +
> +	/*
> +	 * Convert all refs/heads/ from the bundle into refs/bundles/
> +	 * in the local repository.
> +	 */
> +	strbuf_addstr(&bundle_ref, "refs/bundles/");
> +	bundle_prefix_len = bundle_ref.len;
> +
> +	for_each_string_list_item(refname, &header.references) {
> +		struct object_id *oid = refname->util;
> +		struct object_id old_oid;
> +		const char *branch_name;
> +		int has_old;
> +
> +		if (!skip_prefix(refname->string, "refs/heads/", &branch_name))
> +			continue;
> +
> +		strbuf_setlen(&bundle_ref, bundle_prefix_len);
> +		strbuf_addstr(&bundle_ref, branch_name);
> +
> +		has_old = !read_ref(bundle_ref.buf, &old_oid);
> +		update_ref("fetched bundle", bundle_ref.buf, oid,
> +			   has_old ? &old_oid : NULL,
> +			   REF_SKIP_OID_VERIFICATION,
> +			   UPDATE_REFS_MSG_ON_ERR);
> +	}
> +
> +	bundle_header_release(&header);

We also still need to call strbuf_release(&bundle_ref) here, right?
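
For what it's worth, the shape I have in mind is roughly the sketch below.
It uses a tiny stand-in buffer type rather than Git's real strbuf API
(struct buf, the buf_* helpers and map_bundle_refs() are hypothetical and
only mimic the pattern): the prefix is written once, the buffer is truncated
and reused for every ref, and a single release on the way out frees the
allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer; not Git's strbuf. */
struct buf {
	char *data;
	size_t len;
	size_t alloc;
};

#define BUF_INIT { NULL, 0, 0 }

/* Append a NUL-terminated string, growing the allocation as needed. */
static int buf_addstr(struct buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > b->alloc) {
		size_t want = (b->len + n + 1) * 2;
		char *p = realloc(b->data, want);
		if (!p)
			return -1;
		b->data = p;
		b->alloc = want;
	}
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
	return 0;
}

/* Truncate to 'len' bytes; the allocation is kept for reuse. */
static void buf_setlen(struct buf *b, size_t len)
{
	if (len < b->len) {
		b->len = len;
		b->data[len] = '\0';
	}
}

/* Free the allocation and reset the buffer to its initial state. */
static void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = b->alloc = 0;
}

/*
 * Map each "refs/heads/<name>" to "refs/bundles/<name>", reusing one
 * buffer. Returns how many refs were mapped, or -1 on allocation failure.
 */
static int map_bundle_refs(const char **refs, size_t nr)
{
	struct buf bundle_ref = BUF_INIT;
	const char *prefix = "refs/heads/";
	size_t prefix_len, i;
	int mapped = 0;

	if (buf_addstr(&bundle_ref, "refs/bundles/") < 0) {
		mapped = -1;
		goto cleanup;
	}
	prefix_len = bundle_ref.len;

	for (i = 0; i < nr; i++) {
		if (strncmp(refs[i], prefix, strlen(prefix)))
			continue;	/* not a branch; skip it */
		buf_setlen(&bundle_ref, prefix_len);
		if (buf_addstr(&bundle_ref, refs[i] + strlen(prefix)) < 0) {
			mapped = -1;
			goto cleanup;
		}
		/* A real caller would update the ref named bundle_ref.data here. */
		mapped++;
	}

cleanup:
	buf_release(&bundle_ref);	/* single release on every exit path */
	return mapped;
}
```

Routing every exit through one cleanup label keeps the release in a single
place, which matches how fetch_bundle_uri() below already handles its
filename buffer.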


> +	return result;
> +}
> +
> +int fetch_bundle_uri(struct repository *r, const char *uri)
> +{
> +	int result = 0;
> +	struct strbuf filename = STRBUF_INIT;
> +
> +	if ((result = find_temp_filename(&filename)))
> +		goto cleanup;
> +
> +	if ((result = copy_uri_to_file(filename.buf, uri))) {
> +		warning(_("failed to download bundle from URI '%s'"), uri);
> +		goto cleanup;
> +	}
> +
> +	if ((result = !is_bundle(filename.buf, 0))) {
> +		warning(_("file at URI '%s' is not a bundle"), uri);
> +		goto cleanup;
> +	}
> +
> +	if ((result = unbundle_from_file(r, filename.buf))) {
> +		warning(_("failed to unbundle bundle from URI '%s'"), uri);
> +		goto cleanup;
> +	}
> +
> +cleanup:
> +	unlink(filename.buf);
> +	strbuf_release(&filename);
> +	return result;
> +}
> diff --git a/bundle-uri.h b/bundle-uri.h
> new file mode 100644
> index 00000000000..8a152f1ef14
> --- /dev/null
> +++ b/bundle-uri.h
> @@ -0,0 +1,14 @@
> +#ifndef BUNDLE_URI_H
> +#define BUNDLE_URI_H
> +
> +struct repository;
> +
> +/**
> + * Fetch data from the given 'uri' and unbundle the bundle data found
> + * based on that information.
> + *
> + * Returns non-zero if no bundle information is found at the given 'uri'.
> + */
> +int fetch_bundle_uri(struct repository *r, const char *uri);
> +
> +#endif
> -- 
> gitgitgadget
> 
