git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 4/4] promisor-remote: teach lazy-fetch in any repo
Date: Mon, 7 Jun 2021 18:41:57 -0700	[thread overview]
Message-ID: <YL7K5fuoBPMAYZlu@google.com> (raw)
In-Reply-To: <b70a00b9b06e5142bb95891cfc2faa87d708ef0d.1622580781.git.jonathantanmy@google.com>

On Tue, Jun 01, 2021 at 02:34:19PM -0700, Jonathan Tan wrote:
> 
> This is one step towards supporting partial clone submodules.
> 
> Even after this patch, we will still lack partial clone submodules
> support, primarily because a lot of Git code that accesses submodule
> objects does so by adding their object stores as alternates, meaning
> that any lazy fetches that would occur in the submodule would be done
> based on the config of the superproject, not of the submodule. This also
> prevents testing of the functionality in this patch by user-facing
> commands. So for now, test this mechanism using a test helper.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  Makefile                      |  1 +
>  object-file.c                 |  7 ++-----
>  promisor-remote.c             | 14 +++++++++-----
>  t/helper/test-partial-clone.c | 34 ++++++++++++++++++++++++++++++++++
>  t/helper/test-tool.c          |  1 +
>  t/helper/test-tool.h          |  1 +
>  t/t0410-partial-clone.sh      | 24 ++++++++++++++++++++++++
>  7 files changed, 72 insertions(+), 10 deletions(-)
>  create mode 100644 t/helper/test-partial-clone.c
> 
> diff --git a/Makefile b/Makefile
> index c3565fc0f8..f6653bcd5e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -725,6 +725,7 @@ TEST_BUILTINS_OBJS += test-oidmap.o
>  TEST_BUILTINS_OBJS += test-online-cpus.o
>  TEST_BUILTINS_OBJS += test-parse-options.o
>  TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
> +TEST_BUILTINS_OBJS += test-partial-clone.o
>  TEST_BUILTINS_OBJS += test-path-utils.o
>  TEST_BUILTINS_OBJS += test-pcre2-config.o
>  TEST_BUILTINS_OBJS += test-pkt-line.o
> diff --git a/object-file.c b/object-file.c
> index f233b440b2..ebf273e9e7 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1570,15 +1570,12 @@ static int do_oid_object_info_extended(struct repository *r,
>  		}
>  
>  		/* Check if it is a missing object */
> -		if (fetch_if_missing && has_promisor_remote() &&
> -		    !already_retried && r == the_repository &&
> +		if (fetch_if_missing && repo_has_promisor_remote(r) &&
> +		    !already_retried &&

So we remove the explicit "if we've got promisors and are operating on
the repo we launched in" and instead ask "if the repo we're operating on
has promisors" - definitely a step towards in-process submodule
happiness :)

>  		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
>  			/*
>  			 * TODO Investigate checking promisor_remote_get_direct()
>  			 * TODO return value and stopping on error here.
> -			 * TODO Pass a repository struct through
> -			 * promisor_remote_get_direct(), such that arbitrary
> -			 * repositories work.
>  			 */
>  			promisor_remote_get_direct(r, real, 1);

And this seems like a stale comment, since I see we were already passing
'r' here. But arbitrary repositories still don't just work, right? Or, I
guess your point was "partial clone + submodules don't just work,
because of the alternates thing" - so maybe this part is OK?

> @@ -150,7 +156,7 @@ static void promisor_remote_init(struct repository *r)
>  		xcalloc(sizeof(*r->promisor_remote_config), 1);
>  	config->promisors_tail = &config->promisors;
>  
> -	git_config(promisor_remote_config, config);
> +	repo_config(r, promisor_remote_config, config);

Should this change have happened when we added 'r' to
promisor_remote_init? If r==the_repository then there's no difference
between these two calls, right?

> diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
> new file mode 100644
> index 0000000000..e7bc7eb21f
> --- /dev/null
> +++ b/t/helper/test-partial-clone.c
> @@ -0,0 +1,34 @@
> +#include "cache.h"
> +#include "test-tool.h"
> +#include "repository.h"
> +#include "object-store.h"
> +
> +static void object_info(const char *gitdir, const char *oid_hex)
> +{
> +	struct repository r;
> +	struct object_id oid;
> +	unsigned long size;
> +	struct object_info oi = {.sizep = &size};
> +	const char *p;
> +
> +	if (repo_init(&r, gitdir, NULL))
> +		die("could not init repo");
> +	if (parse_oid_hex(oid_hex, &oid, &p))
> +		die("could not parse oid");
> +	if (oid_object_info_extended(&r, &oid, &oi, 0))
> +		die("could not obtain object info");
> +	printf("%d\n", (int) size);
> +}
> +
> +int cmd__partial_clone(int argc, const char **argv)
> +{
> +	if (argc < 4)
> +		die("too few arguments");
> +
> +	if (!strcmp(argv[1], "object-info"))
> +		object_info(argv[2], argv[3]);
> +	else
> +		die("invalid argument '%s'", argv[1]);
> +
> +	return 0;
> +}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index c5bd0c6d4c..b21e8f1519 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -46,6 +46,7 @@ static struct test_cmd cmds[] = {
>  	{ "online-cpus", cmd__online_cpus },
>  	{ "parse-options", cmd__parse_options },
>  	{ "parse-pathspec-file", cmd__parse_pathspec_file },
> +	{ "partial-clone", cmd__partial_clone },
>  	{ "path-utils", cmd__path_utils },
>  	{ "pcre2-config", cmd__pcre2_config },
>  	{ "pkt-line", cmd__pkt_line },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index e8069a3b22..f845ced4b3 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -35,6 +35,7 @@ int cmd__oidmap(int argc, const char **argv);
>  int cmd__online_cpus(int argc, const char **argv);
>  int cmd__parse_options(int argc, const char **argv);
>  int cmd__parse_pathspec_file(int argc, const char** argv);
> +int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__pcre2_config(int argc, const char **argv);
>  int cmd__pkt_line(int argc, const char **argv);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 584a039b85..e804d267e6 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -604,6 +604,30 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
>  	git -C repo cherry-pick side1
>  '
>  
> +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> +	rm -rf full partial.git &&
> +	test_create_repo full &&
> +	printf 12345 >full/file.txt &&
> +	git -C full add file.txt &&
> +	git -C full commit -m "first commit" &&
I think there is some test_commit or similar function here that's more
commonly used, right?

> +
> +	test_config -C full uploadpack.allowfilter 1 &&
> +	test_config -C full uploadpack.allowanysha1inwant 1 &&
I wasn't sure what these configs are for, but it looks like .allowfilter
is to allow 'full' to serve as a remote to a partial clone. But what do
you need .allowAnySha1InWant for here? Are we expecting to ask for SHAs
that 'full' doesn't have?

> +	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> +	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> +
> +	# Sanity check that the file is missing
> +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +	grep "[?]$FILE_HASH" out &&
> +
> +	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> +	test "$OUT" -eq 5 &&

Hm. I guess I am confused about why this fetches the desired object into
partial.git. Maybe the test-helper needs a comment (and maybe here too)
on the line where fetch will be triggered?

> +
> +	# Sanity check that the file is now present
> +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +	! grep "[?]$FILE_HASH" out

 - Emily

  parent reply	other threads:[~2021-06-08  1:42 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 21:34 [PATCH 0/4] First steps towards partial clone submodules Jonathan Tan
2021-06-01 21:34 ` [PATCH 1/4] promisor-remote: read partialClone config here Jonathan Tan
2021-06-04 19:56   ` Taylor Blau
2021-06-05  1:38     ` Jonathan Tan
2021-06-07 22:41   ` Emily Shaffer
2021-06-01 21:34 ` [PATCH 2/4] promisor-remote: support per-repository config Jonathan Tan
2021-06-04 20:09   ` Taylor Blau
2021-06-05  1:43     ` Jonathan Tan
2021-06-04 21:21   ` Elijah Newren
2021-06-05  1:54     ` Jonathan Tan
2021-06-08  0:48   ` Emily Shaffer
2021-06-01 21:34 ` [PATCH 3/4] run-command: move envvar-resetting function Jonathan Tan
2021-06-04 20:19   ` Taylor Blau
2021-06-05  1:57     ` Jonathan Tan
2021-06-08  0:54   ` Emily Shaffer
2021-06-01 21:34 ` [PATCH 4/4] promisor-remote: teach lazy-fetch in any repo Jonathan Tan
2021-06-04 21:25   ` Taylor Blau
2021-06-05  2:11     ` Jonathan Tan
2021-06-04 21:35   ` Elijah Newren
2021-06-05  2:16     ` Jonathan Tan
2021-06-05  3:48     ` Elijah Newren
2021-06-05  0:22   ` Elijah Newren
2021-06-05  2:16     ` Jonathan Tan
2021-06-08  1:41   ` Emily Shaffer [this message]
2021-06-09  4:52     ` Jonathan Tan
2021-06-08  0:25 ` [PATCH v2 0/4] First steps towards partial clone submodules Jonathan Tan
2021-06-08  0:25   ` [PATCH v2 1/4] promisor-remote: read partialClone config here Jonathan Tan
2021-06-08  3:18     ` Junio C Hamano
2021-06-09  4:26       ` Jonathan Tan
2021-06-09  9:30         ` Junio C Hamano
2021-06-09 17:16           ` Jonathan Tan
2021-06-08 17:28     ` Elijah Newren
2021-06-09  4:44       ` Jonathan Tan
2021-06-09  5:34         ` Elijah Newren
2021-06-10 17:25           ` Jonathan Tan
2021-06-08  0:25   ` [PATCH v2 2/4] promisor-remote: support per-repository config Jonathan Tan
2021-06-08  3:30     ` Junio C Hamano
2021-06-09  4:29       ` Jonathan Tan
2021-06-08  0:25   ` [PATCH v2 3/4] run-command: move envvar-resetting function Jonathan Tan
2021-06-08  4:14     ` Junio C Hamano
2021-06-09  4:32       ` Jonathan Tan
2021-06-09  5:28         ` Junio C Hamano
2021-06-09 18:15           ` Jonathan Tan
2021-06-08  0:25   ` [PATCH v2 4/4] promisor-remote: teach lazy-fetch in any repo Jonathan Tan
2021-06-08  4:33     ` Junio C Hamano
2021-06-09  4:39       ` Jonathan Tan
2021-06-09  5:33         ` Junio C Hamano
2021-06-09 18:20           ` Jonathan Tan
2021-06-10  1:26             ` Junio C Hamano
2021-06-08 17:42     ` Elijah Newren
2021-06-09  4:46       ` Jonathan Tan
2021-06-08 17:50   ` [PATCH v2 0/4] First steps towards partial clone submodules Elijah Newren
2021-06-08 23:42     ` Junio C Hamano
2021-06-09  0:07       ` Elijah Newren
2021-06-09  0:18         ` Junio C Hamano
2021-06-09  4:58     ` Jonathan Tan
2021-06-08  1:44 ` [PATCH " Emily Shaffer
2021-06-10 17:35 ` [PATCH v3 0/5] " Jonathan Tan
2021-06-10 17:35   ` [PATCH v3 1/5] repository: move global r_f_p_c to repo struct Jonathan Tan
2021-06-10 20:47     ` Elijah Newren
2021-06-10 17:35   ` [PATCH v3 2/5] promisor-remote: support per-repository config Jonathan Tan
2021-06-10 17:35   ` [PATCH v3 3/5] submodule: refrain from filtering GIT_CONFIG_COUNT Jonathan Tan
2021-06-10 21:13     ` Elijah Newren
2021-06-10 21:51       ` Jeff King
2021-06-11 17:02         ` Jonathan Tan
2021-06-10 17:35   ` [PATCH v3 4/5] run-command: refactor subprocess env preparation Jonathan Tan
2021-06-10 21:21     ` Elijah Newren
2021-06-10 17:35   ` [PATCH v3 5/5] promisor-remote: teach lazy-fetch in any repo Jonathan Tan
2021-06-10 21:29   ` [PATCH v3 0/5] First steps towards partial clone submodules Elijah Newren
2021-06-15 21:22     ` Elijah Newren
2021-06-17 17:13 ` [PATCH v4 " Jonathan Tan
2021-06-17 17:13   ` [PATCH v4 1/5] repository: move global r_f_p_c to repo struct Jonathan Tan
2021-06-17 17:13   ` [PATCH v4 2/5] promisor-remote: support per-repository config Jonathan Tan
2021-06-17 17:13   ` [PATCH v4 3/5] submodule: refrain from filtering GIT_CONFIG_COUNT Jonathan Tan
2021-06-17 17:13   ` [PATCH v4 4/5] run-command: refactor subprocess env preparation Jonathan Tan
2021-06-17 17:13   ` [PATCH v4 5/5] promisor-remote: teach lazy-fetch in any repo Jonathan Tan
2021-06-19 20:01   ` [PATCH v4 0/5] First steps towards partial clone submodules Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YL7K5fuoBPMAYZlu@google.com \
    --to=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).