git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/2] partial-clone: fix two issues with sparse filter handling
@ 2019-08-28 20:18 Jon Simons
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
  0 siblings, 2 replies; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Included here are two fixes for partial cloning with sparse filters.
These issues were uncovered in early testing internally at GitHub,
where Taylor and Peff have provided early offlist review feedback.

Jon Simons (2):
  list-objects-filter: only parse sparse OID when 'have_git_dir'
  list-objects-filter: handle unresolved sparse filter OID

 list-objects-filter-options.c |  3 ++-
 list-objects-filter.c         |  6 +++++-
 t/t5616-partial-clone.sh      | 30 ++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+), 2 deletions(-)

-- 
2.23.0.37.g745f681289.dirty



* [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 [PATCH 0/2] partial-clone: fix two issues with sparse filter handling Jon Simons
@ 2019-08-28 20:18 ` Jon Simons
  2019-08-28 21:10   ` Eric Sunshine
  2019-08-28 23:35   ` Jeff King
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
  1 sibling, 2 replies; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Fix a bug in partial cloning with sparse filters by ensuring to check
for 'have_git_dir' before attempting to resolve the sparse filter OID.

Otherwise the client will trigger:

    BUG: refs.c:1851: attempting to get main_ref_store outside of repository

when attempting to git clone with a sparse filter.

Note that this fix is the minimal one which avoids the BUG and allows
for the clone to complete successfully:

There is an open question as to whether there should be any attempt
to resolve the OID provided by the client in this context, as a filter
for the clone to be used on the remote side.  For cases where local
and remote OID resolutions differ, resolving on the client side could
be considered a bug.  For now, the minimal approach here is used to
unblock further testing for partial clones with sparse filters, while
a more invasive fix could make sense to pursue as a future direction.

t5616 is updated to demonstrate the change.

Signed-off-by: Jon Simons <jon@jonsimons.org>
---
 list-objects-filter-options.c |  3 ++-
 t/t5616-partial-clone.sh      | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 1cb20c659c..aaba312edb 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -71,7 +71,8 @@ static int gently_parse_list_objects_filter(
 		 * command, but DO NOT complain if we don't have the blob or
 		 * ref locally.
 		 */
-		if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB,
+		if (have_git_dir() &&
+		    !get_oid_with_context(the_repository, v0, GET_OID_BLOB,
 					  &sparse_oid, &oc))
 			filter_options->sparse_oid_value = oiddup(&sparse_oid);
 		filter_options->choice = LOFC_SPARSE_OID;
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 565254558f..6c3aa06973 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -241,6 +241,29 @@ test_expect_success 'fetch what is specified on CLI even if already promised' '
 	! grep "?$(cat blob)" missing_after
 '
 
+test_expect_success 'setup src repo for sparse filter' '
+	git init sparse-src &&
+	git -C sparse-src config --local uploadpack.allowfilter 1 &&
+	git -C sparse-src config --local uploadpack.allowanysha1inwant 1 &&
+	for n in 1 2 3 4
+	do
+		test_commit -C sparse-src "this-is-file-$n" file.$n.txt
+	done &&
+	echo "/file.1.txt" >> sparse-src/odd-files &&
+	echo "/file.3.txt" >> sparse-src/odd-files &&
+	echo "/file.2.txt" >> sparse-src/even-files &&
+	echo "/file.4.txt" >> sparse-src/even-files &&
+	echo "/*" >> sparse-src/all-files &&
+	git -C sparse-src add odd-files even-files all-files &&
+	git -C sparse-src commit -m "some sparse checkout files"
+'
+
+test_expect_success 'partial clone with sparse filter succeeds' '
+	git clone --no-local --no-checkout --filter=sparse:oid=master:all-files "file://$(pwd)/sparse-src" pc-all &&
+	git clone --no-local --no-checkout --filter=sparse:oid=master:even-files "file://$(pwd)/sparse-src" pc-even &&
+	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.23.0.37.g745f681289.dirty



* [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-28 20:18 [PATCH 0/2] partial-clone: fix two issues with sparse filter handling Jon Simons
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
@ 2019-08-28 20:18 ` Jon Simons
  2019-08-29 13:12   ` Derrick Stolee
  1 sibling, 1 reply; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Handle a potential NULL 'sparse_oid_value' when attempting to load
sparse filter exclusions by blob, to avoid segfaulting later during
'add_excludes_from_blob_to_list'.

While here, uniquify the errors emitted to distinguish between the
case that a given OID is NULL due to an earlier failure to resolve it,
and when an OID resolves but parsing the sparse filter spec fails.

t5616 is updated to demonstrate the change.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jon Simons <jon@jonsimons.org>
---
 list-objects-filter.c    | 6 +++++-
 t/t5616-partial-clone.sh | 7 +++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index 36e1f774bc..252fae5d4e 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -464,9 +464,13 @@ static void *filter_sparse_oid__init(
 {
 	struct filter_sparse_data *d = xcalloc(1, sizeof(*d));
 	d->omits = omitted;
+	if (!filter_options->sparse_oid_value)
+		die(_("unable to read sparse filter specification from %s"),
+		      filter_options->filter_spec);
 	if (add_excludes_from_blob_to_list(filter_options->sparse_oid_value,
 					   NULL, 0, &d->el) < 0)
-		die("could not load filter specification");
+		die(_("unable to parse sparse filter data in %s"),
+		      oid_to_hex(filter_options->sparse_oid_value));
 
 	ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
 	d->array_frame[d->nr].defval = 0; /* default to include */
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 6c3aa06973..0adb11f17b 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -264,6 +264,13 @@ test_expect_success 'partial clone with sparse filter succeeds' '
 	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
 '
 
+test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
+	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
+	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
+	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
+	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.23.0.37.g745f681289.dirty



* Re: [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
@ 2019-08-28 21:10   ` Eric Sunshine
  2019-08-28 23:35   ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Sunshine @ 2019-08-28 21:10 UTC (permalink / raw)
  To: Jon Simons; +Cc: Git List, Taylor Blau, Jeff King

On Wed, Aug 28, 2019 at 4:27 PM Jon Simons <jon@jonsimons.org> wrote:
> Fix a bug in partial cloning with sparse filters by ensuring to check
> for 'have_git_dir' before attempting to resolve the sparse filter OID.
> [...]
> Signed-off-by: Jon Simons <jon@jonsimons.org>
> ---
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> @@ -241,6 +241,29 @@ test_expect_success 'fetch what is specified on CLI even if already promised' '
> +test_expect_success 'setup src repo for sparse filter' '
> +       git init sparse-src &&
> +       git -C sparse-src config --local uploadpack.allowfilter 1 &&
> +       git -C sparse-src config --local uploadpack.allowanysha1inwant 1 &&
> +       for n in 1 2 3 4
> +       do
> +               test_commit -C sparse-src "this-is-file-$n" file.$n.txt
> +       done &&

The way this is coded, a failure of the test_commit() invocation won't
fail the test overall. You need to do so manually:

    for n in 1 2 3 4
    do
        test_commit -C sparse-src "this-is-file-$n" file.$n.txt || return 1
    done &&
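
To illustrate why the "|| return 1" matters, here is a minimal,
self-contained sketch. The function names and the "step" helper are
hypothetical stand-ins (not part of the test suite); "step" plays the
role of test_commit and fails only on the second iteration.

```shell
#!/bin/sh
# Hypothetical stand-in for a test step: fails only for n=2.
step () {
	test "$1" != 2
}

# Unchecked: the loop's exit status is that of the LAST iteration only,
# so the failure for n=2 is silently lost.
loop_unchecked () {
	for n in 1 2 3
	do
		step "$n"
	done &&
	echo "unchecked: reached the end"
}

# Checked: leave the function on the first failing iteration.
loop_checked () {
	for n in 1 2 3
	do
		step "$n" || return 1
	done &&
	echo "checked: reached the end"
}

loop_unchecked
loop_checked || echo "checked: stopped at the first failure"
```

The "return 1" works in a real test because the test body is evaluated
inside a shell function, just as the loops are here.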

> +       echo "/file.1.txt" >> sparse-src/odd-files &&
> +       echo "/file.3.txt" >> sparse-src/odd-files &&
> +       echo "/file.2.txt" >> sparse-src/even-files &&
> +       echo "/file.4.txt" >> sparse-src/even-files &&

Simpler:

    test_write_lines /file.1.txt /file.3.txt >sparse-src/odd-files &&
    test_write_lines /file.2.txt /file.4.txt >sparse-src/even-files &&

> +       echo "/*" >> sparse-src/all-files &&

Style nit: drop whitespace following redirection operator.

And, using >> rather than just > here makes the test more confusing
than it need be; probably best to use >.
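
A small sketch of the difference, assuming a scratch directory from
mktemp. "write_lines" below is a simplified stand-in for the suite's
test_write_lines helper (one argument per output line); the file names
inside the scratch directory are illustrative only.

```shell
#!/bin/sh
# Simplified stand-in for test_write_lines: print each argument on its
# own line.
write_lines () {
	printf "%s\n" "$@"
}

dir=$(mktemp -d) || exit 1

# Appending: running the same step twice duplicates the content.
echo "/*" >>"$dir/appended"
echo "/*" >>"$dir/appended"

# Truncating: the file holds exactly what the last command wrote.
echo "/*" >"$dir/truncated"
echo "/*" >"$dir/truncated"

# One call replaces a pair of appending echo commands.
write_lines /file.1.txt /file.3.txt >"$dir/odd-files"

# Arithmetic expansion strips any padding that wc adds on some systems.
appended_count=$(($(wc -l <"$dir/appended")))
truncated_count=$(($(wc -l <"$dir/truncated")))
odd_content=$(cat "$dir/odd-files")

echo "appended lines: $appended_count"
echo "truncated lines: $truncated_count"
rm -rf "$dir"
```

With ">" the resulting file does not depend on whatever an earlier
step may have left behind, which is what makes the test easier to
reason about.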


* Re: [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
  2019-08-28 21:10   ` Eric Sunshine
@ 2019-08-28 23:35   ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2019-08-28 23:35 UTC (permalink / raw)
  To: Jon Simons; +Cc: git, me

On Wed, Aug 28, 2019 at 04:18:23PM -0400, Jon Simons wrote:

> Fix a bug in partial cloning with sparse filters by ensuring to check
> for 'have_git_dir' before attempting to resolve the sparse filter OID.
> 
> Otherwise the client will trigger:
> 
>     BUG: refs.c:1851: attempting to get main_ref_store outside of repository
> 
> when attempting to git clone with a sparse filter.
> 
> Note that this fix is the minimal one which avoids the BUG and allows
> for the clone to complete successfully:
> 
> There is an open question as to whether there should be any attempt
> to resolve the OID provided by the client in this context, as a filter
> for the clone to be used on the remote side.  For cases where local
> and remote OID resolutions differ, resolving on the client side could
> be considered a bug.  For now, the minimal approach here is used to
> unblock further testing for partial clones with sparse filters, while
> a more invasive fix could make sense to pursue as a future direction.

Just to provide a little more of our findings to the list: I think the
main thing going on here is that the filter options-parsing code is
shared on the client and server side (and doesn't have any idea which it
is). That's why we see the "do not complain" comment in the context
below:

> --- a/list-objects-filter-options.c
> +++ b/list-objects-filter-options.c
> @@ -71,7 +71,8 @@ static int gently_parse_list_objects_filter(
>  		 * command, but DO NOT complain if we don't have the blob or
>  		 * ref locally.
>  		 */
> -		if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB,
> +		if (have_git_dir() &&
> +		    !get_oid_with_context(the_repository, v0, GET_OID_BLOB,
>  					  &sparse_oid, &oc))

and why it's OK to just quietly ignore this case. I don't think it's
hurting anything in practice. Whether we resolve the name or not, we
send the _original_ name to the other side (it would be a bug for us to
resolve it ourselves and send the oid).

> +test_expect_success 'partial clone with sparse filter succeeds' '
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:all-files "file://$(pwd)/sparse-src" pc-all &&
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:even-files "file://$(pwd)/sparse-src" pc-even &&
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
> +'

Since you're using "--no-local", you should be able to just say
"sparse-src" without the full path or file URL.

I think Eric's style suggestions elsewhere in the thread were sensible,
too. And of course the code change itself looks good.

-Peff


* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
@ 2019-08-29 13:12   ` Derrick Stolee
  2019-08-29 13:44     ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Derrick Stolee @ 2019-08-29 13:12 UTC (permalink / raw)
  To: Jon Simons, git; +Cc: me, peff

On 8/28/2019 4:18 PM, Jon Simons wrote:
> Handle a potential NULL 'sparse_oid_value' when attempting to load
> sparse filter exclusions by blob, to avoid segfaulting later during
> 'add_excludes_from_blob_to_list'.
> 
> While here, uniquify the errors emitted to distinguish between the
> case that a given OID is NULL due to an earlier failure to resolve it,
> and when an OID resolves but parsing the sparse filter spec fails.

Adding localization here also seems like a good idea. Thanks!

-Stolee
> +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
> +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
> +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err

Just as a sanity check: when we use test_i18ngrep, how does it know how to
separate the part that is translated and which part is not?

	translated: "unable to read sparse filter specification from"
	not translated: "sparse:oid=master"

Thanks,
-Stolee


* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-29 13:12   ` Derrick Stolee
@ 2019-08-29 13:44     ` Jeff King
  2019-08-29 14:28       ` Derrick Stolee
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2019-08-29 13:44 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jon Simons, git, me

On Thu, Aug 29, 2019 at 09:12:38AM -0400, Derrick Stolee wrote:

> > +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
> > +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
> > +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
> > +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
> > +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
> 
> Just as a sanity check: when we use test_i18ngrep, how does it know how to
> separate the part that is translated and which part is not?
> 
> 	translated: "unable to read sparse filter specification from"
> 	not translated: "sparse:oid=master"

It doesn't know. By default we run the suite in LOCALE=C and it checks
the whole string. Under a GETTEXT_POISON build, it checks nothing at
all.

The poison stuff is really about helping people not accidentally mark a
plumbing string (that we expect to get parsed by a machine) as
translatable. So the idea is you'd build with GETTEXT_POISON and then
run the test suite to see if anything breaks. But that means we also
have to annotate the test suite with "yes, I know this will be gibberish
in a poison build, but that's OK because it's meant for humans". And
that's what test_i18ngrep is.

test_i18ngrep could be more clever about matching the gibberish, but
there's not much point. The LOCALE=C run already covered the correctness
of checking the message.
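
As a rough model of that behavior (a hypothetical simplification, not
the real helper): "sketch_i18ngrep" and the SKETCH_POISON variable
below are made up for illustration, and the sample error line is
written by hand rather than produced by git.

```shell
#!/bin/sh
# Hypothetical, simplified model of a poison-aware grep wrapper.
# SKETCH_POISON is an illustrative variable, not a real Git test knob.
sketch_i18ngrep () {
	if test -n "$SKETCH_POISON"
	then
		# Translated output is gibberish here, so pretend success.
		return 0
	fi
	grep "$@"
}

err=$(mktemp) || exit 1
# Hand-written sample of the message the test looks for.
echo "fatal: unable to read sparse filter specification from sparse:oid=master:sparse-filter" >"$err"

# Normal run: the whole pattern, translated part and all, must match.
SKETCH_POISON=
sketch_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" "$err" >/dev/null &&
echo "normal run: whole string matched"

# Poison run: nothing is checked, so even a bogus pattern "passes".
SKETCH_POISON=1
sketch_i18ngrep "this text is not in the file" "$err" &&
echo "poison run: check skipped"
rm -f "$err"
```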

-Peff


* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-29 13:44     ` Jeff King
@ 2019-08-29 14:28       ` Derrick Stolee
  0 siblings, 0 replies; 8+ messages in thread
From: Derrick Stolee @ 2019-08-29 14:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Jon Simons, git, me

On 8/29/2019 9:44 AM, Jeff King wrote:
> On Thu, Aug 29, 2019 at 09:12:38AM -0400, Derrick Stolee wrote:
> 
>>> +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
>>> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
>>> +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
>>> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
>>> +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
>>
>> Just as a sanity check: when we use test_i18ngrep, how does it know how to
>> separate the part that is translated and which part is not?
>>
>> 	translated: "unable to read sparse filter specification from"
>> 	not translated: "sparse:oid=master"
> 
> It doesn't know. By default we run the suite in LOCALE=C and it checks
> the whole string. Under a GETTEXT_POISON build, it checks nothing at
> all.
> 
> The poison stuff is really about helping people not accidentally mark a
> plumbing string (that we expect to get parsed by a machine) as
> translatable. So the idea is you'd build with GETTEXT_POISON and then
> run the test suite to see if anything breaks. But that means we also
> have to annotate the test suite with "yes, I know this will be gibberish
> in a poison build, but that's OK because it's meant for humans". And
> that's what test_i18ngrep is.
> 
> test_i18ngrep could be more clever about matching the gibberish, but
> there's not much point. The LOCALE=C run already covered the correctness
> of checking the message.

Thanks for clearing this up for me!

-Stolee
