* [PATCH 0/2] partial-clone: fix two issues with sparse filter handling
@ 2019-08-28 20:18 Jon Simons
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
  0 siblings, 2 replies; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Included here are two fixes for partial cloning with sparse filters.
These issues were uncovered in early testing internally at GitHub,
where Taylor and Peff have provided early offlist review feedback.

Jon Simons (2):
  list-objects-filter: only parse sparse OID when 'have_git_dir'
  list-objects-filter: handle unresolved sparse filter OID

 list-objects-filter-options.c |  3 ++-
 list-objects-filter.c         |  6 +++++-
 t/t5616-partial-clone.sh      | 30 ++++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+), 2 deletions(-)

-- 
2.23.0.37.g745f681289.dirty

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 [PATCH 0/2] partial-clone: fix two issues with sparse filter handling Jon Simons
@ 2019-08-28 20:18 ` Jon Simons
  2019-08-28 21:10   ` Eric Sunshine
  2019-08-28 23:35   ` Jeff King
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
  1 sibling, 2 replies; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Fix a bug in partial cloning with sparse filters by ensuring to check
for 'have_git_dir' before attempting to resolve the sparse filter OID.

Otherwise the client will trigger:

    BUG: refs.c:1851: attempting to get main_ref_store outside of repository

when attempting to git clone with a sparse filter.

Note that this fix is the minimal one which avoids the BUG and allows
for the clone to complete successfully:

There is an open question as to whether there should be any attempt
to resolve the OID provided by the client in this context, as a filter
for the clone to be used on the remote side.  For cases where local
and remote OID resolutions differ, resolving on the client side could
be considered a bug.  For now, the minimal approach here is used to
unblock further testing for partial clones with sparse filters, while
a more invasive fix could make sense to pursue as a future direction.

t5616 is updated to demonstrate the change.

Signed-off-by: Jon Simons <jon@jonsimons.org>
---
 list-objects-filter-options.c |  3 ++-
 t/t5616-partial-clone.sh      | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 1cb20c659c..aaba312edb 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -71,7 +71,8 @@ static int gently_parse_list_objects_filter(
 		 * command, but DO NOT complain if we don't have the blob or
 		 * ref locally.
 		 */
-		if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB,
+		if (have_git_dir() &&
+		    !get_oid_with_context(the_repository, v0, GET_OID_BLOB,
 					  &sparse_oid, &oc))
 			filter_options->sparse_oid_value = oiddup(&sparse_oid);
 		filter_options->choice = LOFC_SPARSE_OID;
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 565254558f..6c3aa06973 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -241,6 +241,29 @@ test_expect_success 'fetch what is specified on CLI even if already promised' '
 	! grep "?$(cat blob)" missing_after
 '
 
+test_expect_success 'setup src repo for sparse filter' '
+	git init sparse-src &&
+	git -C sparse-src config --local uploadpack.allowfilter 1 &&
+	git -C sparse-src config --local uploadpack.allowanysha1inwant 1 &&
+	for n in 1 2 3 4
+	do
+		test_commit -C sparse-src "this-is-file-$n" file.$n.txt
+	done &&
+	echo "/file.1.txt" >> sparse-src/odd-files &&
+	echo "/file.3.txt" >> sparse-src/odd-files &&
+	echo "/file.2.txt" >> sparse-src/even-files &&
+	echo "/file.4.txt" >> sparse-src/even-files &&
+	echo "/*" >> sparse-src/all-files &&
+	git -C sparse-src add odd-files even-files all-files &&
+	git -C sparse-src commit -m "some sparse checkout files"
+'
+
+test_expect_success 'partial clone with sparse filter succeeds' '
+	git clone --no-local --no-checkout --filter=sparse:oid=master:all-files "file://$(pwd)/sparse-src" pc-all &&
+	git clone --no-local --no-checkout --filter=sparse:oid=master:even-files "file://$(pwd)/sparse-src" pc-even &&
+	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.23.0.37.g745f681289.dirty

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
@ 2019-08-28 21:10   ` Eric Sunshine
  2019-08-28 23:35   ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Sunshine @ 2019-08-28 21:10 UTC (permalink / raw)
  To: Jon Simons; +Cc: Git List, Taylor Blau, Jeff King

On Wed, Aug 28, 2019 at 4:27 PM Jon Simons <jon@jonsimons.org> wrote:
> Fix a bug in partial cloning with sparse filters by ensuring to check
> for 'have_git_dir' before attempting to resolve the sparse filter OID.
> [...]
> Signed-off-by: Jon Simons <jon@jonsimons.org>
> ---
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> @@ -241,6 +241,29 @@ test_expect_success 'fetch what is specified on CLI even if already promised' '
> +test_expect_success 'setup src repo for sparse filter' '
> +	git init sparse-src &&
> +	git -C sparse-src config --local uploadpack.allowfilter 1 &&
> +	git -C sparse-src config --local uploadpack.allowanysha1inwant 1 &&
> +	for n in 1 2 3 4
> +	do
> +		test_commit -C sparse-src "this-is-file-$n" file.$n.txt
> +	done &&

The way this is coded, a failure of the test_commit() invocation won't
fail the test overall. You need to do so manually:

    for n in 1 2 3 4
    do
        test_commit -C sparse-src "this-is-file-$n" file.$n.txt || return 1
    done &&

> +	echo "/file.1.txt" >> sparse-src/odd-files &&
> +	echo "/file.3.txt" >> sparse-src/odd-files &&
> +	echo "/file.2.txt" >> sparse-src/even-files &&
> +	echo "/file.4.txt" >> sparse-src/even-files &&

Simpler:

    test_write_lines /file.1.txt /file.3.txt >sparse-src/odd-files &&
    test_write_lines /file.2.txt /file.4.txt >sparse-src/even-files &&

> +	echo "/*" >> sparse-src/all-files &&

Style nit: drop whitespace following redirection operator.

And, using >> rather than just > here makes the test more confusing
than it need be; probably best to use >.

^ permalink raw reply	[flat|nested] 8+ messages in thread
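
To illustrate the point above about loops inside an &&-chain: the exit
status of a shell "for" loop is the status of the last command it ran, so
a failure in an earlier iteration is silently lost unless each iteration
checks its own result. The following is only a minimal sketch in plain sh.
The step() helper is a hypothetical stand-in for test_commit (it fails for
n=2 only), and the unchecked()/checked() functions are illustrative names,
not part of Git's test suite.

```shell
#!/bin/sh
# Hypothetical stand-in for test_commit: fails only when called with "2".
step () {
	test "$1" != 2
}

# No per-iteration check: the loop's status is that of the last
# iteration (n=4, which succeeds), so the failure at n=2 is lost.
unchecked () {
	for n in 1 2 3 4
	do
		step "$n"
	done &&
	echo "reached end"
}

# With "|| return 1" the first failing iteration aborts the function.
checked () {
	for n in 1 2 3 4
	do
		step "$n" || return 1
	done &&
	echo "reached end"
}

if unchecked_out=$(unchecked); then unchecked_rc=0; else unchecked_rc=$?; fi
if checked_out=$(checked); then checked_rc=0; else checked_rc=$?; fi

echo "unchecked: rc=$unchecked_rc out='$unchecked_out'"
echo "checked:   rc=$checked_rc out='$checked_out'"
```

The "|| return 1" form only works where "return" is valid, i.e. inside a
function; the suggestion in the review relies on the test body being
evaluated in such a context.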
* Re: [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir'
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
  2019-08-28 21:10   ` Eric Sunshine
@ 2019-08-28 23:35   ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2019-08-28 23:35 UTC (permalink / raw)
  To: Jon Simons; +Cc: git, me

On Wed, Aug 28, 2019 at 04:18:23PM -0400, Jon Simons wrote:

> Fix a bug in partial cloning with sparse filters by ensuring to check
> for 'have_git_dir' before attempting to resolve the sparse filter OID.
>
> Otherwise the client will trigger:
>
>     BUG: refs.c:1851: attempting to get main_ref_store outside of repository
>
> when attempting to git clone with a sparse filter.
>
> Note that this fix is the minimal one which avoids the BUG and allows
> for the clone to complete successfully:
>
> There is an open question as to whether there should be any attempt
> to resolve the OID provided by the client in this context, as a filter
> for the clone to be used on the remote side.  For cases where local
> and remote OID resolutions differ, resolving on the client side could
> be considered a bug.  For now, the minimal approach here is used to
> unblock further testing for partial clones with sparse filters, while
> a more invasive fix could make sense to pursue as a future direction.

Just to provide a little more of our findings to the list: I think the
main thing going on here is that the filter options-parsing code is
shared on the client and server side (and doesn't have any idea which
it is). That's why we see the "do not complain" comment in the context
below:

> --- a/list-objects-filter-options.c
> +++ b/list-objects-filter-options.c
> @@ -71,7 +71,8 @@ static int gently_parse_list_objects_filter(
>  		 * command, but DO NOT complain if we don't have the blob or
>  		 * ref locally.
>  		 */
> -		if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB,
> +		if (have_git_dir() &&
> +		    !get_oid_with_context(the_repository, v0, GET_OID_BLOB,
>  					  &sparse_oid, &oc))

and why it's OK to just quietly ignore this case.

I don't think it's hurting anything in practice. Whether we resolve the
name or not, we send the _original_ name to the other side (it would be
a bug for us to resolve it ourselves and send the oid).

> +test_expect_success 'partial clone with sparse filter succeeds' '
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:all-files "file://$(pwd)/sparse-src" pc-all &&
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:even-files "file://$(pwd)/sparse-src" pc-even &&
> +	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
> +'

Since you're using "--no-local", you should be able to just say
"sparse-src" without the full path or file URL.

I think Eric's style suggestions elsewhere in the thread were sensible,
too. And of course the code change itself looks good.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-28 20:18 [PATCH 0/2] partial-clone: fix two issues with sparse filter handling Jon Simons
  2019-08-28 20:18 ` [PATCH 1/2] list-objects-filter: only parse sparse OID when 'have_git_dir' Jon Simons
@ 2019-08-28 20:18 ` Jon Simons
  2019-08-29 13:12   ` Derrick Stolee
  1 sibling, 1 reply; 8+ messages in thread
From: Jon Simons @ 2019-08-28 20:18 UTC (permalink / raw)
  To: jon, git; +Cc: me, peff

Handle a potential NULL 'sparse_oid_value' when attempting to load
sparse filter exclusions by blob, to avoid segfaulting later during
'add_excludes_from_blob_to_list'.

While here, uniquify the errors emitted to distinguish between the
case that a given OID is NULL due to an earlier failure to resolve it,
and when an OID resolves but parsing the sparse filter spec fails.

t5616 is updated to demonstrate the change.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jon Simons <jon@jonsimons.org>
---
 list-objects-filter.c    | 6 +++++-
 t/t5616-partial-clone.sh | 7 +++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index 36e1f774bc..252fae5d4e 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -464,9 +464,13 @@ static void *filter_sparse_oid__init(
 {
 	struct filter_sparse_data *d = xcalloc(1, sizeof(*d));
 	d->omits = omitted;
+	if (!filter_options->sparse_oid_value)
+		die(_("unable to read sparse filter specification from %s"),
+		      filter_options->filter_spec);
 	if (add_excludes_from_blob_to_list(filter_options->sparse_oid_value,
 					   NULL, 0, &d->el) < 0)
-		die("could not load filter specification");
+		die(_("unable to parse sparse filter data in %s"),
+		      oid_to_hex(filter_options->sparse_oid_value));
 
 	ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
 	d->array_frame[d->nr].defval = 0; /* default to include */
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 6c3aa06973..0adb11f17b 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -264,6 +264,13 @@ test_expect_success 'partial clone with sparse filter succeeds' '
 	git clone --no-local --no-checkout --filter=sparse:oid=master:odd-files "file://$(pwd)/sparse-src" pc-odd
 '
 
+test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
+	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
+	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
+	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
+	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.23.0.37.g745f681289.dirty

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-28 20:18 ` [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID Jon Simons
@ 2019-08-29 13:12   ` Derrick Stolee
  2019-08-29 13:44     ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Derrick Stolee @ 2019-08-29 13:12 UTC (permalink / raw)
  To: Jon Simons, git; +Cc: me, peff

On 8/28/2019 4:18 PM, Jon Simons wrote:
> Handle a potential NULL 'sparse_oid_value' when attempting to load
> sparse filter exclusions by blob, to avoid segfaulting later during
> 'add_excludes_from_blob_to_list'.
>
> While here, uniquify the errors emitted to distinguish between the
> case that a given OID is NULL due to an earlier failure to resolve it,
> and when an OID resolves but parsing the sparse filter spec fails.

Adding localization here also seems like a good idea. Thanks!

-Stolee

> +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
> +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
> +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err

Just as a sanity check: when we use test_i18ngrep, how does it know how to
separate the part that is translated and which part is not?

translated: "unable to read sparse filter specification from"
not translated: "sparse:oid=master"

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-29 13:12   ` Derrick Stolee
@ 2019-08-29 13:44     ` Jeff King
  2019-08-29 14:28       ` Derrick Stolee
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2019-08-29 13:44 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jon Simons, git, me

On Thu, Aug 29, 2019 at 09:12:38AM -0400, Derrick Stolee wrote:

> > +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
> > +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
> > +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
> > +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
> > +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
>
> Just as a sanity check: when we use test_i18ngrep, how does it know how to
> separate the part that is translated and which part is not?
>
> translated: "unable to read sparse filter specification from"
> not translated: "sparse:oid=master"

It doesn't know. By default we run the suite in LOCALE=C and it checks
the whole string. Under a GETTEXT_POISON build, it checks nothing at
all.

The poison stuff is really about helping people not accidentally mark a
plumbing string (that we expect to get parsed by a machine) as
translatable. So the idea is you'd build with GETTEXT_POISON and then
run the test suite to see if anything breaks. But that means we also
have to annotate the test suite with "yes, I know this will be gibberish
in a poison build, but that's OK because it's meant for humans". And
that's what test_i18ngrep is.

test_i18ngrep could be more clever about matching the gibberish, but
there's not much point. The LOCALE=C run already covered the correctness
of checking the message.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread
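
The behaviour described above ("checks the whole string" in a normal run,
"checks nothing at all" in a poison build) can be sketched roughly as
follows. This is a deliberately simplified illustration and not the real
test_i18ngrep from Git's test library: the function name
i18n_grep_sketch, the POISON_SKETCH variable, and the scratch file name
are all hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical simplification of the idea behind test_i18ngrep.
# POISON_SKETCH stands in for "this is a GETTEXT_POISON build".
i18n_grep_sketch () {
	if test -n "$POISON_SKETCH"
	then
		# Poison build: translated messages are gibberish,
		# so do not check anything and report success.
		return 0
	fi
	# Normal run: match the whole pattern, translated and
	# untranslated parts alike.
	grep -e "$1" "$2" >/dev/null
}

err=i18n-sketch-err.txt
echo "fatal: unable to read sparse filter specification from sparse:oid=master:sparse-filter" >"$err"

POISON_SKETCH=
if i18n_grep_sketch "specification from sparse:oid=master:sparse-filter" "$err"
then normal_hit=pass; else normal_hit=fail; fi
if i18n_grep_sketch "no such message" "$err"
then normal_bogus=pass; else normal_bogus=fail; fi

POISON_SKETCH=1
if i18n_grep_sketch "no such message" "$err"
then poison_bogus=pass; else poison_bogus=fail; fi

rm -f "$err"
echo "normal_hit=$normal_hit normal_bogus=$normal_bogus poison_bogus=$poison_bogus"
```

In the normal run a pattern that is absent from the output fails the
check, while in the simulated poison run the same pattern passes because
nothing is examined; correctness of the message text is therefore only
verified by the normal run.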
* Re: [PATCH 2/2] list-objects-filter: handle unresolved sparse filter OID
  2019-08-29 13:44     ` Jeff King
@ 2019-08-29 14:28       ` Derrick Stolee
  0 siblings, 0 replies; 8+ messages in thread
From: Derrick Stolee @ 2019-08-29 14:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Jon Simons, git, me

On 8/29/2019 9:44 AM, Jeff King wrote:
> On Thu, Aug 29, 2019 at 09:12:38AM -0400, Derrick Stolee wrote:
>
>>> +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' '
>>> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master:sparse-filter "file://$(pwd)/sparse-src" sc1 2>err &&
>>> +	test_i18ngrep "unable to read sparse filter specification from sparse:oid=master:sparse-filter" err &&
>>> +	test_must_fail git clone --no-local --no-checkout --filter=sparse:oid=master "file://$(pwd)/sparse-src" sc2 2>err &&
>>> +	test_i18ngrep "unable to parse sparse filter data in $(git -C sparse-src rev-parse master)" err
>>
>> Just as a sanity check: when we use test_i18ngrep, how does it know how to
>> separate the part that is translated and which part is not?
>>
>> translated: "unable to read sparse filter specification from"
>> not translated: "sparse:oid=master"
>
> It doesn't know. By default we run the suite in LOCALE=C and it checks
> the whole string. Under a GETTEXT_POISON build, it checks nothing at
> all.
>
> The poison stuff is really about helping people not accidentally mark a
> plumbing string (that we expect to get parsed by a machine) as
> translatable. So the idea is you'd build with GETTEXT_POISON and then
> run the test suite to see if anything breaks. But that means we also
> have to annotate the test suite with "yes, I know this will be gibberish
> in a poison build, but that's OK because it's meant for humans". And
> that's what test_i18ngrep is.
>
> test_i18ngrep could be more clever about matching the gibberish, but
> there's not much point. The LOCALE=C run already covered the correctness
> of checking the message.

Thanks for clearing this up for me!

-Stolee

^ permalink raw reply	[flat|nested] 8+ messages in thread