git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Document two partial clone bugs, fix one
@ 2020-02-19 16:21 Derrick Stolee via GitGitGadget
  2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-19 16:21 UTC (permalink / raw)
  To: git; +Cc: peff, me, jonathantanmy, jrnieder, Derrick Stolee

While playing with partial clone, I discovered a few bugs and document them
with tests in patch 1. One seems to be a server-side bug that happens in a
somewhat rare situation, but not terribly unlikely. The other is a
client-side bug that leads to quadratic amounts of data transfer; I fix this
bug in patch 2.

Thanks, -Stolee

Derrick Stolee (2):
  partial-clone: demonstrate bugs in partial fetch
  partial-clone: avoid fetching when looking for objects

 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 5 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-556%2Fderrickstolee%2Fpartial-clone-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-556/derrickstolee/partial-clone-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/556
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
@ 2020-02-19 16:21 ` Derrick Stolee via GitGitGadget
  2020-02-19 18:38   ` Eric Sunshine
  2020-02-19 20:52   ` Junio C Hamano
  2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-19 16:21 UTC (permalink / raw)
  To: git; +Cc: peff, me, jonathantanmy, jrnieder, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

While testing partial clone, I noticed some odd behavior. I was testing
a way of running 'git init', followed by manually configuring the remote
for partial clone, and then running 'git fetch'. Astonishingly, I saw
the 'git fetch' process start asking the server for multiple rounds of
pack-file downloads! When tweaking the situation a little more, I
discovered that I could cause the remote to hang up with an error.

Add two tests that demonstrate these two issues.

In the first test, we find that when fetching with blob filters from
a repository that previously did not have any tags, the 'git fetch
--tags origin' command fails because the server sends "multiple
filter-specs cannot be combined".

In the second test, we see that a 'git fetch origin' request with
several ref updates results in multiple pack-file downloads. This must
be due to Git trying to fault-in the objects pointed by the refs. What
makes this matter particularly nasty is that this goes through the
do_oid_object_info_extended() method, so there are no "haves" in the
negotiation. This leads the remote to send every reachable commit and
tree from each new ref, providing a quadratic amount of data transfer!
This test is fixed if we revert 6462d5eb9a (fetch: remove
fetch_if_missing=0, 2019-11-05), but that revert causes other test
failures. The real fix will need more care.

The tests are ordered in this way because if I swap the test order the
tag test will succeed instead of fail. I believe this is because somehow
we need the srv.bare repo to not have any tags when we clone, but then
have tags in our next fetch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t5616-partial-clone.sh | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index fea56cda6d3..ed2ef45c37a 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_failure 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] partial-clone: avoid fetching when looking for objects
  2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
  2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
@ 2020-02-19 16:21 ` Derrick Stolee via GitGitGadget
  2020-02-19 18:10   ` Jonathan Tan
  2020-02-19 21:10 ` [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee
  2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
  3 siblings, 1 reply; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-19 16:21 UTC (permalink / raw)
  To: git; +Cc: peff, me, jonathantanmy, jrnieder, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When using partial-clone, do_oid_object_info_extended() can trigger a
fetch for missing objects. This can be extremely expensive when asking
for a tag or commit, as we are completely removed from the context of
the missing object and thus supply no "haves" in the request.

6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
global variable that prevented these fetches in favor of a bitflag.
However, some object existence checks were not updated to use this flag.

Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
repreparing the pack-file structures. We need to be extremely careful
about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
due to updated refs.

This resolves a broken test in t5616-partial-clone.sh.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b4c6d921d06..fd69c4f69d7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -335,6 +335,7 @@ static void find_non_local_tags(const struct ref *refs,
 	struct string_list_item *remote_ref_item;
 	const struct ref *ref;
 	struct refname_hash_entry *item = NULL;
+	const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
 
 	refname_hash_init(&existing_refs);
 	refname_hash_init(&remote_refs);
@@ -353,10 +354,9 @@ static void find_non_local_tags(const struct ref *refs,
 		 */
 		if (ends_with(ref->name, "^{}")) {
 			if (item &&
-			    !has_object_file_with_flags(&ref->old_oid,
-							OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&ref->old_oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &ref->old_oid) &&
-			    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&item->oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &item->oid))
 				clear_item(item);
 			item = NULL;
@@ -370,7 +370,7 @@ static void find_non_local_tags(const struct ref *refs,
 		 * fetch.
 		 */
 		if (item &&
-		    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+		    !has_object_file_with_flags(&item->oid, quick_flags) &&
 		    !oidset_contains(&fetch_oids, &item->oid))
 			clear_item(item);
 
@@ -391,7 +391,7 @@ static void find_non_local_tags(const struct ref *refs,
 	 * checked to see if it needs fetching.
 	 */
 	if (item &&
-	    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+	    !has_object_file_with_flags(&item->oid, quick_flags) &&
 	    !oidset_contains(&fetch_oids, &item->oid))
 		clear_item(item);
 
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index ed2ef45c37a..c70516734d5 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -385,7 +385,7 @@ test_expect_failure 'verify fetch succeeds when asking for new tags' '
 	git -C tag-test fetch --tags origin
 '
 
-test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+test_expect_success 'verify fetch downloads only one pack when updating refs' '
 	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
 	ls pack-test/.git/objects/pack/*pack >pack-list &&
 	test_line_count = 2 pack-list &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] partial-clone: avoid fetching when looking for objects
  2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
@ 2020-02-19 18:10   ` Jonathan Tan
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Tan @ 2020-02-19 18:10 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, peff, me, jonathantanmy, jrnieder, dstolee

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> When using partial-clone, do_oid_object_info_extended() can trigger a
> fetch for missing objects. This can be extremely expensive when asking
> for a tag or commit, as we are completely removed from the context of
> the missing object and thus supply no "haves" in the request.
> 
> 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
> global variable that prevented these fetches in favor of a bitflag.
> However, some object existence checks were not updated to use this flag.
> 
> Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
> addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
> repreparing the pack-file structures. We need to be extremely careful
> about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
> due to updated refs.
> 
> This resolves a broken test in t5616-partial-clone.sh.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

Thanks for catching this.

I wonder if the commit message in this patch could be better worded -
the first paragraph seems to say that fetching missing commits and tags
are expensive, but that is not the problem here; the problem is that the
client lazily fetches refs advertised by the server, thinking that it is
lacking them due to partial clone, even when there is no expectation
that the client have them (so the commits and tags are not truly
missing).

So I would reword the first paragraph as:

  When using partial clone, find_non_local_tags() in builtin/fetch.c
  checks each remote tag to see if its object also exists locally. There
  is no expectation that the object exist locally, but this function
  nevertheless triggers a lazy fetch if the object does not exist. This
  can be extremely expensive when asking for a commit, as we are
  completely removed from the context of the non-existent object and
  thus supply no "haves" in the request.

All this rests on my thinking that "missing" has the connotation (or
actual meaning) that we expect the object to be there. If we think that
"missing" can also mean that the remote has it but the local doesn't,
then you can ignore what I just said :-)

Other than that, both patches look good to me.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
@ 2020-02-19 18:38   ` Eric Sunshine
  2020-02-19 20:42     ` Derrick Stolee
  2020-02-19 20:52   ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Eric Sunshine @ 2020-02-19 18:38 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git List, Jeff King, Taylor Blau, Jonathan Tan, Jonathan Nieder,
	Derrick Stolee

On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> [...]
> The tests are ordered in this way because if I swap the test order the
> tag test will succeed instead of fail. I believe this is because somehow
> we need the srv.bare repo to not have any tags when we clone, but then
> have tags in our next fetch.

This ordering requirement might deserve an in-code comment in the test
script itself.

More below...

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +       for i in I J K
> +       do
> +               test_commit -C src $i &&
> +               git -C src branch $i
> +       done &&

If test_commit() or git-branch fail, those failures will go unnoticed.
You can fix this by bailing from the loop, like this:

    for i in I J K
    do
        test_commit -C src $i &&
        git -C src branch $i || return 1
    done &&

Same comment applies to the other new test.

> +       git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +       git -C tag-test fetch --tags origin
> +'
> +
> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +       ls pack-test/.git/objects/pack/*pack >pack-list &&
> +       test_line_count = 2 pack-list &&
> +       for i in A B C
> +       do
> +               test_commit -C src $i &&
> +               git -C src branch $i
> +       done &&
> +       git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +       git -C pack-test fetch origin &&
> +       ls pack-test/.git/objects/pack/*pack >pack-list &&
> +       test_line_count = 3 pack-list
> +'

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 18:38   ` Eric Sunshine
@ 2020-02-19 20:42     ` Derrick Stolee
  0 siblings, 0 replies; 15+ messages in thread
From: Derrick Stolee @ 2020-02-19 20:42 UTC (permalink / raw)
  To: Eric Sunshine, Derrick Stolee via GitGitGadget
  Cc: Git List, Jeff King, Taylor Blau, Jonathan Tan, Jonathan Nieder,
	Derrick Stolee

On 2/19/2020 1:38 PM, Eric Sunshine wrote:
> On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> [...]
>> The tests are ordered in this way because if I swap the test order the
>> tag test will succeed instead of fail. I believe this is because somehow
>> we need the srv.bare repo to not have any tags when we clone, but then
>> have tags in our next fetch.
> 
> This ordering requirement might deserve an in-code comment in the test
> script itself.

Can do.

> More below...
> 
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
>> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> +       for i in I J K
>> +       do
>> +               test_commit -C src $i &&
>> +               git -C src branch $i
>> +       done &&
> 
> If test_commit() or git-branch fail, those failures will go unnoticed.
> You can fix this by bailing from the loop, like this:
> 
>     for i in I J K
>     do
>         test_commit -C src $i &&
>         git -C src branch $i || return 1
>     done &&
> 
> Same comment applies to the other new test.

Thanks!

-Stolee

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
  2020-02-19 18:38   ` Eric Sunshine
@ 2020-02-19 20:52   ` Junio C Hamano
  2020-02-19 20:59     ` Eric Sunshine
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2020-02-19 20:52 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, peff, me, jonathantanmy, jrnieder, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index fea56cda6d3..ed2ef45c37a 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>  	grep "want $(cat hash)" trace
>  '
>  
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +	for i in I J K
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +	git -C tag-test fetch --tags origin
> +'

Is this about an ultra-recent regresssion?  When applied directly on
top of v2.25.0, this one seems to pass already without any change.

> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 2 pack-list &&
> +	for i in A B C
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +	git -C pack-test fetch origin &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 3 pack-list
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 20:52   ` Junio C Hamano
@ 2020-02-19 20:59     ` Eric Sunshine
  2020-02-19 21:17       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Sunshine @ 2020-02-19 20:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, Git List, Jeff King, Taylor Blau,
	Jonathan Tan, Jonathan Nieder, Derrick Stolee

On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <gitster@pobox.com> wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> > +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> > +     for i in I J K
> > +     do
> > +             test_commit -C src $i &&
> > +             git -C src branch $i
> > +     done &&
> > +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> > +     git -C tag-test fetch --tags origin
> > +'
>
> Is this about an ultra-recent regresssion?  When applied directly on
> top of v2.25.0, this one seems to pass already without any change.

True, although both fail when applied atop "master".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] Document two partial clone bugs, fix one
  2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
  2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
  2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
@ 2020-02-19 21:10 ` Derrick Stolee
  2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
  3 siblings, 0 replies; 15+ messages in thread
From: Derrick Stolee @ 2020-02-19 21:10 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: peff, me, jonathantanmy, jrnieder, Derrick Stolee, Eric Sunshine

On 2/19/2020 11:21 AM, Derrick Stolee via GitGitGadget wrote:
> While playing with partial clone, I discovered a few bugs and document them
> with tests in patch 1. One seems to be a server-side bug that happens in a
> somewhat rare situation, but not terribly unlikely. The other is a
> client-side bug that leads to quadratic amounts of data transfer; I fix this
> bug in patch 2.

While I was able to demonstrate these bugs, after looking at my real-world
example with these fixes I found _yet another_ set of issues.

My "real world" test is the following: run "git init" and then populate
the config file with these contents:

[core]
        repositoryformatversion = 1
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        url = <url-that-supports-partial-clone>
        fetch = +refs/heads/*:refs/remotes/origin/*
        promisor = true
        partialclonefilter = blob:none
[branch "master"]
        remote = origin
        merge = refs/heads/master

Then run "git fetch origin".

First, we check if the repository contains a .gitmodule file, which
triggers a download of HEAD:.gitmodules. First HEAD, then its blob.
(This may be a bit of a red herring: I may have forgotten to delete
the contents of my refs/heads/master when setting up the test.)

#0  promisor_remote_get_direct (repo=repo@entry=0x555555a71680 <the_repo>, oids=oids@entry=0x7fffffffda70, oid_nr=oid_nr@entry=1) at promisor-remote.c:237
#1  0x000055555572bbfa in oid_object_info_extended (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, oi=oi@entry=0x7fffffffd8e0, flags=flags@entry=0) at sha1-file.c:1483
#2  0x000055555572bd7c in read_object (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, type=type@entry=0x7fffffffda64, size=size@entry=0x7fffffffda68) at sha1-file.c:1537
#3  0x000055555572be15 in read_object_file_extended (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, type=type@entry=0x7fffffffda64, size=size@entry=0x7fffffffda68, 
    lookup_replace=lookup_replace@entry=1) at sha1-file.c:1579
#4  0x000055555572c04c in repo_read_object_file (size=0x7fffffffda68, type=0x7fffffffda64, oid=0x7fffffffda70, r=0x555555a71680 <the_repo>) at object-store.h:192
#5  read_object_with_reference (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffdc20, required_type_name=<optimized out>, size=size@entry=0x7fffffffdae8, 
    actual_oid_return=actual_oid_return@entry=0x7fffffffdaf0) at sha1-file.c:1619
#6  0x0000555555755a53 in get_tree_entry (r=r@entry=0x555555a71680 <the_repo>, tree_oid=tree_oid@entry=0x7fffffffdc20, name=name@entry=0x5555557d5ffe ".gitmodules", oid=0x7fffffffdd40, 
    mode=mode@entry=0x7fffffffdcb0) at tree-walk.c:573
#7  0x000055555572f628 in get_oid_with_context_1 (repo=repo@entry=0x555555a71680 <the_repo>, name=name@entry=0x5555557d5ff9 "HEAD:.gitmodules", flags=flags@entry=0, prefix=prefix@entry=0x0, 
    oid=oid@entry=0x7fffffffdd40, oc=oc@entry=0x7fffffffdcb0) at sha1-name.c:1899
#8  0x000055555572fea3 in get_oid_with_context (oc=0x7fffffffdcb0, oid=0x7fffffffdd40, flags=0, str=0x5555557d5ff9 "HEAD:.gitmodules", repo=0x555555a71680 <the_repo>) at sha1-name.c:1946
#9  repo_get_oid (r=r@entry=0x555555a71680 <the_repo>, name=name@entry=0x5555557d5ff9 "HEAD:.gitmodules", oid=oid@entry=0x7fffffffdd40) at sha1-name.c:1602
#10 0x000055555573d962 in config_from_gitmodules (fn=fn@entry=0x55555573da30 <gitmodules_fetch_config>, repo=0x555555a71680 <the_repo>, data=data@entry=0x7fffffffdda0) at submodule-config.c:648
#11 0x000055555573ebed in config_from_gitmodules (data=0x7fffffffdda0, repo=<optimized out>, fn=0x55555573da30 <gitmodules_fetch_config>) at submodule-config.c:637
#12 fetch_config_from_gitmodules (max_children=max_children@entry=0x555555a32894 <submodule_fetch_jobs_config>, recurse_submodules=recurse_submodules@entry=0x555555a3288c <recurse_submodules>)
    at submodule-config.c:799
#13 0x00005555555a6920 in cmd_fetch (argc=2, argv=0x7fffffffe2e8, prefix=0x0) at builtin/fetch.c:1762
#14 0x0000555555570b9d in run_builtin (argv=<optimized out>, argc=<optimized out>, p=<optimized out>) at git.c:444
#15 handle_builtin (argc=<optimized out>, argv=<optimized out>) at git.c:674
#16 0x0000555555571d05 in run_argv (argv=0x7fffffffe030, argcp=0x7fffffffe03c) at git.c:741
#17 cmd_main (argc=<optimized out>, argv=<optimized out>) at git.c:872
#18 0x00005555555707c8 in main (argc=6, argv=0x7fffffffe2c8) at common-main.c:52

THEN, we start walking the refs to see if we have the objects
locally:

#0  promisor_remote_get_direct (repo=repo@entry=0x555555a71680 <the_repo>, oids=oids@entry=0x555555af7ed0, oid_nr=oid_nr@entry=1) at promisor-remote.c:237
#1  0x000055555572bbfa in oid_object_info_extended (r=0x555555a71680 <the_repo>, oid=<optimized out>, oi=0x555555a71940 <blank_oi>, oi@entry=0x0, flags=flags@entry=4) at sha1-file.c:1483
#2  0x000055555572c420 in repo_has_object_file_with_flags (flags=<optimized out>, oid=<optimized out>, r=<optimized out>) at sha1-file.c:1935
#3  repo_has_object_file (r=<optimized out>, oid=<optimized out>) at sha1-file.c:1942
#4  0x00005555556e682f in ref_resolves_to_object (refname=0x555555af7f40 "refs/remotes/origin/next", oid=<optimized out>, flags=<optimized out>) at refs.c:261
#5  0x00005555556eb059 in files_ref_iterator_advance (ref_iterator=0x555555b27ed0) at refs/files-backend.c:754
#6  0x00005555556f1082 in ref_iterator_advance (ref_iterator=0x555555b27ed0) at refs/iterator.c:13
#7  do_for_each_repo_ref_iterator (r=0x555555a71680 <the_repo>, iter=0x555555b27ed0, fn=fn@entry=0x5555556e5930 <do_for_each_ref_helper>, cb_data=cb_data@entry=0x7fffffffdc80) at refs/iterator.c:417
#8  0x00005555556e7b79 in do_for_each_ref (refs=<optimized out>, prefix=prefix@entry=0x5555557cb095 "", fn=fn@entry=0x5555555a3cb0 <add_one_refname>, trim=trim@entry=0, flags=flags@entry=0, 
    cb_data=cb_data@entry=0x55555572bbfa <oid_object_info_extended+778>) at refs.c:1566
#9  0x00005555556e8918 in refs_for_each_ref (cb_data=0x55555572bbfa <oid_object_info_extended+778>, fn=0x5555555a3cb0 <add_one_refname>, refs=<optimized out>) at refs.c:1572
#10 for_each_ref (fn=fn@entry=0x5555555a3cb0 <add_one_refname>, cb_data=cb_data@entry=0x7fffffffdd40) at refs.c:1577
#11 0x00005555555a46d6 in find_non_local_tags (refs=0x970e93150326b500, refs@entry=0x555555b2ae40, head=head@entry=0x7fffffffde20, tail=tail@entry=0x7fffffffde28) at builtin/fetch.c:344
#12 0x00005555555a7a87 in get_ref_map (rs=0x7fffffffde90, rs=0x7fffffffde90, autotags=<synthetic pointer>, tags=<optimized out>, remote_refs=0x555555b2ae40, remote=<optimized out>) at builtin/fetch.c:523
#13 do_fetch (rs=0x7fffffffde90, transport=<optimized out>) at builtin/fetch.c:1367
#14 fetch_one (prune_tags_ok=<optimized out>, argv=<optimized out>, argc=<optimized out>, remote=<optimized out>) at builtin/fetch.c:1738
#15 cmd_fetch (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin/fetch.c:1827
#16 0x0000555555570b9d in run_builtin (argv=<optimized out>, argc=<optimized out>, p=<optimized out>) at git.c:444
#17 handle_builtin (argc=<optimized out>, argv=<optimized out>) at git.c:674
#18 0x0000555555571d05 in run_argv (argv=0x7fffffffe050, argcp=0x7fffffffe05c) at git.c:741
#19 cmd_main (argc=<optimized out>, argv=<optimized out>) at git.c:872
#20 0x00005555555707c8 in main (argc=5, argv=0x7fffffffe2e8) at common-main.c:52

This is running through has_object_file(), but the worst part is that switching
it to has_object_file_with_flags(oid, OBJECT_INFO_SKIP_FETCH_OBJECT) will cause
warnings when we do not have the object. Something else must be done here.

Since this is more complicated to fix, I'm going to set this part aside
for now. I may come back with a test case that demonstrates the problem.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 20:59     ` Eric Sunshine
@ 2020-02-19 21:17       ` Junio C Hamano
  2020-02-19 21:20         ` Derrick Stolee
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2020-02-19 21:17 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Derrick Stolee via GitGitGadget, Git List, Jeff King, Taylor Blau,
	Jonathan Tan, Jonathan Nieder, Derrick Stolee

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <gitster@pobox.com> wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> > +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> > +     for i in I J K
>> > +     do
>> > +             test_commit -C src $i &&
>> > +             git -C src branch $i
>> > +     done &&
>> > +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>> > +     git -C tag-test fetch --tags origin
>> > +'
>>
>> Is this about an ultra-recent regresssion?  When applied directly on
>> top of v2.25.0, this one seems to pass already without any change.
>
> True, although both fail when applied atop "master".

I flipped the first one (i.e. test #24) to expect success and run
bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
2020-02-12) and the tip of 'master'.

Interesting that bisecting it points at 684ceae3 ("fetch: default to
protocol version 2", 2019-12-23).

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 9a9178fd28..099406c2f1 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -384,6 +384,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_success 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.25.1-440-g39558b81cc


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-19 21:17       ` Junio C Hamano
@ 2020-02-19 21:20         ` Derrick Stolee
  0 siblings, 0 replies; 15+ messages in thread
From: Derrick Stolee @ 2020-02-19 21:20 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine
  Cc: Derrick Stolee via GitGitGadget, Git List, Jeff King, Taylor Blau,
	Jonathan Tan, Jonathan Nieder, Derrick Stolee

On 2/19/2020 4:17 PM, Junio C Hamano wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> 
>> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <gitster@pobox.com> wrote:
>>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>>>> +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>>>> +     for i in I J K
>>>> +     do
>>>> +             test_commit -C src $i &&
>>>> +             git -C src branch $i
>>>> +     done &&
>>>> +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>>>> +     git -C tag-test fetch --tags origin
>>>> +'
>>>
>>> Is this about an ultra-recent regresssion?  When applied directly on
>>> top of v2.25.0, this one seems to pass already without any change.
>>
>> True, although both fail when applied atop "master".
> 
> I flipped the first one (i.e. test #24) to expect success and run
> bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
> 2020-02-12) and the tip of 'master'.
> 
> Interesting that bisecting it points at 684ceae3 ("fetch: default to
> protocol version 2", 2019-12-23).

Thanks for tracking this down. I had originally been working on
top of master, but then rebased onto v2.25.0 to test this on our
VFS for Git/Scalar fork [1]. I have since noticed that the test
passes in that case.

Thanks,
-Stolee

[1] https://github.com/microsoft/git/pull/247

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 0/2] Document two partial clone bugs, fix one
  2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-02-19 21:10 ` [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee
@ 2020-02-21 21:47 ` Derrick Stolee via GitGitGadget
  2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
                     ` (2 more replies)
  3 siblings, 3 replies; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-21 21:47 UTC (permalink / raw)
  To: git; +Cc: peff, me, jonathantanmy, jrnieder, sunshine, Derrick Stolee

While playing with partial clone, I discovered a few bugs and document them
with tests in patch 1. One seems to be a server-side bug that happens in a
somewhat rare situation, but not terribly unlikely. The other is a
client-side bug that leads to quadratic amounts of data transfer; I fix this
bug in patch 2.

UPDATES in V2:

 * Added "|| return 1" inside the for loops.
   
   
 * Added an in-test comment about the test ordering.
   
   
 * Required protocol.version=2 in the tags test due to the bisect Junio
   performed.
   
   
 * Updated the commit message via Jonathan Tan's suggestion.
   
   

You can ignore the stack traces I sent earlier, as those seem to be from
states I cannot get into without being destructive to my .git directory.

Thanks, -Stolee

Derrick Stolee (2):
  partial-clone: demonstrate bugs in partial fetch
  partial-clone: avoid fetching when looking for objects

 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh | 31 +++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 5 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-556%2Fderrickstolee%2Fpartial-clone-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-556/derrickstolee/partial-clone-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/556

Range-diff vs v1:

 1:  dbc1bdcae16 ! 1:  04965a8c7a4 partial-clone: demonstrate bugs in partial fetch
     @@ -14,7 +14,8 @@
          In the first test, we find that when fetching with blob filters from
          a repository that previously did not have any tags, the 'git fetch
          --tags origin' command fails because the server sends "multiple
     -    filter-specs cannot be combined".
     +    filter-specs cannot be combined". This only happens when using
     +    protocol v2.
      
          In the second test, we see that a 'git fetch origin' request with
          several ref updates results in multiple pack-file downloads. This must
     @@ -41,15 +42,20 @@
       	grep "want $(cat hash)" trace
       '
       
     ++# The following two tests must be in this order, or else
     ++# the first will not fail. It is important that the srv.bare
     ++# repository did not have tags during clone, but has tags
     ++# in the fetch.
     ++
      +test_expect_failure 'verify fetch succeeds when asking for new tags' '
      +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
      +	for i in I J K
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
     -+	git -C tag-test fetch --tags origin
     ++	git -C tag-test -c protocol.version=2 fetch --tags origin
      +'
      +
      +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
     @@ -59,7 +65,7 @@
      +	for i in A B C
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
      +	git -C pack-test fetch origin &&
 2:  937a882261d ! 2:  7c4c9f0f8e1 partial-clone: avoid fetching when looking for objects
     @@ -2,10 +2,13 @@
      
          partial-clone: avoid fetching when looking for objects
      
     -    When using partial-clone, do_oid_object_info_extended() can trigger a
     -    fetch for missing objects. This can be extremely expensive when asking
     -    for a tag or commit, as we are completely removed from the context of
     -    the missing object and thus supply no "haves" in the request.
     +    When using partial clone, find_non_local_tags() in builtin/fetch.c
     +    checks each remote tag to see if its object also exists locally. There
     +    is no expectation that the object exist locally, but this function
     +    nevertheless triggers a lazy fetch if the object does not exist. This
     +    can be extremely expensive when asking for a commit, as we are
     +    completely removed from the context of the non-existent object and
     +    thus supply no "haves" in the request.
      
          6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
          global variable that prevented these fetches in favor of a bitflag.
     @@ -68,7 +71,7 @@
       --- a/t/t5616-partial-clone.sh
       +++ b/t/t5616-partial-clone.sh
      @@
     - 	git -C tag-test fetch --tags origin
     + 	git -C tag-test -c protocol.version=2 fetch --tags origin
       '
       
      -test_expect_failure 'verify fetch downloads only one pack when updating refs' '

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch
  2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
@ 2020-02-21 21:47   ` Derrick Stolee via GitGitGadget
  2020-02-21 21:47   ` [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
  2020-02-22 17:25   ` [PATCH v2 0/2] Document two partial clone bugs, fix one Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-21 21:47 UTC (permalink / raw)
  To: git
  Cc: peff, me, jonathantanmy, jrnieder, sunshine, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

While testing partial clone, I noticed some odd behavior. I was testing
a way of running 'git init', followed by manually configuring the remote
for partial clone, and then running 'git fetch'. Astonishingly, I saw
the 'git fetch' process start asking the server for multiple rounds of
pack-file downloads! When tweaking the situation a little more, I
discovered that I could cause the remote to hang up with an error.

Add two tests that demonstrate these two issues.

In the first test, we find that when fetching with blob filters from
a repository that previously did not have any tags, the 'git fetch
--tags origin' command fails because the server sends "multiple
filter-specs cannot be combined". This only happens when using
protocol v2.

In the second test, we see that a 'git fetch origin' request with
several ref updates results in multiple pack-file downloads. This must
be due to Git trying to fault-in the objects pointed by the refs. What
makes this matter particularly nasty is that this goes through the
do_oid_object_info_extended() method, so there are no "haves" in the
negotiation. This leads the remote to send every reachable commit and
tree from each new ref, providing a quadratic amount of data transfer!
This test is fixed if we revert 6462d5eb9a (fetch: remove
fetch_if_missing=0, 2019-11-05), but that revert causes other test
failures. The real fix will need more care.

The tests are ordered in this way because if I swap the test order the
tag test will succeed instead of fail. I believe this is because somehow
we need the srv.bare repo to not have any tags when we clone, but then
have tags in our next fetch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t5616-partial-clone.sh | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index fea56cda6d3..022f1018ace 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -374,6 +374,37 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+# The following two tests must be in this order, or else
+# the first will not fail. It is important that the srv.bare
+# repository did not have tags during clone, but has tags
+# in the fetch.
+
+test_expect_failure 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i || return 1
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test -c protocol.version=2 fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i || return 1
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects
  2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
  2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
@ 2020-02-21 21:47   ` Derrick Stolee via GitGitGadget
  2020-02-22 17:25   ` [PATCH v2 0/2] Document two partial clone bugs, fix one Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-02-21 21:47 UTC (permalink / raw)
  To: git
  Cc: peff, me, jonathantanmy, jrnieder, sunshine, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When using partial clone, find_non_local_tags() in builtin/fetch.c
checks each remote tag to see if its object also exists locally. There
is no expectation that the object exist locally, but this function
nevertheless triggers a lazy fetch if the object does not exist. This
can be extremely expensive when asking for a commit, as we are
completely removed from the context of the non-existent object and
thus supply no "haves" in the request.

6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
global variable that prevented these fetches in favor of a bitflag.
However, some object existence checks were not updated to use this flag.

Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
repreparing the pack-file structures. We need to be extremely careful
about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
due to updated refs.

This resolves a broken test in t5616-partial-clone.sh.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b4c6d921d06..fd69c4f69d7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -335,6 +335,7 @@ static void find_non_local_tags(const struct ref *refs,
 	struct string_list_item *remote_ref_item;
 	const struct ref *ref;
 	struct refname_hash_entry *item = NULL;
+	const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
 
 	refname_hash_init(&existing_refs);
 	refname_hash_init(&remote_refs);
@@ -353,10 +354,9 @@ static void find_non_local_tags(const struct ref *refs,
 		 */
 		if (ends_with(ref->name, "^{}")) {
 			if (item &&
-			    !has_object_file_with_flags(&ref->old_oid,
-							OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&ref->old_oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &ref->old_oid) &&
-			    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&item->oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &item->oid))
 				clear_item(item);
 			item = NULL;
@@ -370,7 +370,7 @@ static void find_non_local_tags(const struct ref *refs,
 		 * fetch.
 		 */
 		if (item &&
-		    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+		    !has_object_file_with_flags(&item->oid, quick_flags) &&
 		    !oidset_contains(&fetch_oids, &item->oid))
 			clear_item(item);
 
@@ -391,7 +391,7 @@ static void find_non_local_tags(const struct ref *refs,
 	 * checked to see if it needs fetching.
 	 */
 	if (item &&
-	    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+	    !has_object_file_with_flags(&item->oid, quick_flags) &&
 	    !oidset_contains(&fetch_oids, &item->oid))
 		clear_item(item);
 
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 022f1018ace..da3ad4e32a1 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -390,7 +390,7 @@ test_expect_failure 'verify fetch succeeds when asking for new tags' '
 	git -C tag-test -c protocol.version=2 fetch --tags origin
 '
 
-test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+test_expect_success 'verify fetch downloads only one pack when updating refs' '
 	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
 	ls pack-test/.git/objects/pack/*pack >pack-list &&
 	test_line_count = 2 pack-list &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/2] Document two partial clone bugs, fix one
  2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
  2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
  2020-02-21 21:47   ` [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
@ 2020-02-22 17:25   ` Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2020-02-22 17:25 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, peff, me, jonathantanmy, jrnieder, sunshine, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> While playing with partial clone, I discovered a few bugs and document them
> with tests in patch 1. One seems to be a server-side bug that happens in a
> somewhat rare situation, but not terribly unlikely. The other is a
> client-side bug that leads to quadratic amounts of data transfer; I fix this
> bug in patch 2.
>
> UPDATES in V2:
>
>  * Added "|| return 1" inside the for loops.
>    
>    
>  * Added an in-test comment about the test ordering.
>    
>    
>  * Required protocol.version=2 in the tags test due to the bisect Junio
>    performed.
>    
>    
>  * Updated the commit message via Jonathan Tan's suggestion.
>    

Now this can safely be queued directly on v2.25.0, I'll
rebase it (earlyer I queued it after the merge to make protocol v2
the default).

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-02-22 17:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
2020-02-19 18:38   ` Eric Sunshine
2020-02-19 20:42     ` Derrick Stolee
2020-02-19 20:52   ` Junio C Hamano
2020-02-19 20:59     ` Eric Sunshine
2020-02-19 21:17       ` Junio C Hamano
2020-02-19 21:20         ` Derrick Stolee
2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-19 18:10   ` Jonathan Tan
2020-02-19 21:10 ` [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee
2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
2020-02-21 21:47   ` [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-22 17:25   ` [PATCH v2 0/2] Document two partial clone bugs, fix one Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).