git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] connected.c: reprepare packs for corner cases
@ 2020-03-12 15:36 Derrick Stolee via GitGitGadget
  2020-03-12 16:39 ` Jonathan Tan
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-03-12 15:36 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, me, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

While updating the microsoft/git fork on top of v2.26.0-rc0 and
consuming that build into Scalar, I noticed a corner case bug around
partial clone.

The "scalar clone" command can create a Git repository with the
proper config for using partial clone with the "blob:none" filter.
Instead of calling "git clone", it runs "git init" then sets a few
more config values before running "git fetch".

In our builds on v2.26.0-rc0, we noticed that our "git fetch"
command was failing with

  error: https://github.com/microsoft/scalar did not send all necessary objects

This does not happen if you copy the config file from a repository
created by "git clone --filter=blob:none <url>", but it does happen
when adding the config option "core.logAllRefUpdates = true".

By debugging, I was able to see that the loop inside
check_connnected() that checks if all refs are contained in
promisor packs actually did not have any packfiles in the packed_git
list.

I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
spot during the fetch operation. Even worse, I have failed to create
a test case to prevent a regression.

Placing a reprepare_packed_git() call inside chck_connected() before
looping through the packed_git list seems like the safest way to
avoid this issue in the future. While reprepare_packed_git() does
another scan of the pack directory, it is not terribly expensive as
long as we do not run it in a loop. We check connectivity only a
few times per command, so this will not have a meaningful performance
impact. In exchange, we get extra safety around this check.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
    connected.c: reprepare packs for corner cases
    
    I included how I found this (integrating v2.26.0-rc0 into Scalar), but I
    am able to reproduce it on my Linux machine using real fetches from
    github.com. I'm not sure why I was unable to reproduce the issue in test
    cases using the file:// URLs or the HTTP tests.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-579%2Fderrickstolee%2Ffetch-reprepare-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-579/derrickstolee/fetch-reprepare-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/579

 connected.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/connected.c b/connected.c
index 7e9bd1bc622..ac52b07b474 100644
--- a/connected.c
+++ b/connected.c
@@ -61,7 +61,11 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		 * object is a promisor object. Instead, just make sure we
 		 * received, in a promisor packfile, the objects pointed to by
 		 * each wanted ref.
+		 *
+		 * Before checking for promisor packs, be sure we have the
+		 * latest pack-files loaded into memory.
 		 */
+		reprepare_packed_git(the_repository);
 		do {
 			struct packed_git *p;
 

base-commit: b4374e96c84ed9394fed363973eb540da308ed4f
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 15:36 [PATCH] connected.c: reprepare packs for corner cases Derrick Stolee via GitGitGadget
@ 2020-03-12 16:39 ` Jonathan Tan
  2020-03-12 17:34   ` Derrick Stolee
  2020-03-12 20:42 ` Junio C Hamano
  2020-03-13 21:11 ` [PATCH v2] " Derrick Stolee via GitGitGadget
  2 siblings, 1 reply; 12+ messages in thread
From: Jonathan Tan @ 2020-03-12 16:39 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, jonathantanmy, me, dstolee

> Placing a reprepare_packed_git() call inside chck_connected() before
> looping through the packed_git list seems like the safest way to
> avoid this issue in the future. While reprepare_packed_git() does
> another scan of the pack directory, it is not terribly expensive as
> long as we do not run it in a loop. We check connectivity only a
> few times per command, so this will not have a meaningful performance
> impact. In exchange, we get extra safety around this check.

This also means that check_connected() now does the equivalent of
reprepare_packed_git() in both its branches (the rev-list one, which
spawns a new process and thus rereads the pack directory, and the fast
one). This will at least help callers to reason about its behavior more
simply, so it sounds like a good change to me.

>     I included how I found this (integrating v2.26.0-rc0 into Scalar), but I
>     am able to reproduce it on my Linux machine using real fetches from
>     github.com. I'm not sure why I was unable to reproduce the issue in test
>     cases using the file:// URLs or the HTTP tests.

If you remember how to reproduce it using real fetches from github.com,
could you give us reproduction steps?

Thanks for attempting to reproduce it in a test script.

> diff --git a/connected.c b/connected.c
> index 7e9bd1bc622..ac52b07b474 100644
> --- a/connected.c
> +++ b/connected.c
> @@ -61,7 +61,11 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  		 * object is a promisor object. Instead, just make sure we
>  		 * received, in a promisor packfile, the objects pointed to by
>  		 * each wanted ref.
> +		 *
> +		 * Before checking for promisor packs, be sure we have the
> +		 * latest pack-files loaded into memory.
>  		 */
> +		reprepare_packed_git(the_repository);
>  		do {
>  			struct packed_git *p;

The code itself looks good to me.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 16:39 ` Jonathan Tan
@ 2020-03-12 17:34   ` Derrick Stolee
  0 siblings, 0 replies; 12+ messages in thread
From: Derrick Stolee @ 2020-03-12 17:34 UTC (permalink / raw)
  To: Jonathan Tan, gitgitgadget; +Cc: git, me, dstolee

On 3/12/2020 12:39 PM, Jonathan Tan wrote:
>> Placing a reprepare_packed_git() call inside chck_connected() before
>> looping through the packed_git list seems like the safest way to
>> avoid this issue in the future. While reprepare_packed_git() does
>> another scan of the pack directory, it is not terribly expensive as
>> long as we do not run it in a loop. We check connectivity only a
>> few times per command, so this will not have a meaningful performance
>> impact. In exchange, we get extra safety around this check.
> 
> This also means that check_connected() now does the equivalent of
> reprepare_packed_git() in both its branches (the rev-list one, which
> spawns a new process and thus rereads the pack directory, and the fast
> one). This will at least help callers to reason about its behavior more
> simply, so it sounds like a good change to me.
> 
>>     I included how I found this (integrating v2.26.0-rc0 into Scalar), but I
>>     am able to reproduce it on my Linux machine using real fetches from
>>     github.com. I'm not sure why I was unable to reproduce the issue in test
>>     cases using the file:// URLs or the HTTP tests.
> 
> If you remember how to reproduce it using real fetches from github.com,
> could you give us reproduction steps?

Sure. Run `git init`, then replace the `.git/config` file with this:

[core]
        repositoryformatversion = 1
        filemode = false
        bare = false
        logallrefupdates = true
        symlinks = false
        ignorecase = true
        repositoryFormat = 1
[protocol]
        version = 1
[remote "origin"]
        url = https://github.com/derrickstolee/TreeSearch
        fetch = +refs/heads/*:refs/remotes/origin/*
        promisor = true
        partialclonefilter = blob:none
[branch "master"]
        remote = origin
        merge = refs/heads/master

then run "git fetch origin". I purposefully put a very small repo that
I have laying around, but this repros with git/git.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 15:36 [PATCH] connected.c: reprepare packs for corner cases Derrick Stolee via GitGitGadget
  2020-03-12 16:39 ` Jonathan Tan
@ 2020-03-12 20:42 ` Junio C Hamano
  2020-03-12 21:16   ` Jeff King
  2020-03-13 21:11 ` [PATCH v2] " Derrick Stolee via GitGitGadget
  2 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2020-03-12 20:42 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, jonathantanmy, me, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> While updating the microsoft/git fork on top of v2.26.0-rc0 and
> consuming that build into Scalar, I noticed a corner case bug around
> partial clone.
>
> The "scalar clone" command can create a Git repository with the
> proper config for using partial clone with the "blob:none" filter.
> Instead of calling "git clone", it runs "git init" then sets a few
> more config values before running "git fetch".
>
> In our builds on v2.26.0-rc0, we noticed that our "git fetch"
> command was failing with
>
>   error: https://github.com/microsoft/scalar did not send all necessary objects
>
> This does not happen if you copy the config file from a repository
> created by "git clone --filter=blob:none <url>", but it does happen
> when adding the config option "core.logAllRefUpdates = true".
>
> By debugging, I was able to see that the loop inside
> check_connnected() that checks if all refs are contained in
> promisor packs actually did not have any packfiles in the packed_git
> list.
> I'm not sure what corner-case issues caused this config option to
> prevent the reprepare_packed_git() from being called at the proper
> spot during the fetch operation. Even worse, I have failed to create
> a test case to prevent a regression.
>
> Placing a reprepare_packed_git() call inside chck_connected() before
> looping through the packed_git list seems like the safest way to
> avoid this issue in the future.

Hmmm.  I am not sure if I am convinced that check_connected() is the
best place to do this.  Do we know the place that adds a new pack to
the repository, yet forgets to add it to the packed-git list, that
caused the failure you were observing?  Doing this change, without
describing the answer to the question in the log message, makes it
smell rather like a random hack than a designed solution to me.

If lazy fetching of objects happen in multiple fetches before a
single check_connected() sweeps them to check for connectivity, then
perhaps the lazy fetching codepath needs to remember the fact that
it added a new pack that is still not known to the packed-git list
(or just add it immediately, without having to scan at all), and
check_connected() would need to rescan only when there is at least
one such new pack?  That way, you do not have to penalize normal
callers of check_connected() that do not use lazy fetches at all,
right?

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 20:42 ` Junio C Hamano
@ 2020-03-12 21:16   ` Jeff King
  2020-03-12 21:26     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2020-03-12 21:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, jonathantanmy, me,
	Derrick Stolee

On Thu, Mar 12, 2020 at 01:42:58PM -0700, Junio C Hamano wrote:

> > Placing a reprepare_packed_git() call inside chck_connected() before
> > looping through the packed_git list seems like the safest way to
> > avoid this issue in the future.
> 
> Hmmm.  I am not sure if I am convinced that check_connected() is the
> best place to do this.  Do we know the place that adds a new pack to
> the repository, yet forgets to add it to the packed-git list, that
> caused the failure you were observing?  Doing this change, without
> describing the answer to the question in the log message, makes it
> smell rather like a random hack than a designed solution to me.

Thanks, I was just writing a very similar message. :)

> If lazy fetching of objects happen in multiple fetches before a
> single check_connected() sweeps them to check for connectivity, then
> perhaps the lazy fetching codepath needs to remember the fact that
> it added a new pack that is still not known to the packed-git list
> (or just add it immediately, without having to scan at all), and
> check_connected() would need to rescan only when there is at least
> one such new pack?  That way, you do not have to penalize normal
> callers of check_connected() that do not use lazy fetches at all,
> right?

I share your concern that it's not the best place for this. In practice,
I do doubt that callers of check_connected() would notice, as it's a
pretty heavy-weight operation by itself. But I'd be concerned about
going the other way: we know that calling it in check_connected() fixes
_this_ problem, but we don't know if there's other code that would need
a similar fix.

I was able to reproduce easily from Stolee's instructions, but I did
notice one interesting thing: the problem occurs only with the http
protocol, not with git:// or ssh. The obvious difference between them is
that http code is mostly running in the remote-helper program and in a
spawned git-fetch-pack. But the check_connected() we hit is in the
parent git-fetch process. So I _suspect_ that there's some low-level
code which calls reprepare() that happens in-process in most cases, but
not for the http case.

If I instrument Git like this:

diff --git a/common-main.c b/common-main.c
index 71e21dd20a..bf6ffe42b7 100644
--- a/common-main.c
+++ b/common-main.c
@@ -49,6 +49,11 @@ int main(int argc, const char **argv)
 	trace2_cmd_start(argv);
 	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);
 
+	trace_printf("pid=%d, cmd=%s",
+		     (int)getpid(),
+		     !ends_with(argv[0], "git") ? argv[0] :
+		     argv[1] ? argv[1] : argv[0]);
+
 	result = cmd_main(argc, argv);
 
 	trace2_cmd_exit(result);
diff --git a/packfile.c b/packfile.c
index f4e752996d..832d020e6e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1004,6 +1004,8 @@ void reprepare_packed_git(struct repository *r)
 {
 	struct object_directory *odb;
 
+	trace_printf("repreparing in pid %d", (int)getpid());
+
 	obj_read_lock();
 	for (odb = r->objects->odb; odb; odb = odb->next)
 		odb_clear_loose_cache(odb);

then running over the git protocol gives me:

  17:13:52.642337 common-main.c:52        pid=882782, cmd=fetch
  17:13:52.642871 git.c:439               trace: built-in: git fetch origin
  17:13:52.725971 packfile.c:1007         repreparing in pid 882782
  remote: Enumerating objects: 49, done.
  17:13:52.793355 run-command.c:663       trace: run_command: git index-pack --stdin -v --fix-thin --promisor --pack_header=2,49
  remote: Total 49 (delta 0), reused 0 (delta 0), pack-reused 49
  17:13:52.796106 common-main.c:52        pid=882784, cmd=index-pack
  17:13:52.796655 git.c:439               trace: built-in: git index-pack --stdin -v --fix-thin --promisor --pack_header=2,49
  Receiving objects: 100% (49/49), 6.87 KiB | 6.87 MiB/s, done.
  Resolving deltas: 100% (17/17), done.
  17:13:52.811302 packfile.c:1007         repreparing in pid 882782
  From git://github.com/derrickstolee/TreeSearch
   * [new branch]      master     -> origin/master
  17:13:52.812955 run-command.c:1616      run_processes_parallel: preparing to run up to 1 tasks
  17:13:52.812986 run-command.c:1648      run_processes_parallel: done
  17:13:52.813034 run-command.c:663       trace: run_command: git gc --auto
  17:13:52.815435 common-main.c:52        pid=882788, cmd=gc
  17:13:52.815941 git.c:439               trace: built-in: git gc --auto

There's an early reprepare, but the one after index-pack finishes
resolving deltas is probably the interesting one. And it happens inside
the fetch process. But if we go back to http, I get:

  17:15:28.234866 common-main.c:52        pid=882836, cmd=fetch
  17:15:28.235403 git.c:439               trace: built-in: git fetch origin
  17:15:28.236329 run-command.c:663       trace: run_command: GIT_DIR=.git git-remote-https origin https://github.com/derrickstolee/TreeSearch
  17:15:28.244049 common-main.c:52        pid=882837, cmd=/home/peff/compile/git/git-remote-https
  17:15:28.382586 packfile.c:1007         repreparing in pid 882836
  17:15:28.382996 run-command.c:663       trace: run_command: git fetch-pack --stateless-rpc --stdin --lock-pack --include-tag --thin --from-promisor --filter=blob:none https://github.com/derrickstolee/TreeSearch/
  17:15:28.386284 common-main.c:52        pid=882839, cmd=fetch-pack
  17:15:28.386807 git.c:439               trace: built-in: git fetch-pack --stateless-rpc --stdin --lock-pack --include-tag --thin --from-promisor --filter=blob:none https://github.com/derrickstolee/TreeSearch/
  remote: Enumerating objects: 49, done.
  remote: Total 49 (delta 0), reused 0 (delta 0), pack-reused 49
  17:15:28.444799 run-command.c:663       trace: run_command: git index-pack --stdin -v --fix-thin --promisor --pack_header=2,49
  17:15:28.447441 common-main.c:52        pid=882841, cmd=index-pack
  17:15:28.447944 git.c:439               trace: built-in: git index-pack --stdin -v --fix-thin --promisor --pack_header=2,49
  Receiving objects: 100% (49/49), 6.87 KiB | 3.44 MiB/s, done.
  Resolving deltas: 100% (17/17), done.
  17:15:28.461871 packfile.c:1007         repreparing in pid 882839
  error: https://github.com/derrickstolee/TreeSearch did not send all necessary objects
  
  17:15:28.463446 run-command.c:663       trace: run_command: git gc --auto
  17:15:28.463949 common-main.c:52        pid=882845, cmd=gc
  17:15:28.464056 git.c:439               trace: built-in: git gc --auto

There we see that same reprepare happen in 882839, which is the child
fetch-pack. The parent fetch probably needs to reprepare itself after
fetch-pack completes.

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 21:16   ` Jeff King
@ 2020-03-12 21:26     ` Jeff King
  2020-03-13  0:54       ` Derrick Stolee
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2020-03-12 21:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, jonathantanmy, me,
	Derrick Stolee

On Thu, Mar 12, 2020 at 05:16:38PM -0400, Jeff King wrote:

> There we see that same reprepare happen in 882839, which is the child
> fetch-pack. The parent fetch probably needs to reprepare itself after
> fetch-pack completes.

Actually, it's not fetch which is running fetch-pack, but rather the
remote helper itself. So I think the simplest thing is for the
remote-helper layer to do something like this:

diff --git a/transport-helper.c b/transport-helper.c
index 20a7185ec4..25957e9a05 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -14,6 +14,7 @@
 #include "refspec.h"
 #include "transport-internal.h"
 #include "protocol.h"
+#include "packfile.h"
 
 static int debug;
 
@@ -672,6 +673,7 @@ static int fetch(struct transport *transport,
 {
 	struct helper_data *data = transport->data;
 	int i, count;
+	int ret;
 
 	get_helper(transport);
 
@@ -710,13 +712,18 @@ static int fetch(struct transport *transport,
 	if (data->transport_options.negotiation_tips)
 		warning("Ignoring --negotiation-tip because the protocol does not support it.");
 
-	if (data->fetch)
-		return fetch_with_fetch(transport, nr_heads, to_fetch);
+	ret = data->fetch ? fetch_with_fetch(transport, nr_heads, to_fetch) :
+	      data->import ? fetch_with_import(transport, nr_heads, to_fetch) :
+	      -1;
 
-	if (data->import)
-		return fetch_with_import(transport, nr_heads, to_fetch);
+	/*
+	 * We may have just received a pack through the helper sub-process;
+	 * refresh the pack list.
+	 */
+	if (!ret)
+		reprepare_packed_git(the_repository);
 
-	return -1;
+	return ret;
 }
 
 static int push_update_ref_status(struct strbuf *buf,

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-12 21:26     ` Jeff King
@ 2020-03-13  0:54       ` Derrick Stolee
  2020-03-13  1:14         ` Junio C Hamano
  2020-03-13  2:30         ` Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Derrick Stolee @ 2020-03-13  0:54 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, jonathantanmy, me,
	Derrick Stolee, git

On 3/12/2020 5:26 PM, Jeff King wrote:
> On Thu, Mar 12, 2020 at 05:16:38PM -0400, Jeff King wrote:
> 
>> There we see that same reprepare happen in 882839, which is the child
>> fetch-pack. The parent fetch probably needs to reprepare itself after
>> fetch-pack completes.

I agree with you and Junio that where I put the reprepare was non-
optimal. The initial reason to put it there was that I found where
the error was happening, and thought that placing the reprepare there
was the best way to prevent this error from popping up in another case.

The result of a fetch failing and saying the remote did something wrong
is quite alarming, and I wanted to avoid that from happening again in
the future from some other path.

But you're right, it's better to be as correct as possible.

> Actually, it's not fetch which is running fetch-pack, but rather the
> remote helper itself. So I think the simplest thing is for the
> remote-helper layer to do something like this:

I appreciate your root-causing this into the multi-process nature of
fetch. I will update the commit message to include your details,
especially about how it does not reproduce over file or ssh protocol.
 
> diff --git a/transport-helper.c b/transport-helper.c
> index 20a7185ec4..25957e9a05 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -14,6 +14,7 @@
>  #include "refspec.h"
>  #include "transport-internal.h"
>  #include "protocol.h"
> +#include "packfile.h"
>  
>  static int debug;
>  
> @@ -672,6 +673,7 @@ static int fetch(struct transport *transport,
>  {
>  	struct helper_data *data = transport->data;
>  	int i, count;
> +	int ret;
>  
>  	get_helper(transport);
>  
> @@ -710,13 +712,18 @@ static int fetch(struct transport *transport,
>  	if (data->transport_options.negotiation_tips)
>  		warning("Ignoring --negotiation-tip because the protocol does not support it.");
>  
> -	if (data->fetch)
> -		return fetch_with_fetch(transport, nr_heads, to_fetch);
> +	ret = data->fetch ? fetch_with_fetch(transport, nr_heads, to_fetch) :
> +	      data->import ? fetch_with_import(transport, nr_heads, to_fetch) :
> +	      -1;
>  
> -	if (data->import)
> -		return fetch_with_import(transport, nr_heads, to_fetch);
> +	/*
> +	 * We may have just received a pack through the helper sub-process;
> +	 * refresh the pack list.
> +	 */
> +	if (!ret)
> +		reprepare_packed_git(the_repository);
>  
> -	return -1;
> +	return ret;
>  }

This code looks correct, and should be the fix for the short-term.

I wonder if we could do something more complicated in the long-term,
which was recommended to me by Jeff Hostetler: add the pack to the
packed_git list once we've indexed it. That way, we don't reprepare
and scan the packs one-by-one, but instead we insert to the list
a single pack that we already know about.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-13  0:54       ` Derrick Stolee
@ 2020-03-13  1:14         ` Junio C Hamano
  2020-03-13  2:30         ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2020-03-13  1:14 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Jeff King, Derrick Stolee via GitGitGadget, git, jonathantanmy,
	me, Derrick Stolee, git

Derrick Stolee <stolee@gmail.com> writes:

> I wonder if we could do something more complicated in the long-term,
> which was recommended to me by Jeff Hostetler: add the pack to the
> packed_git list once we've indexed it. That way, we don't reprepare
> and scan the packs one-by-one, but instead we insert to the list
> a single pack that we already know about.

Yup, that was the first thing that came to my mind, i.e. the code
that installs a new packfile on disk in our repository must be ours,
so we should be able to tell that code to add the new packfile to
the in-core list without having to scan.

We may still need to scan when that strategy "fails" to protect
against racing with simultaneous repacking, though.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-13  0:54       ` Derrick Stolee
  2020-03-13  1:14         ` Junio C Hamano
@ 2020-03-13  2:30         ` Jeff King
  2020-03-13  2:34           ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2020-03-13  2:30 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Junio C Hamano, Derrick Stolee via GitGitGadget, git,
	jonathantanmy, me, Derrick Stolee, git

On Thu, Mar 12, 2020 at 08:54:34PM -0400, Derrick Stolee wrote:

> On 3/12/2020 5:26 PM, Jeff King wrote:
> > On Thu, Mar 12, 2020 at 05:16:38PM -0400, Jeff King wrote:
> > 
> >> There we see that same reprepare happen in 882839, which is the child
> >> fetch-pack. The parent fetch probably needs to reprepare itself after
> >> fetch-pack completes.
> 
> I agree with you and Junio that where I put the reprepare was non-
> optimal. The initial reason to put it there was that I found where
> the error was happening, and thought that placing the reprepare there
> was the best way to prevent this error from popping up in another case.

To be fair to you, the place you put it is almost certainly fine in
practice. The connectivity check is likely the first time we'll actually
look at the objects in the parent process, and once we've re-scanned,
all the code after us is protected.

But I do still think it's cleaner to put it closer to the responsible
code.

Although there was one thing that puzzled me when writing the previous
email. Why isn't this a problem for non-partial clones? And the answer
is that in a normal repo, as soon as we try to look up a missing object,
we'd trigger the usual re-scan. But we _don't_ do object lookups in
check_connected(). We do this funky loop over get_all_packs() ourselves,
looking only at promisor packs. And if we didn't find the object, then
we immediately complain in that loop with no fallback to re-scan.

So that would actually argue that your patch is putting it in the right
place. It's _not_ fetch's responsibility to reprepare_packed_git(). It's
the loop in check_connected() that is skipping the usual reprepare
logic, and shouldn't.

And one fix (which you did) is to just preemptively reprepare right
above that loop. Some other solutions are:

  - teach the loop to reprepare() when we don't find an object and see
    if we picked up a new promisor pack

  - reverse the loop logic: look up each object in the usual way (in all
    packs), and see if the resulting pack is a promisor. I guess that
    produces slightly different results though if an object is in the
    new promisor pack _and_ available elsewhere; but isn't the point of
    that check_refs_are_promisor_objects_only flag that we're doing a
    clone?

> I wonder if we could do something more complicated in the long-term,
> which was recommended to me by Jeff Hostetler: add the pack to the
> packed_git list once we've indexed it. That way, we don't reprepare
> and scan the packs one-by-one, but instead we insert to the list
> a single pack that we already know about.

I don't think the parent git-fetch process even knows about the pack,
though. It just asked the remote-helper to somehow make the objects
arrive, and it doesn't know what form that took.

reprepare_packed_git() should be pretty cheap, though. If we already
have a pack in the list, we won't re-open it. Finding out if we already
had one used to be O(n), making the whole re-scan quadratic. But that
was fixed recently in ec48540fe8 (packfile.c: speed up loading lots of
packfiles, 2019-11-27), where we switched to keeping a hashmap of
loaded pack names.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-13  2:30         ` Jeff King
@ 2020-03-13  2:34           ` Jeff King
  2020-03-13 12:43             ` Derrick Stolee
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2020-03-13  2:34 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Junio C Hamano, Derrick Stolee via GitGitGadget, git,
	jonathantanmy, me, Derrick Stolee, git

On Thu, Mar 12, 2020 at 10:30:34PM -0400, Jeff King wrote:

> So that would actually argue that your patch is putting it in the right
> place. It's _not_ fetch's responsibility to reprepare_packed_git(). It's
> the loop in check_connected() that is skipping the usual reprepare
> logic, and shouldn't.
> 
> And one fix (which you did) is to just preemptively reprepare right
> above that loop. Some other solutions are:

I know I've now suggested that your patch is both wrong and right. :)

Just to be clear, at this point I think I'd be OK with either solution.

If it's going into check_connected(), the commit message should argue
that the loop there is at fault for not doing the usual fallback, and a
single explicit reprepare() is cheap enough to cover the case we care
about (and that we don't have to worry about racing with somebody else
repacking because the point of that flag is that we're in a brand new
repo).

Repreparing earlier in the transport-helper code _could_ still protect
against other similar loops, which is an argument for putting it there.
But I'd be inclined to say those other loops should be corrected.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] connected.c: reprepare packs for corner cases
  2020-03-13  2:34           ` Jeff King
@ 2020-03-13 12:43             ` Derrick Stolee
  0 siblings, 0 replies; 12+ messages in thread
From: Derrick Stolee @ 2020-03-13 12:43 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Derrick Stolee via GitGitGadget, git,
	jonathantanmy, me, Derrick Stolee, git

On 3/12/2020 10:34 PM, Jeff King wrote:
> On Thu, Mar 12, 2020 at 10:30:34PM -0400, Jeff King wrote:
> 
>> So that would actually argue that your patch is putting it in the right
>> place. It's _not_ fetch's responsibility to reprepare_packed_git(). It's
>> the loop in check_connected() that is skipping the usual reprepare
>> logic, and shouldn't.
>>
>> And one fix (which you did) is to just preemptively reprepare right
>> above that loop. Some other solutions are:
> 
> I know I've now suggested that your patch is both wrong and right. :)
> 
> Just to be clear, at this point I think I'd be OK with either solution.
> 
> If it's going into check_connected(), the commit message should argue
> that the loop there is at fault for not doing the usual fallback, and a
> single explicit reprepare() is cheap enough to cover the case we care
> about (and that we don't have to worry about racing with somebody else
> repacking because the point of that flag is that we're in a brand new
> repo).

Thank you for bringing extra clarity here. We don't need to put the
reprepare in the fetch logic because _most_ kinds of object lookups will
reprepare when failing to find an expected object. This loop is special
in that it is restricted to search the promisor packs. And, because it
only happens when a special mode (opt->check_refs_are_promisor_objects_only)
it is reasonable to assume that we are specifically in the case that a
pack was added recently.

We could save some work in most cases by repreparing only after failing
to find the object, but that also makes the code more complicated for
low value. At least, that is my opinion.
 
> Repreparing earlier in the transport-helper code _could_ still protect
> against other similar loops, which is an argument for putting it there.
> But I'd be inclined to say those other loops should be corrected.

Right. We should make it the responsibility of the code that is scanning
pack-files that they should be able to reprepare when failing to find
"expected" objects. This also handles the concurrent repack case that
Junio mentioned.

Taking a quick glance at the callers of get_all_packs(), I see this is
the only such loop that both has an ability to "fail" based on an
expected object and doesn't have a fallback to reprepare.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] connected.c: reprepare packs for corner cases
  2020-03-12 15:36 [PATCH] connected.c: reprepare packs for corner cases Derrick Stolee via GitGitGadget
  2020-03-12 16:39 ` Jonathan Tan
  2020-03-12 20:42 ` Junio C Hamano
@ 2020-03-13 21:11 ` Derrick Stolee via GitGitGadget
  2 siblings, 0 replies; 12+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-03-13 21:11 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, me, peff, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

While updating the microsoft/git fork on top of v2.26.0-rc0 and
consuming that build into Scalar, I noticed a corner case bug around
partial clone.

The "scalar clone" command can create a Git repository with the
proper config for using partial clone with the "blob:none" filter.
Instead of calling "git clone", it runs "git init" then sets a few
more config values before running "git fetch".

In our builds on v2.26.0-rc0, we noticed that our "git fetch"
command was failing with

  error: https://github.com/microsoft/scalar did not send all necessary objects

This does not happen if you copy the config file from a repository
created by "git clone --filter=blob:none <url>", but it does happen
when adding the config option "core.logAllRefUpdates = true".

By debugging, I was able to see that the loop inside
check_connnected() that checks if all refs are contained in
promisor packs actually did not have any packfiles in the packed_git
list.

I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
spot during the fetch operation. This approach requires a situation
where we use the remote helper process, which makes it difficult to
test.

It is possible to place a reprepare_packed_git() call in the fetch code
closer to where we receive a pack, but that leaves an opening for a
later change to re-introduce this problem. Further, a concurrent repack
operation could replace the pack-file list we already loaded into
memory, causing this issue in an even harder to reproduce scenario.

It is really the responsibility of anyone looping through the list of
pack-files for a certain object to fall back to reprepare_packed_git()
on a fail-to-find. The loop in check_connected() does not have this
fallback, leading to this bug.

We _could_ try looping through the packs and only reprepare the packs
after a miss, but that change is more involved and has little value.
Since this case is isolated to the case when
opt->check_refs_are_promisor_objects_only is true, we are confident that
we are verifying the refs after downloading new data. This implies that
calling reprepare_packed_git() in advance is not a huge cost compared to
the rest of the operations already made.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
    connected.c: reprepare packs for corner cases
    
    I included how I found this (integrating v2.26.0-rc0 into Scalar), but I
    am able to reproduce it on my Linux machine using real fetches from
    github.com. I'm not sure why I was unable to reproduce the issue in test
    cases using the file:// URLs or the HTTP tests.
    
    Update in V2: I've updated the commit message to discuss the options
    presented on-list, but also provide why I'm keeping the code unchanged
    in light of that discussion.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-579%2Fderrickstolee%2Ffetch-reprepare-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-579/derrickstolee/fetch-reprepare-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/579

Range-diff vs v1:

 1:  cde7aa20ca8 ! 1:  696a51bd5a0 connected.c: reprepare packs for corner cases
     @@ -27,17 +27,31 @@
      
          I'm not sure what corner-case issues caused this config option to
          prevent the reprepare_packed_git() from being called at the proper
     -    spot during the fetch operation. Even worse, I have failed to create
     -    a test case to prevent a regression.
     +    spot during the fetch operation. This approach requires a situation
     +    where we use the remote helper process, which makes it difficult to
     +    test.
      
     -    Placing a reprepare_packed_git() call inside chck_connected() before
     -    looping through the packed_git list seems like the safest way to
     -    avoid this issue in the future. While reprepare_packed_git() does
     -    another scan of the pack directory, it is not terribly expensive as
     -    long as we do not run it in a loop. We check connectivity only a
     -    few times per command, so this will not have a meaningful performance
     -    impact. In exchange, we get extra safety around this check.
     +    It is possible to place a reprepare_packed_git() call in the fetch code
     +    closer to where we receive a pack, but that leaves an opening for a
     +    later change to re-introduce this problem. Further, a concurrent repack
     +    operation could replace the pack-file list we already loaded into
     +    memory, causing this issue in an even harder to reproduce scenario.
      
     +    It is really the responsibility of anyone looping through the list of
     +    pack-files for a certain object to fall back to reprepare_packed_git()
     +    on a fail-to-find. The loop in check_connected() does not have this
     +    fallback, leading to this bug.
     +
     +    We _could_ try looping through the packs and only reprepare the packs
     +    after a miss, but that change is more involved and has little value.
     +    Since this case is isolated to the case when
     +    opt->check_refs_are_promisor_objects_only is true, we are confident that
     +    we are verifying the refs after downloading new data. This implies that
     +    calling reprepare_packed_git() in advance is not a huge cost compared to
     +    the rest of the operations already made.
     +
     +    Helped-by: Jeff King <peff@peff.net>
     +    Helped-by: Junio Hamano <gitster@pobox.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       diff --git a/connected.c b/connected.c


 connected.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/connected.c b/connected.c
index 7e9bd1bc622..ac52b07b474 100644
--- a/connected.c
+++ b/connected.c
@@ -61,7 +61,11 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		 * object is a promisor object. Instead, just make sure we
 		 * received, in a promisor packfile, the objects pointed to by
 		 * each wanted ref.
+		 *
+		 * Before checking for promisor packs, be sure we have the
+		 * latest pack-files loaded into memory.
 		 */
+		reprepare_packed_git(the_repository);
 		do {
 			struct packed_git *p;
 

base-commit: b4374e96c84ed9394fed363973eb540da308ed4f
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-03-13 21:12 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-12 15:36 [PATCH] connected.c: reprepare packs for corner cases Derrick Stolee via GitGitGadget
2020-03-12 16:39 ` Jonathan Tan
2020-03-12 17:34   ` Derrick Stolee
2020-03-12 20:42 ` Junio C Hamano
2020-03-12 21:16   ` Jeff King
2020-03-12 21:26     ` Jeff King
2020-03-13  0:54       ` Derrick Stolee
2020-03-13  1:14         ` Junio C Hamano
2020-03-13  2:30         ` Jeff King
2020-03-13  2:34           ` Jeff King
2020-03-13 12:43             ` Derrick Stolee
2020-03-13 21:11 ` [PATCH v2] " Derrick Stolee via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).