From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, me@ttaylorr.com, peff@peff.net,
Derrick Stolee <dstolee@microsoft.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v2] connected.c: reprepare packs for corner cases
Date: Fri, 13 Mar 2020 21:11:55 +0000 [thread overview]
Message-ID: <pull.579.v2.git.1584133915654.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.579.git.1584027403779.gitgitgadget@gmail.com>
From: Derrick Stolee <dstolee@microsoft.com>
While updating the microsoft/git fork on top of v2.26.0-rc0 and
consuming that build into Scalar, I noticed a corner case bug around
partial clone.
The "scalar clone" command can create a Git repository with the
proper config for using partial clone with the "blob:none" filter.
Instead of calling "git clone", it runs "git init" then sets a few
more config values before running "git fetch".
In our builds on v2.26.0-rc0, we noticed that our "git fetch"
command was failing with
error: https://github.com/microsoft/scalar did not send all necessary objects
This does not happen if you copy the config file from a repository
created by "git clone --filter=blob:none <url>", but it does happen
when adding the config option "core.logAllRefUpdates = true".
By debugging, I was able to see that the loop inside
check_connnected() that checks if all refs are contained in
promisor packs actually did not have any packfiles in the packed_git
list.
I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
spot during the fetch operation. This approach requires a situation
where we use the remote helper process, which makes it difficult to
test.
It is possible to place a reprepare_packed_git() call in the fetch code
closer to where we receive a pack, but that leaves an opening for a
later change to re-introduce this problem. Further, a concurrent repack
operation could replace the pack-file list we already loaded into
memory, causing this issue in an even harder to reproduce scenario.
It is really the responsibility of anyone looping through the list of
pack-files for a certain object to fall back to reprepare_packed_git()
on a fail-to-find. The loop in check_connected() does not have this
fallback, leading to this bug.
We _could_ try looping through the packs and only reprepare the packs
after a miss, but that change is more involved and has little value.
Since this case is isolated to the case when
opt->check_refs_are_promisor_objects_only is true, we are confident that
we are verifying the refs after downloading new data. This implies that
calling reprepare_packed_git() in advance is not a huge cost compared to
the rest of the operations already made.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
connected.c: reprepare packs for corner cases
I included how I found this (integrating v2.26.0-rc0 into Scalar), but I
am able to reproduce it on my Linux machine using real fetches from
github.com. I'm not sure why I was unable to reproduce the issue in test
cases using the file:// URLs or the HTTP tests.
Update in V2: I've updated the commit message to discuss the options
presented on-list, but also provide why I'm keeping the code unchanged
in light of that discussion.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-579%2Fderrickstolee%2Ffetch-reprepare-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-579/derrickstolee/fetch-reprepare-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/579
Range-diff vs v1:
1: cde7aa20ca8 ! 1: 696a51bd5a0 connected.c: reprepare packs for corner cases
@@ -27,17 +27,31 @@
I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
- spot during the fetch operation. Even worse, I have failed to create
- a test case to prevent a regression.
+ spot during the fetch operation. This approach requires a situation
+ where we use the remote helper process, which makes it difficult to
+ test.
- Placing a reprepare_packed_git() call inside chck_connected() before
- looping through the packed_git list seems like the safest way to
- avoid this issue in the future. While reprepare_packed_git() does
- another scan of the pack directory, it is not terribly expensive as
- long as we do not run it in a loop. We check connectivity only a
- few times per command, so this will not have a meaningful performance
- impact. In exchange, we get extra safety around this check.
+ It is possible to place a reprepare_packed_git() call in the fetch code
+ closer to where we receive a pack, but that leaves an opening for a
+ later change to re-introduce this problem. Further, a concurrent repack
+ operation could replace the pack-file list we already loaded into
+ memory, causing this issue in an even harder to reproduce scenario.
+ It is really the responsibility of anyone looping through the list of
+ pack-files for a certain object to fall back to reprepare_packed_git()
+ on a fail-to-find. The loop in check_connected() does not have this
+ fallback, leading to this bug.
+
+ We _could_ try looping through the packs and only reprepare the packs
+ after a miss, but that change is more involved and has little value.
+ Since this case is isolated to the case when
+ opt->check_refs_are_promisor_objects_only is true, we are confident that
+ we are verifying the refs after downloading new data. This implies that
+ calling reprepare_packed_git() in advance is not a huge cost compared to
+ the rest of the operations already made.
+
+ Helped-by: Jeff King <peff@peff.net>
+ Helped-by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
diff --git a/connected.c b/connected.c
connected.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/connected.c b/connected.c
index 7e9bd1bc622..ac52b07b474 100644
--- a/connected.c
+++ b/connected.c
@@ -61,7 +61,11 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
* object is a promisor object. Instead, just make sure we
* received, in a promisor packfile, the objects pointed to by
* each wanted ref.
+ *
+ * Before checking for promisor packs, be sure we have the
+ * latest pack-files loaded into memory.
*/
+ reprepare_packed_git(the_repository);
do {
struct packed_git *p;
base-commit: b4374e96c84ed9394fed363973eb540da308ed4f
--
gitgitgadget
prev parent reply other threads:[~2020-03-13 21:12 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-12 15:36 [PATCH] connected.c: reprepare packs for corner cases Derrick Stolee via GitGitGadget
2020-03-12 16:39 ` Jonathan Tan
2020-03-12 17:34 ` Derrick Stolee
2020-03-12 20:42 ` Junio C Hamano
2020-03-12 21:16 ` Jeff King
2020-03-12 21:26 ` Jeff King
2020-03-13 0:54 ` Derrick Stolee
2020-03-13 1:14 ` Junio C Hamano
2020-03-13 2:30 ` Jeff King
2020-03-13 2:34 ` Jeff King
2020-03-13 12:43 ` Derrick Stolee
2020-03-13 21:11 ` Derrick Stolee via GitGitGadget [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.579.v2.git.1584133915654.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=jonathantanmy@google.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).