* [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
@ 2021-10-05 0:19 ` Glen Choo
2021-10-05 0:19 ` [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
` (4 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05 0:19 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo
the_repository->settings is the preferred way to get certain
settings (such as "core.commitGraph") because it gets default values
from prepare_repo_settings(). However, cmd_fsck() reads the config
directly via git_config_get_bool(), which bypasses these default values.
This causes fsck to ignore the commit graph if "core.commitgraph" is not
explicitly set in the config. This worked fine until commit-graph was
enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
default, 2019-08-13).
Replace git_config_get_bool("core.commitgraph") in fsck with the
equivalent call to the_repository->settings.core_commit_graph.
The expected behavior is that fsck respects the config value when it is
set, but uses the default value when it is unset. For example, for
core.commitGraph, there are three cases:
- Config value is set to true -> enabled
- Config value is set to false -> disabled
- Config value is unset -> enabled
As such, tests cover all three cases.
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 3 ++-
t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
2 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
fsck_enable_object_names(&fsck_walk_options);
git_config(git_fsck_config, &fsck_obj_options);
+ prepare_repo_settings(the_repository);
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
check_connectivity();
- if (!git_config_get_bool("core.commitgraph", &i) && i) {
+ if (the_repository->settings.core_commit_graph) {
struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index af88f805aa..4fcfc2ebbc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -674,12 +674,33 @@ test_expect_success 'detect incorrect chunk count' '
$GRAPH_CHUNK_LOOKUP_OFFSET
'
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
cd "$TRASH_DIRECTORY/full" &&
git fsck &&
corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
"incorrect checksum" &&
cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ test_when_finished "git config core.commitGraph true" &&
+
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ test_unconfig core.commitGraph &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
test_must_fail git fsck
'
--
2.33.0.800.g4c38ced690-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
2021-10-05 0:19 ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-05 0:19 ` Glen Choo
2021-10-05 0:19 ` [PATCH v3 3/3] gc: perform incremental repack " Glen Choo
` (3 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05 0:19 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo
Like the previous commit, the_repository->settings.core_multi_pack_index
is preferable to reading "core.multiPackIndex" from the config because
prepare_repo_settings() sets a default value for
the_repository->settings. This worked fine until core.multiPackIndex was
enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25).
Replace git_config_get_bool("core.multiPackIndex") in fsck with the
equivalent call to the_repository->settings.multi_pack_index.
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 2 +-
t/t5319-multi-pack-index.sh | 5 ++++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
}
}
- if (!git_config_get_bool("core.multipackindex", &i) && i) {
+ if (the_repository->settings.core_multi_pack_index) {
struct child_process midx_verify = CHILD_PROCESS_INIT;
const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c3..53958bfd1f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -407,7 +407,10 @@ test_expect_success 'verify incorrect offset' '
test_expect_success 'git-fsck incorrect offset' '
corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
"incorrect object offset" \
- "git -c core.multipackindex=true fsck"
+ "git -c core.multiPackIndex=true fsck" &&
+ test_unconfig core.multiPackIndex &&
+ test_must_fail git fsck &&
+ git -c core.multiPackIndex=false fsck
'
test_expect_success 'corrupt MIDX is not reused' '
--
2.33.0.800.g4c38ced690-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v3 3/3] gc: perform incremental repack when implictly enabled
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
2021-10-05 0:19 ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-05 0:19 ` [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-05 0:19 ` Glen Choo
2021-10-05 11:57 ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
` (2 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05 0:19 UTC (permalink / raw)
To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo
builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
maintenance_task_incremental_repack()
The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.
In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.
Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/gc.c | 5 ++---
t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index f05d2f0a1a..070b7dccb1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1051,12 +1051,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
static int incremental_repack_auto_condition(void)
{
struct packed_git *p;
- int enabled;
int incremental_repack_auto_limit = 10;
int count = 0;
- if (git_config_get_bool("core.multiPackIndex", &enabled) ||
- !enabled)
+ prepare_repo_settings(the_repository);
+ if (!the_repository->settings.core_multi_pack_index)
return 0;
git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 58f46c77e6..7b1b2581af 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -321,15 +321,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
--no-progress --batch-size=2147483647 <run-2g.txt
'
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+ test_commit A &&
git repack -adk &&
- git config core.multiPackIndex true &&
git multi-pack-index write &&
GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
-c maintenance.incremental-repack.auto=1 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
- test_commit A &&
+ test_commit B &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -338,7 +338,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
- test_commit B &&
+ test_commit C &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -347,6 +347,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+ rm -rf incremental-repack-true &&
+ git init incremental-repack-true &&
+ (
+ cd incremental-repack-true &&
+ git config core.multiPackIndex true &&
+ run_incremental_repack_and_verify
+ )
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+ rm -rf incremental-repack-unset &&
+ git init incremental-repack-unset &&
+ (
+ cd incremental-repack-unset &&
+ test_unconfig core.multiPackIndex &&
+ run_incremental_repack_and_verify
+ )
'
test_expect_success 'pack-refs task' '
--
2.33.0.800.g4c38ced690-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
` (2 preceding siblings ...)
2021-10-05 0:19 ` [PATCH v3 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-05 11:57 ` Ævar Arnfjörð Bjarmason
2021-10-05 17:43 ` Derrick Stolee
2021-10-05 22:25 ` Glen Choo
2021-10-09 7:24 ` Junio C Hamano
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
5 siblings, 2 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-05 11:57 UTC (permalink / raw)
To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee
On Mon, Oct 04 2021, Glen Choo wrote:
> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
>
> Instead of using the config, let's use repository settings where
> available. Replace core.commitGraph and core.multiPackIndex with their
> equivalent repository settings in fsck and gc.
>
> I don't anticipate any more business logic changes and the previous comments
> were focused on testing style, so hopefully this should be good to merge.
>
> Changes since v2:
> * Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
> test_config because test_config can affect the values in subsequent tests.
> * Rewording fix in patch 3 commit message
> * Refactor tests in patch 3 so that we use a single helper function instead of
> copy-pasted code
>
> Changes since v1:
> * clean up typo in patch 1 commit message
> * document the commits that patches 1 and 2 address
> * use test helpers in patch 1
> * rewrite patch 2's tests so that it's easier to tell that each test
> does something different
> * reword patch 3 commit message to explain the bug
> * add tests to patch 3
>
> Glen Choo (3):
> fsck: verify commit graph when implicitly enabled
> fsck: verify multi-pack-index when implictly enabled
> gc: perform incremental repack when implictly enabled
I've looked this over carefully, it LGTM. I noticed some bugs in this
area, but they pre-date your commits. I've pushed a
"gc/use-repo-settings" to https://github.com/avar/git.git that you can
check out, consider that branch partially as commentary, but feel free
to pick/squash anything you find there.
The range-diff for that is quoted below. Notes:
* This is somewhat of a case of throwing rocks from a glass house, but
I think the commit messages could really be shorter/get to the
point. I.e. say right away what's being changed and why, instead of
giving the reader an overview of the whole control flow.
I just adjusted adusted 1/3 in that direction below, I think 2/3
could do with being squashed into it & adjusted, it's really the
exact same bug/improvement, just with a different config key that
behaves the same.
* The "test_unconfig" in your 2/3 is redundant, so is the "rm -rf" of
directories that can't exist by that point in 3/3.
* I wondered if we could just call prepare_repo_settings() in one place
earlier in gc/maintenance instead of adding yet another call deeper
in the control flow. I've got a 4/3 below that tries to do that, and
(as noted in its commit message) intentionally breaks "gc.c" in a way
that our tests don't catch, which is perhaps fallout from
31b1de6a09b?
So that's very much on the topic of extra improvements, i.e. your
patches don't break that anymore than it is now, but I for one would
very much like to see an end-state where we're confident that
gc/maintenance do the right thing on each of the tri-states of these
boolean variables.
It's also a point about how to write reliable tests (again,
pre-dating your series). I.e. instead of asserting "we do X when Y is
true" have the test assert all of Y=[unset|false|true] in a way that
breaks if any are different than expected.
One of the tests you're changing has a "graph_git_two_modes" helper
does 2/3 of that, maybe it would be easy to extend it?
* You use corrupting the graph as a shorthand for "did we run this
command?". See test_region for an approach that might be better,
i.e. just log with trace2 and grep that to see if we started the
relevant command. The test_region helper can't be used as-is for that
(you could manually grep the emitted PERF/EVENT output), but perhaps
that's easier/more reliable?
* This is again, something that pre-dates your patches, but I think
following the "is enabled?" behavior here for these variables in "git
fsck" is highly questionable. So just some thoughts after having
looked at that:
Just because your config doen't say "use the graph" doesn't mean
another process won't do so, and if the graph is broken they might
probably encounter an error.
Although I think we *should* have cleaned up those being fatal a
while ago). See 43d35618055 (commit-graph write: don't die if the
existing graph is corrupt, 2019-03-25).
But in general for these auxiliary files (commit-graph, midix, *.rev
etc.) I think there's a good argument to be made that fsck should
*always* check them, and at most use the config to say something
like:
Hey, you've got a 100% broken commit-graph/midx/rev in your .git,
looks like the core.XYZ enabling it is off *right now* though, so
maybe we won't use it. I hope you're feeling lucky...
:)
1: fd9a58e2c7b ! 1: c6041f3b633 fsck: verify commit graph when implicitly enabled
@@ Metadata
## Commit message ##
fsck: verify commit graph when implicitly enabled
- the_repository->settings is the preferred way to get certain
- settings (such as "core.commitGraph") because it gets default values
- from prepare_repo_settings(). However, cmd_fsck() reads the config
- directly via git_config_get_bool(), which bypasses these default values.
- This causes fsck to ignore the commit graph if "core.commitgraph" is not
- explicitly set in the config. This worked fine until commit-graph was
- enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
- default, 2019-08-13).
+ Change fsck to check the "core_commit_graph" variable set in
+ "repo-settings.c" instead of reading the "core.commitGraph" variable,
+ this fixes a bug where we wouldn't start "commit-graph verify" if the
+ config key was missing, which since 31b1de6a09 (commit-graph: turn on
+ commit-graph by default, 2019-08-13) has meant that it's on by
+ default.
- Replace git_config_get_bool("core.commitgraph") in fsck with the
- equivalent call to the_repository->settings.core_commit_graph.
-
- The expected behavior is that fsck respects the config value when it is
- set, but uses the default value when it is unset. For example, for
- core.commitGraph, there are three cases:
-
- - Config value is set to true -> enabled
- - Config value is set to false -> disabled
- - Config value is unset -> enabled
-
- As such, tests cover all three cases.
+ Add tests to "t5318-commit-graph.sh" to check that this works, using
+ the detection of a corrupt commit-graph as a method of seeing if
+ "commit-graph verify" was invoked.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## builtin/fsck.c ##
@@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
2: 9d6813bc280 ! 2: d11209ebfe1 fsck: verify multi-pack-index when implictly enabled
@@ Commit message
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## builtin/fsck.c ##
@@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
@@ t/t5319-multi-pack-index.sh: test_expect_success 'verify incorrect offset' '
"incorrect object offset" \
- "git -c core.multipackindex=true fsck"
+ "git -c core.multiPackIndex=true fsck" &&
-+ test_unconfig core.multiPackIndex &&
+ test_must_fail git fsck &&
+ git -c core.multiPackIndex=false fsck
'
3: 260f46568c6 ! 3: 017e2003e85 gc: perform incremental repack when implictly enabled
@@ Commit message
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## builtin/gc.c ##
@@ builtin/gc.c: static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
@@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
-+ rm -rf incremental-repack-true &&
+ git init incremental-repack-true &&
+ (
+ cd incremental-repack-true &&
@@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
-+ rm -rf incremental-repack-unset &&
+ git init incremental-repack-unset &&
+ (
+ cd incremental-repack-unset &&
-: ----------- > 4: 04d1527f180 WIP gc/maintenance: just call prepare_repo_settings() earlier
-- >8 --
Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier
Consolidate the various calls to prepare_repo_settings() to happen at
the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
intentionally breaks things, we seem to be lacking test coverage for
cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
commit-graph by default, 2019-08-13)?
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
builtin/gc.c | 5 +----
t/t5318-commit-graph.sh | 2 +-
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 26709311601..f59dbedc1fe 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
clean_pack_garbage();
}
- prepare_repo_settings(the_repository);
if (the_repository->settings.gc_write_commit_graph == 1)
write_commit_graph_reachable(the_repository->objects->odb,
!quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
@@ -860,7 +859,6 @@ static int run_write_commit_graph(struct maintenance_run_opts *opts)
static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
{
- prepare_repo_settings(the_repository);
if (!the_repository->settings.core_commit_graph)
return 0;
@@ -1052,7 +1050,6 @@ static int incremental_repack_auto_condition(void)
int incremental_repack_auto_limit = 10;
int count = 0;
- prepare_repo_settings(the_repository);
if (!the_repository->settings.core_multi_pack_index)
return 0;
@@ -1167,7 +1164,6 @@ static int multi_pack_index_repack(struct maintenance_run_opts *opts)
static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts)
{
- prepare_repo_settings(the_repository);
if (!the_repository->settings.core_multi_pack_index) {
warning(_("skipping incremental-repack task because core.multiPackIndex is disabled"));
return 0;
@@ -2492,6 +2488,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
(argc == 2 && !strcmp(argv[1], "-h")))
usage(builtin_maintenance_usage);
+ prepare_repo_settings(the_repository);
if (!strcmp(argv[1], "run"))
return maintenance_run(argc - 1, argv + 1, prefix);
if (!strcmp(argv[1], "start"))
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 4e4450af1ef..98123191e1e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -368,7 +368,7 @@ test_expect_success 'check that gc computes commit-graph' '
git commit-graph write --reachable &&
cp $objdir/info/commit-graph commit-graph-before-gc &&
git reset --hard HEAD~1 &&
- git config gc.writeCommitGraph true &&
+ # BUG: Due to 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13)?
git gc &&
cp $objdir/info/commit-graph commit-graph-after-gc &&
! test_cmp_bin commit-graph-before-gc commit-graph-after-gc &&
--
2.33.0.1441.gbbcdb4c3c66
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-05 11:57 ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
@ 2021-10-05 17:43 ` Derrick Stolee
2021-10-05 19:10 ` Ævar Arnfjörð Bjarmason
2021-10-05 22:25 ` Glen Choo
1 sibling, 1 reply; 65+ messages in thread
From: Derrick Stolee @ 2021-10-05 17:43 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason, Glen Choo
Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee
On 10/5/2021 7:57 AM, Ævar Arnfjörð Bjarmason wrote:
> Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier
>
> Consolidate the various calls to prepare_repo_settings() to happen at
> the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
> intentionally breaks things, we seem to be lacking test coverage for
> cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
> commit-graph by default, 2019-08-13)?
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
> builtin/gc.c | 5 +----
> t/t5318-commit-graph.sh | 2 +-
> 2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 26709311601..f59dbedc1fe 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
> clean_pack_garbage();
> }
>
> - prepare_repo_settings(the_repository);
> if (the_repository->settings.gc_write_commit_graph == 1)
I think that removing these calls is dangerous. prepare_repo_settings()
already returns immediately if the repository already has its settings
populated. The pattern of "call prepare before using a setting" is a
safe way to future-proof the check from a movement of the call.
Putting that potential-future-problem aside, I don't see how this
change is a _benefit_ other than fewer lines of code, which is not a
quality measure in itself.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-05 17:43 ` Derrick Stolee
@ 2021-10-05 19:10 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-05 19:10 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Derrick Stolee
On Tue, Oct 05 2021, Derrick Stolee wrote:
> On 10/5/2021 7:57 AM, Ævar Arnfjörð Bjarmason wrote:
>
>> Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier
>>
>> Consolidate the various calls to prepare_repo_settings() to happen at
>> the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
>> intentionally breaks things, we seem to be lacking test coverage for
>> cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
>> commit-graph by default, 2019-08-13)?
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>> builtin/gc.c | 5 +----
>> t/t5318-commit-graph.sh | 2 +-
>> 2 files changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/builtin/gc.c b/builtin/gc.c
>> index 26709311601..f59dbedc1fe 100644
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>> clean_pack_garbage();
>> }
>>
>> - prepare_repo_settings(the_repository);
>> if (the_repository->settings.gc_write_commit_graph == 1)
>
> I think that removing these calls is dangerous. prepare_repo_settings()
> already returns immediately if the repository already has its settings
> populated. The pattern of "call prepare before using a setting" is a
> safe way to future-proof the check from a movement of the call.
>
> Putting that potential-future-problem aside, I don't see how this
> change is a _benefit_ other than fewer lines of code, which is not a
> quality measure in itself.
Very dangerous, yes :) As I noted this hunk intentionaly breaks the "git
gc" command. Search for "I wondered if we could just call
prepare_repo_settings[...]" upthread.
I.e. I was commenting on how Glen's patch makes an attempt to test this
tri-state config for some cases, but not others, and that in this case
we can break "git gc" without any existing test catching it.
Which as noted might have started happening when you flipped the
core.commitGraph default, but I didn't test or confirm that, it just
seemed the most likely place to start digging from a quick "git log
-G...".
Anyway, as also noted I think this series is fine as-is, just some
feedback in case Glen or anyone would be interested in either doing a v4
or a follow-up, i.e. when I tested this I found that some code/tests in
this area were (probably) either fragile or already broken.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-05 11:57 ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
2021-10-05 17:43 ` Derrick Stolee
@ 2021-10-05 22:25 ` Glen Choo
1 sibling, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05 22:25 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> I've looked this over carefully, it LGTM. I noticed some bugs in this
> area, but they pre-date your commits. I've pushed a
> "gc/use-repo-settings" to https://github.com/avar/git.git that you can
> check out, consider that branch partially as commentary, but feel free
> to pick/squash anything you find there.
Your review is very thoughtful, thank you :)
> * This is somewhat of a case of throwing rocks from a glass house, but
> I think the commit messages could really be shorter/get to the
> point. I.e. say right away what's being changed and why, instead of
> giving the reader an overview of the whole control flow.
>
> I just adjusted adusted 1/3 in that direction below, I think 2/3
> could do with being squashed into it & adjusted, it's really the
> exact same bug/improvement, just with a different config key that
> behaves the same.
In hindsight, I think this is how I should have initially structured my
commits because this improves the signal-noise ratio. I'll keep it in
mind in the future for sure. I'm hesitant to send a v4, but I can be
persuaded to do so :)
> * The "test_unconfig" in your 2/3 is redundant, so is the "rm -rf" of
> directories that can't exist by that point in 3/3.
These are redundant, but they serve a defensive purpose. By being extra
defensive, I was hoping to make it clear to the reader that the test is
*guaranteed* not to conflict with/rely on previous state and reduce the
number of greps needed.
> * You use corrupting the graph as a shorthand for "did we run this
> command?". See test_region for an approach that might be better,
> i.e. just log with trace2 and grep that to see if we started the
> relevant command. The test_region helper can't be used as-is for that
> (you could manually grep the emitted PERF/EVENT output), but perhaps
> that's easier/more reliable?
My intention here was not to assert "did git fsck call the function I
expected?", but something more along the lines of "did git fsck catch the
corrupted commit-graph?". Calling the function seems like an
implementation detail, but the actual desired behavior is that git fsck
catches the breakage.
Perhaps the test title should be reflect what I meant, like
- git fsck (checks commit-graph when config set to true)
+ git fsck detects corrupted commit-graph
> * This is again, something that pre-dates your patches, but I think
> following the "is enabled?" behavior here for these variables in "git
> fsck" is highly questionable. So just some thoughts after having
> looked at that:
I think it depends on how we think about core.commitGraph (and others).
Do we want Git to be able to pretend that commit-graph doesn't exist at
all? If so, then we should skip the check in git fsck. Alternatively, is
commit-graph actually part of Git's core offering? If so,
core.commitGraph is more like a way to opt *out* of some commit-graph
behavior, but fsck should still verify our core data structures. Given
that we default to true, it sounds like more of the latter, but I'm not
sure what use case this serves.
> Just because your config doen't say "use the graph" doesn't mean
> another process won't do so, and if the graph is broken they might
> probably encounter an error.
While that's true, I think that the user assumes a certain degree of
risk if they are using different commands with different config values.
I don't think it's unreasonable to say that "git -c
core.commitGraph=false fsck" won't catch issues that will arise in "git
-c core.commitGraph=true foo".
> But in general for these auxiliary files (commit-graph, midix, *.rev
> etc.) I think there's a good argument to be made that fsck should
> *always* check them, and at most use the config to say something
> like:
>
> Hey, you've got a 100% broken commit-graph/midx/rev in your .git,
> looks like the core.XYZ enabling it is off *right now* though, so
> maybe we won't use it. I hope you're feeling lucky...
I think "always check" sounds like the best way to maximize visibility
for users, though I don't think core.commitGraph should control the
severity of the warning/error. If we pursue "always check", I think we
could adopt fsck.<msg-id> instead.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
` (3 preceding siblings ...)
2021-10-05 11:57 ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
@ 2021-10-09 7:24 ` Junio C Hamano
2021-10-11 19:58 ` Glen Choo
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
5 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-09 7:24 UTC (permalink / raw)
To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau
Glen Choo <chooglen@google.com> writes:
> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
With this merged to 'seen', the CI job with the extra set of
GIT_TEST_X settings fail. When this topic is excluded, with
all the other topics in flight in 'seen', everything seems to
be OK.
For which GIT_TEST_X environment variables to set and export while
testing to trigger the problem, see [*1*]
For a successful test run of 'seen' without this topic, see [*2*]
For the test log of the failing run with this topic, see [*3*];
you'd need to be logged into GitHub to see the details of the errors
(e.g. click on "regular (linux-gcc...)" with red X sign on the left
hand side, then open "Run ci/print-test-failures.sh" and look for
"not ok").
[References]
*1* https://github.com/git/git/runs/3843549095?check_suite_focus=true#step:4:1677
export GIT_TEST_SPLIT_INDEX=yes
export GIT_TEST_MERGE_ALGORITHM=recursive
export GIT_TEST_FULL_IN_PACK_ARRAY=true
export GIT_TEST_OE_SIZE=10
export GIT_TEST_OE_DELTA_SIZE=5
export GIT_TEST_COMMIT_GRAPH=1
export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
export GIT_TEST_MULTI_PACK_INDEX=1
export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1
export GIT_TEST_ADD_I_USE_BUILTIN=1
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
export GIT_TEST_WRITE_REV_INDEX=1
export GIT_TEST_CHECKOUT_WORKERS=2
*2* https://github.com/git/git/actions/runs/1322907901 (feff65d)
A successful CI run of 'seen' without gc/use-repo-settings.
*3* https://github.com/git/git/actions/runs/1322842689 (54a31af)
CI run of 'seen' with gc/use-repo-settings that fails.
The commits that is in the failing 'seen' but not in the
succeeding tree are those from this topic, as can be seen here:
$ git shortlog --no-merges 54a31af ^feff65d
Glen Choo (3):
fsck: verify commit graph when implicitly enabled
fsck: verify multi-pack-index when implictly enabled
gc: perform incremental repack when implictly enabled
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-09 7:24 ` Junio C Hamano
@ 2021-10-11 19:58 ` Glen Choo
2021-10-11 20:08 ` Junio C Hamano
0 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-11 19:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Eric Sunshine, Taylor Blau
Junio C Hamano <gitster@pobox.com> writes:
> With this merged to 'seen', the CI job with the extra set of
> GIT_TEST_X settings fail. When this topic is excluded, with
> all the other topics in flight in 'seen', everything seems to
> be OK.
Thanks for the feedback, I did two CI runs on my fork in response:
* [1] uses the same tree as seen. Looks like osx-clang passes but
linux-gcc fails.
* [2] is rebased onto master. Again, osx-clang passes but linux-gcc
fails.
This should be enough for me to hunt down the issue, though I would
appreciate any hints if anyone has any :)
[1] https://github.com/chooglen/git/runs/3862097340?check_suite_focus=true
[2] https://github.com/chooglen/git/runs/3862670035?check_suite_focus=true
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-11 19:58 ` Glen Choo
@ 2021-10-11 20:08 ` Junio C Hamano
2021-10-11 20:48 ` Glen Choo
0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-11 20:08 UTC (permalink / raw)
To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau
Glen Choo <chooglen@google.com> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> With this merged to 'seen', the CI job with the extra set of
>> GIT_TEST_X settings fail. When this topic is excluded, with
>> all the other topics in flight in 'seen', everything seems to
>> be OK.
>
> Thanks for the feedback, I did two CI runs on my fork in response:
>
> * [1] uses the same tree as seen. Looks like osx-clang passes but
> linux-gcc fails.
> * [2] is rebased onto master. Again, osx-clang passes but linux-gcc
> fails.
>
> This should be enough for me to hunt down the issue, though I would
> appreciate any hints if anyone has any :)
Using the set of GIT_TEST_FOO... settings I cited in the
Reference#1, in the message you are responding to, will let you
reproduce the issue locally on glinux boxes.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v3 0/3] Use default values from settings instead of config
2021-10-11 20:08 ` Junio C Hamano
@ 2021-10-11 20:48 ` Glen Choo
0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-11 20:48 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Eric Sunshine, Taylor Blau
Junio C Hamano <gitster@pobox.com> writes:
> Using the set of GIT_TEST_FOO... settings I cited in the
> Reference#1, in the message you are responding to, will let you
> reproduce the issue locally on glinux boxes.
Ah, I didn't notice that ci/run-build-and-tests.sh uses different
settings for different environments. With those settings, I can
reproduce the failure on my OS X machine.
Thanks so much!
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v4 0/3] Use default values from settings instead of config
2021-10-05 0:19 ` [PATCH v3 " Glen Choo
` (4 preceding siblings ...)
2021-10-09 7:24 ` Junio C Hamano
@ 2021-10-12 17:42 ` Glen Choo
2021-10-12 17:42 ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
` (5 more replies)
5 siblings, 6 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau,
Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo
Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.
Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.
This re-roll is primarily motivated by the CI failures noted by Junio in [1].
The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
job) causes commands like "git commit" to write commit-graphs, but certain tests
like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
fixes a bug where fsck did not check commit-graphs by default, which means that
these tests will now fail because they have invalid commit-graphs. The easiest
solution I found is to disable this confusing and noisy behavior with
GIT_TEST_COMMIT_GRAPH=0.
And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
separate.
This series has also seen a healthy amount of interest in test style and
coverage. We haven't converged on a path for the future, but I'd like to think
that the tests here are still a step in (approximately) the right direction.
Hopefully this version LGT us while we figure out what to do with tests :)
[1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/
Changes since v3
* Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
way that is incompatible with commit-graphs.
* Make patch 1 and 2's commit messages more concise (thanks Ævar!).
Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
copy-pasted code
Changes since v1:
* clean up typo in patch 1 commit message
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3
Glen Choo (3):
fsck: verify commit graph when implicitly enabled
fsck: verify multi-pack-index when implictly enabled
gc: perform incremental repack when implictly enabled
builtin/fsck.c | 5 +++--
builtin/gc.c | 5 ++---
t/t0410-partial-clone.sh | 6 +++++-
t/t3800-mktag.sh | 5 +++++
t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
t/t5319-multi-pack-index.sh | 5 ++++-
t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
7 files changed, 65 insertions(+), 12 deletions(-)
Range-diff against v3:
1: 2f9ff949e6 ! 1: aac1253e7b fsck: verify commit graph when implicitly enabled
@@ Metadata
## Commit message ##
fsck: verify commit graph when implicitly enabled
- the_repository->settings is the preferred way to get certain
- settings (such as "core.commitGraph") because it gets default values
- from prepare_repo_settings(). However, cmd_fsck() reads the config
- directly via git_config_get_bool(), which bypasses these default values.
- This causes fsck to ignore the commit graph if "core.commitgraph" is not
- explicitly set in the config. This worked fine until commit-graph was
- enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
- default, 2019-08-13).
+ Change fsck to check the "core_commit_graph" variable set in
+ "repo-settings.c" instead of reading the "core.commitGraph" variable.
+ This fixes a bug where we wouldn't verify the commit-graph if the
+ config key was missing. This bug was introduced in
+ 31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
+ where core.commitGraph was turned on by default.
- Replace git_config_get_bool("core.commitgraph") in fsck with the
- equivalent call to the_repository->settings.core_commit_graph.
-
- The expected behavior is that fsck respects the config value when it is
- set, but uses the default value when it is unset. For example, for
- core.commitGraph, there are three cases:
-
- - Config value is set to true -> enabled
- - Config value is set to false -> disabled
- - Config value is unset -> enabled
-
- As such, tests cover all three cases.
+ Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
+ commit-graph as expected for the 3 values of core.commitGraph. Also,
+ disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
+ assume that commit-graph checking is disabled (t/t3800-mktag.sh,
+ t/t0410-partial-clone.sh).
+ Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
## builtin/fsck.c ##
@@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
+ ## t/t0410-partial-clone.sh ##
+@@ t/t0410-partial-clone.sh: test_description='partial clone'
+
+ # missing promisor objects cause repacks which write bitmaps to fail
+ GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
++# When enabled, some commands will write commit-graphs. This causes fsck
++# to fail when delete_object() is called because fsck will attempt to
++# verify the out-of-sync commit graph.
++GIT_TEST_COMMIT_GRAPH=0
+
+ delete_object () {
+ rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
+@@ t/t0410-partial-clone.sh: test_expect_success 'rev-list stops traversal at missing and promised commit' '
+
+ git -C repo config core.repositoryformatversion 1 &&
+ git -C repo config extensions.partialclone "arbitrary string" &&
+- GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
++ git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
+ grep $(git -C repo rev-parse bar) out &&
+ ! grep $FOO out
+ '
+
+ ## t/t3800-mktag.sh ##
+@@ t/t3800-mktag.sh: test_description='git mktag: tag object verify test'
+
+ . ./test-lib.sh
+
++# When enabled, some commands will automatically write commit-graphs.
++# This will cause the mktag tests to fail because fsck will attempt to
++# verify the out-of-sync commit graph.
++GIT_TEST_COMMIT_GRAPH=0
++
+ ###########################################################
+ # check the tag.sig file, expecting verify_tag() to fail,
+ # and checking that the error message matches the pattern
+
## t/t5318-commit-graph.sh ##
@@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
$GRAPH_CHUNK_LOOKUP_OFFSET
2: b13ca2a695 ! 2: ed64983430 fsck: verify multi-pack-index when implictly enabled
@@ Metadata
## Commit message ##
fsck: verify multi-pack-index when implictly enabled
- Like the previous commit, the_repository->settings.core_multi_pack_index
- is preferable to reading "core.multiPackIndex" from the config because
- prepare_repo_settings() sets a default value for
- the_repository->settings. This worked fine until core.multiPackIndex was
- enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
- default, 2020-09-25).
-
- Replace git_config_get_bool("core.multiPackIndex") in fsck with the
- equivalent call to the_repository->settings.multi_pack_index.
+ Like the previous commit, change fsck to check the
+ "core_multi_pack_index" variable set in "repo-settings.c" instead of
+ reading the "core.multiPackIndex" config variable. This fixes a bug
+ where we wouldn't verify midx if the config key was missing. This bug
+ was introduced in 18e449f86b (midx: enable core.multiPackIndex by
+ default, 2020-09-25) where core.multiPackIndex was turned on by default.
+ Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
## builtin/fsck.c ##
3: 76943aff80 = 3: 821b492d8b gc: perform incremental repack when implictly enabled
--
2.33.0.882.g93a45727a2-goog
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
@ 2021-10-12 17:42 ` Glen Choo
2021-10-12 17:42 ` [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
` (4 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau,
Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo
Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.
Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
assume that commit-graph checking is disabled (t/t3800-mktag.sh,
t/t0410-partial-clone.sh).
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 3 ++-
t/t0410-partial-clone.sh | 6 +++++-
t/t3800-mktag.sh | 5 +++++
t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
4 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
fsck_enable_object_names(&fsck_walk_options);
git_config(git_fsck_config, &fsck_obj_options);
+ prepare_repo_settings(the_repository);
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
check_connectivity();
- if (!git_config_get_bool("core.commitgraph", &i) && i) {
+ if (the_repository->settings.core_commit_graph) {
struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bba679685f..c76485b1b6 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -6,6 +6,10 @@ test_description='partial clone'
# missing promisor objects cause repacks which write bitmaps to fail
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+# When enabled, some commands will write commit-graphs. This causes fsck
+# to fail when delete_object() is called because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
delete_object () {
rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
@@ -322,7 +326,7 @@ test_expect_success 'rev-list stops traversal at missing and promised commit' '
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "arbitrary string" &&
- GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
+ git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
grep $(git -C repo rev-parse bar) out &&
! grep $FOO out
'
diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh
index 0544d58a6e..27fb15c83b 100755
--- a/t/t3800-mktag.sh
+++ b/t/t3800-mktag.sh
@@ -6,6 +6,11 @@ test_description='git mktag: tag object verify test'
. ./test-lib.sh
+# When enabled, some commands will automatically write commit-graphs.
+# This will cause the mktag tests to fail because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
+
###########################################################
# check the tag.sig file, expecting verify_tag() to fail,
# and checking that the error message matches the pattern
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 295c5bd94d..4e4450af1e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -693,12 +693,33 @@ test_expect_success 'detect incorrect chunk count' '
$GRAPH_CHUNK_LOOKUP_OFFSET
'
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
cd "$TRASH_DIRECTORY/full" &&
git fsck &&
corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
"incorrect checksum" &&
cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ test_when_finished "git config core.commitGraph true" &&
+
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ test_unconfig core.commitGraph &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
test_must_fail git fsck
'
--
2.33.0.882.g93a45727a2-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
2021-10-12 17:42 ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-12 17:42 ` Glen Choo
2021-10-12 17:42 ` [PATCH v4 3/3] gc: perform incremental repack " Glen Choo
` (3 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau,
Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo
Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 2 +-
t/t5319-multi-pack-index.sh | 5 ++++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
}
}
- if (!git_config_get_bool("core.multipackindex", &i) && i) {
+ if (the_repository->settings.core_multi_pack_index) {
struct child_process midx_verify = CHILD_PROCESS_INIT;
const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index bd17f308b3..38999f115f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -452,7 +452,10 @@ test_expect_success 'verify incorrect offset' '
test_expect_success 'git-fsck incorrect offset' '
corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
"incorrect object offset" \
- "git -c core.multipackindex=true fsck"
+ "git -c core.multiPackIndex=true fsck" &&
+ test_unconfig core.multiPackIndex &&
+ test_must_fail git fsck &&
+ git -c core.multiPackIndex=false fsck
'
test_expect_success 'corrupt MIDX is not reused' '
--
2.33.0.882.g93a45727a2-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v4 3/3] gc: perform incremental repack when implictly enabled
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
2021-10-12 17:42 ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-12 17:42 ` [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-12 17:42 ` Glen Choo
2021-10-12 20:23 ` [PATCH v4 0/3] Use default values from settings instead of config Junio C Hamano
` (2 subsequent siblings)
5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau,
Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo
builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
maintenance_task_incremental_repack()
The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.
In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.
Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/gc.c | 5 ++---
t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 6b3de3dd51..2670931160 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1049,12 +1049,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
static int incremental_repack_auto_condition(void)
{
struct packed_git *p;
- int enabled;
int incremental_repack_auto_limit = 10;
int count = 0;
- if (git_config_get_bool("core.multiPackIndex", &enabled) ||
- !enabled)
+ prepare_repo_settings(the_repository);
+ if (!the_repository->settings.core_multi_pack_index)
return 0;
git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 9b9f11a8e7..74aa638475 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -336,15 +336,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
--no-progress --batch-size=2147483647 <run-2g.txt
'
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+ test_commit A &&
git repack -adk &&
- git config core.multiPackIndex true &&
git multi-pack-index write &&
GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
-c maintenance.incremental-repack.auto=1 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
- test_commit A &&
+ test_commit B &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -353,7 +353,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
- test_commit B &&
+ test_commit C &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -362,6 +362,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+ rm -rf incremental-repack-true &&
+ git init incremental-repack-true &&
+ (
+ cd incremental-repack-true &&
+ git config core.multiPackIndex true &&
+ run_incremental_repack_and_verify
+ )
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+ rm -rf incremental-repack-unset &&
+ git init incremental-repack-unset &&
+ (
+ cd incremental-repack-unset &&
+ test_unconfig core.multiPackIndex &&
+ run_incremental_repack_and_verify
+ )
'
test_expect_success 'pack-refs task' '
--
2.33.0.882.g93a45727a2-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
` (2 preceding siblings ...)
2021-10-12 17:42 ` [PATCH v4 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-12 20:23 ` Junio C Hamano
2021-10-12 20:34 ` Ævar Arnfjörð Bjarmason
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
5 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-12 20:23 UTC (permalink / raw)
To: Glen Choo
Cc: git, Eric Sunshine, Taylor Blau,
Ævar Arnfjörð Bjarmason
Glen Choo <chooglen@google.com> writes:
> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.
OK. Will replace. Thanks.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
` (3 preceding siblings ...)
2021-10-12 20:23 ` [PATCH v4 0/3] Use default values from settings instead of config Junio C Hamano
@ 2021-10-12 20:34 ` Ævar Arnfjörð Bjarmason
2021-10-12 22:29 ` Glen Choo
2021-10-13 13:12 ` Derrick Stolee
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
5 siblings, 2 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-12 20:34 UTC (permalink / raw)
To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee
On Tue, Oct 12 2021, Glen Choo wrote:
> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
>
> Instead of using the config, let's use repository settings where
> available. Replace core.commitGraph and core.multiPackIndex with their
> equivalent repository settings in fsck and gc.
>
> This re-roll is primarily motivated by the CI failures noted by Junio in [1].
> The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
> job) causes commands like "git commit" to write commit-graphs, but certain tests
> like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
> corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
> fixes a bug where fsck did not check commit-graphs by default, which means that
> these tests will now fail because they have invalid commit-graphs. The easiest
> solution I found is to disable this confusing and noisy behavior with
> GIT_TEST_COMMIT_GRAPH=0.
That's easy, but...
> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.
>
> This series has also seen a healthy amount of interest in test style and
> coverage. We haven't converged on a path for the future, but I'd like to think
> that the tests here are still a step in (approximately) the right direction.
> Hopefully this version LGT us while we figure out what to do with tests :)
>
> [1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/
>
> Changes since v3
> * Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
> way that is incompatible with commit-graphs.
> * Make patch 1 and 2's commit messages more concise (thanks Ævar!).
...how isn't disabling those t3800-mktag.sh tests just plasting over
corruption that we're noticing because of your changes to (rightly) fix
the bug where "fsck" wasn't checking the graph at all?
IOW haven't we just found exactly the sort of bug that
"GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
of fixing it we're hiding it?
If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
I see that we fail N number of tests, but all of them are actually
fallout of just this test:
git replace $head_parent $head &&
git replace -f $tree $blob
I.e. we've created a replacement object replacing a tree with a blob, as
part of tests I added to test how mktag handles those sorts of weird
edge cases.
This then causes the graph verify code to throw a hissy fit with:
root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
0fbca9850869684085d654f9e1380c9780802570
I.e. when we wrote the graph we somehow didn't notice that the root tree
node we wrote is to an object that's not actually a tree? Isn't this a
bug where some part of the commit graph writing should be doing its own
extended OID lookup that's replacement-object aware, it didn't, and we
wrote a corrupt graph as a result?
If there is a legitimate reason why we're not just hiding a bug we've
turned up with these fixes let's disable that one test, not the entire
test file.
If you don't run the one test that fails (which is split up into 3
individual pieces) there's still 143 other tests that are run, all of
those presumably benefit from finding future bugs with
GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
just have turned up one just now...
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-12 20:34 ` Ævar Arnfjörð Bjarmason
@ 2021-10-12 22:29 ` Glen Choo
2021-10-14 15:53 ` Ævar Arnfjörð Bjarmason
2021-10-13 13:12 ` Derrick Stolee
1 sibling, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-12 22:29 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> ...how isn't disabling those t3800-mktag.sh tests just plasting over
> corruption that we're noticing because of your changes to (rightly) fix
> the bug where "fsck" wasn't checking the graph at all?
>
> IOW haven't we just found exactly the sort of bug that
> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
> of fixing it we're hiding it?
>
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
>
> git replace $head_parent $head &&
> git replace -f $tree $blob
>
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
>
> This then causes the graph verify code to throw a hissy fit with:
>
> root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
> commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
> 0fbca9850869684085d654f9e1380c9780802570
>
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
>
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.
>
> If you don't run the one test that fails (which is split up into 3
> individual pieces) there's still 143 other tests that are run, all of
> those presumably benefit from finding future bugs with
> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
> just have turned up one just now...
I think this falls on my shoulders. I assumed that the failures were
expected behavior, not bugs. You are right that we shouldn't be
plastering over bugs.
I'll have to ask for help here because I don't know enough about mktag
to distinguish between 'expected' and 'unexpected' failures. The best I
can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
tests. But if that's good enough for now, I'll just do that :)
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-12 22:29 ` Glen Choo
@ 2021-10-14 15:53 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-14 15:53 UTC (permalink / raw)
To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee
On Tue, Oct 12 2021, Glen Choo wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> ...how isn't disabling those t3800-mktag.sh tests just plasting over
>> corruption that we're noticing because of your changes to (rightly) fix
>> the bug where "fsck" wasn't checking the graph at all?
>>
>> IOW haven't we just found exactly the sort of bug that
>> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
>> of fixing it we're hiding it?
>>
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>>
>> git replace $head_parent $head &&
>> git replace -f $tree $blob
>>
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>>
>> This then causes the graph verify code to throw a hissy fit with:
>>
>> root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>> commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>> 0fbca9850869684085d654f9e1380c9780802570
>>
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>>
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>>
>> If you don't run the one test that fails (which is split up into 3
>> individual pieces) there's still 143 other tests that are run, all of
>> those presumably benefit from finding future bugs with
>> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
>> just have turned up one just now...
>
> I think this falls on my shoulders. I assumed that the failures were
> expected behavior, not bugs. You are right that we shouldn't be
> plastering over bugs.
>
> I'll have to ask for help here because I don't know enough about mktag
> to distinguish between 'expected' and 'unexpected' failures. The best I
> can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
> tests. But if that's good enough for now, I'll just do that :)
I don't think your series needs to be tasked with fixing a generic issue
in the commit-graph you stumbled across.
So for your series I'd think the only required fix would be to narrowlry
skip *just* the broken tests, not the entire test file.
But in this case (see
http://lore.kernel.org/git/87pms8hq4s.fsf@evledraar.gmail.com) the fix
seems rather trivial to me.
I.e. just adding one line to builtin/commit-graph.c, so perhaps you can
see if that works for you and such a "this bug was missed because this
mode didn't work" fix could be part of a re-roll of yours?
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-12 20:34 ` Ævar Arnfjörð Bjarmason
2021-10-12 22:29 ` Glen Choo
@ 2021-10-13 13:12 ` Derrick Stolee
2021-10-13 15:57 ` Ævar Arnfjörð Bjarmason
1 sibling, 1 reply; 65+ messages in thread
From: Derrick Stolee @ 2021-10-13 13:12 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason, Glen Choo
Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee
On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
>
> git replace $head_parent $head &&
> git replace -f $tree $blob
>
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
>
> This then causes the graph verify code to throw a hissy fit with:
>
> root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
> commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
> 0fbca9850869684085d654f9e1380c9780802570
>
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
>
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.
The commit-graph should be disabled if replace-objects are enabled. If
there is a bug being introduced here it is because the commit-graph is
being checked during fsck even though it would never be read when the
replace-objects exist.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-13 13:12 ` Derrick Stolee
@ 2021-10-13 15:57 ` Ævar Arnfjörð Bjarmason
2021-10-14 16:53 ` Derrick Stolee
0 siblings, 1 reply; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 15:57 UTC (permalink / raw)
To: Derrick Stolee
Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
Derrick Stolee
On Wed, Oct 13 2021, Derrick Stolee wrote:
> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>>
>> git replace $head_parent $head &&
>> git replace -f $tree $blob
>>
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>>
>> This then causes the graph verify code to throw a hissy fit with:
>>
>> root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>> commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>> 0fbca9850869684085d654f9e1380c9780802570
>>
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>>
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>
> The commit-graph should be disabled if replace-objects are enabled. If
> there is a bug being introduced here it is because the commit-graph is
> being checked during fsck even though it would never be read when the
> replace-objects exist.
>
> Thanks,
> -Stolee
Thanks, isn't the obvious fix for this to extend your d6538246d3d
(commit-graph: not compatible with replace objects, 2018-08-20) to do
"read_replace_refs = 0;" in graph_verify()? That works for me on this
case.
I.e. if we ignore replace refs when writing the graph, and we don't use
it if there are any due to commit_graph_compatible(), why would we
consider them when verifying it?
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-13 15:57 ` Ævar Arnfjörð Bjarmason
@ 2021-10-14 16:53 ` Derrick Stolee
2021-10-14 22:21 ` Glen Choo
2021-10-14 22:25 ` Junio C Hamano
0 siblings, 2 replies; 65+ messages in thread
From: Derrick Stolee @ 2021-10-14 16:53 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
Derrick Stolee
On 10/13/2021 11:57 AM, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Oct 13 2021, Derrick Stolee wrote:
>
>> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>>> I see that we fail N number of tests, but all of them are actually
>>> fallout of just this test:
>>>
>>> git replace $head_parent $head &&
>>> git replace -f $tree $blob
>>>
>>> I.e. we've created a replacement object replacing a tree with a blob, as
>>> part of tests I added to test how mktag handles those sorts of weird
>>> edge cases.
>>>
>>> This then causes the graph verify code to throw a hissy fit with:
>>>
>>> root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>> commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>> 0fbca9850869684085d654f9e1380c9780802570
>>>
>>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>>> node we wrote is to an object that's not actually a tree? Isn't this a
>>> bug where some part of the commit graph writing should be doing its own
>>> extended OID lookup that's replacement-object aware, it didn't, and we
>>> wrote a corrupt graph as a result?
>>>
>>> If there is a legitimate reason why we're not just hiding a bug we've
>>> turned up with these fixes let's disable that one test, not the entire
>>> test file.
>>
>> The commit-graph should be disabled if replace-objects are enabled. If
>> there is a bug being introduced here it is because the commit-graph is
>> being checked during fsck even though it would never be read when the
>> replace-objects exist.
>>
>> Thanks,
>> -Stolee
>
> Thanks, isn't the obvious fix for this to extend your d6538246d3d
> (commit-graph: not compatible with replace objects, 2018-08-20) to do
> "read_replace_refs = 0;" in graph_verify()? That works for me on this
> case.
Ignoring the replace refs while verifying will allow you to verify the
on-disk commit-graph file without issue.
> I.e. if we ignore replace refs when writing the graph, and we don't use
> it if there are any due to commit_graph_compatible(), why would we
> consider them when verifying it?
I generally consider any interaction with the commit-graph while
replace refs are enabled to be the bug. We don't read the commit-graph
when replace refs are on, so why are we verifying it?
But, I understand the desire to check the on-disk content even though
it is not being used, so disabling replace refs to do the verify should
be sufficient.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-14 16:53 ` Derrick Stolee
@ 2021-10-14 22:21 ` Glen Choo
2021-10-14 23:38 ` Ævar Arnfjörð Bjarmason
2021-10-14 22:25 ` Junio C Hamano
1 sibling, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-14 22:21 UTC (permalink / raw)
To: Derrick Stolee, Ævar Arnfjörð Bjarmason
Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee
Derrick Stolee <stolee@gmail.com> writes:
>>> The commit-graph should be disabled if replace-objects are enabled. If
>>> there is a bug being introduced here it is because the commit-graph is
>>> being checked during fsck even though it would never be read when the
>>> replace-objects exist.
>>>
>>> Thanks,
>>> -Stolee
>>
>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>> case.
>
> Ignoring the replace refs while verifying will allow you to verify the
> on-disk commit-graph file without issue.
It seems like we've converged on doing read_replace_refs = 0 \o/
If we are going to do this twice in graph_verify() and graph_write(), is
there any reason why I shouldn't just do "read_replace_refs = 0" once in
cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
should just ignore replace refs because they're incompatible.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-14 22:21 ` Glen Choo
@ 2021-10-14 23:38 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-14 23:38 UTC (permalink / raw)
To: Glen Choo
Cc: Derrick Stolee, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
Derrick Stolee
On Thu, Oct 14 2021, Glen Choo wrote:
> Derrick Stolee <stolee@gmail.com> writes:
>
>>>> The commit-graph should be disabled if replace-objects are enabled. If
>>>> there is a bug being introduced here it is because the commit-graph is
>>>> being checked during fsck even though it would never be read when the
>>>> replace-objects exist.
>>>>
>>>> Thanks,
>>>> -Stolee
>>>
>>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>>> case.
>>
>> Ignoring the replace refs while verifying will allow you to verify the
>> on-disk commit-graph file without issue.
>
> It seems like we've converged on doing read_replace_refs = 0 \o/
>
> If we are going to do this twice in graph_verify() and graph_write(), is
> there any reason why I shouldn't just do "read_replace_refs = 0" once in
> cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
> should just ignore replace refs because they're incompatible.
No reason, I think that's the best way to do this.
I've submitted a series to fix that verify issue, as noted in the CL
these patches on top of it without that disabling of the mktag tests
will pass with GIT_TEST_COMMIT_GRAPH=true):
https://lore.kernel.org/git/cover-0.3-00000000000-20211014T233343Z-avarab@gmail.com
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-14 16:53 ` Derrick Stolee
2021-10-14 22:21 ` Glen Choo
@ 2021-10-14 22:25 ` Junio C Hamano
2021-10-15 15:57 ` Junio C Hamano
1 sibling, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-14 22:25 UTC (permalink / raw)
To: Derrick Stolee
Cc: Ævar Arnfjörð Bjarmason, Glen Choo, git,
Eric Sunshine, Taylor Blau, Derrick Stolee
Derrick Stolee <stolee@gmail.com> writes:
> I generally consider any interaction with the commit-graph while
> replace refs are enabled to be the bug. We don't read the commit-graph
> when replace refs are on, so why are we verifying it?
>
> But, I understand the desire to check the on-disk content even though
> it is not being used, so disabling replace refs to do the verify should
> be sufficient.
That's quite a sensible argument. Sounds like we have a way to
solve this that everybody agrees with.
Thanks.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH v4 0/3] Use default values from settings instead of config
2021-10-14 22:25 ` Junio C Hamano
@ 2021-10-15 15:57 ` Junio C Hamano
0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-15 15:57 UTC (permalink / raw)
To: Derrick Stolee
Cc: Ævar Arnfjörð Bjarmason, Glen Choo, git,
Eric Sunshine, Taylor Blau, Derrick Stolee
Junio C Hamano <gitster@pobox.com> writes:
> Derrick Stolee <stolee@gmail.com> writes:
>
>> I generally consider any interaction with the commit-graph while
>> replace refs are enabled to be the bug. We don't read the commit-graph
>> when replace refs are on, so why are we verifying it?
>>
>> But, I understand the desire to check the on-disk content even though
>> it is not being used, so disabling replace refs to do the verify should
>> be sufficient.
>
> That's quite a sensible argument. Sounds like we have a way to
> solve this that everybody agrees with.
Would it be sensible to silently ignore, though? Would a warning,
when we see replacement is in use and an explicit request by an end
user is made to make use of or update the commit-graph, be too loud?
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v5 0/3] Use default values from settings instead of config
2021-10-12 17:42 ` [PATCH v4 " Glen Choo
` (4 preceding siblings ...)
2021-10-12 20:34 ` Ævar Arnfjörð Bjarmason
@ 2021-10-15 20:16 ` Glen Choo
2021-10-15 20:16 ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
` (3 more replies)
5 siblings, 4 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
Ævar Arnfjörð Bjarmason, Glen Choo
Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.
Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.
v5 is based off Ævar's fix (commit-graph: make "verify" work with replace refs,
[1]). As Ævar noted in v4, the mktag test failures indicated buggy behavior
"git commit-graph verify" when replace refs are enabled, so we should fix the
bug instead of just disabling commit-graph.
Ævar, Derrick, thanks for the productive discussion, I feel like I understand
this area a little better now.
Barring any unseen bugs, this should be ready to merge.
[1] https://lore.kernel.org/git/cover-0.3-00000000000-20211014T233343Z-avarab@gmail.com/
Changes since v4
* Rebase onto [1], which fixes a bug with commit-graph verify with replace refs
enabled.
* Remove GIT_TEST_COMMIT_GRAPH=0 in t3800-mktag.sh because the bug that it was
covering up is now fixed.
Changes since v3
* Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
way that is incompatible with commit-graphs.
* Make patch 1 and 2's commit messages more concise (thanks Ævar!).
Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
copy-pasted code
Changes since v1:
* clean up typo in patch 1 commit message
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3
Glen Choo (3):
fsck: verify commit graph when implicitly enabled
fsck: verify multi-pack-index when implictly enabled
gc: perform incremental repack when implictly enabled
builtin/fsck.c | 5 +++--
builtin/gc.c | 5 ++---
t/t0410-partial-clone.sh | 6 +++++-
t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
t/t5319-multi-pack-index.sh | 5 ++++-
t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
6 files changed, 60 insertions(+), 12 deletions(-)
Range-diff against v4:
1: aac1253e7b ! 1: 567d40849a fsck: verify commit graph when implicitly enabled
@@ Commit message
Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
- disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
- assume that commit-graph checking is disabled (t/t3800-mktag.sh,
- t/t0410-partial-clone.sh).
+ disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
+ test cases use fsck in ways that assume that commit-graph checking is
+ disabled.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
@@ t/t0410-partial-clone.sh: test_expect_success 'rev-list stops traversal at missi
! grep $FOO out
'
- ## t/t3800-mktag.sh ##
-@@ t/t3800-mktag.sh: test_description='git mktag: tag object verify test'
-
- . ./test-lib.sh
-
-+# When enabled, some commands will automatically write commit-graphs.
-+# This will cause the mktag tests to fail because fsck will attempt to
-+# verify the out-of-sync commit graph.
-+GIT_TEST_COMMIT_GRAPH=0
-+
- ###########################################################
- # check the tag.sig file, expecting verify_tag() to fail,
- # and checking that the error message matches the pattern
-
## t/t5318-commit-graph.sh ##
@@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
$GRAPH_CHUNK_LOOKUP_OFFSET
2: ed64983430 = 2: 7cb44a7aeb fsck: verify multi-pack-index when implictly enabled
3: 821b492d8b = 3: a436c45af7 gc: perform incremental repack when implictly enabled
--
2.33.0.1079.g6e70778dc9-goog
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
@ 2021-10-15 20:16 ` Glen Choo
2021-10-15 20:16 ` [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
` (2 subsequent siblings)
3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
Ævar Arnfjörð Bjarmason, Glen Choo
Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.
Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
test cases use fsck in ways that assume that commit-graph checking is
disabled.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 3 ++-
t/t0410-partial-clone.sh | 6 +++++-
t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
fsck_enable_object_names(&fsck_walk_options);
git_config(git_fsck_config, &fsck_obj_options);
+ prepare_repo_settings(the_repository);
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
check_connectivity();
- if (!git_config_get_bool("core.commitgraph", &i) && i) {
+ if (the_repository->settings.core_commit_graph) {
struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bba679685f..c76485b1b6 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -6,6 +6,10 @@ test_description='partial clone'
# missing promisor objects cause repacks which write bitmaps to fail
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+# When enabled, some commands will write commit-graphs. This causes fsck
+# to fail when delete_object() is called because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
delete_object () {
rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
@@ -322,7 +326,7 @@ test_expect_success 'rev-list stops traversal at missing and promised commit' '
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "arbitrary string" &&
- GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
+ git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
grep $(git -C repo rev-parse bar) out &&
! grep $FOO out
'
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 84d122a7ae..f516fda7cc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -694,12 +694,33 @@ test_expect_success 'detect incorrect chunk count' '
$GRAPH_CHUNK_LOOKUP_OFFSET
'
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
cd "$TRASH_DIRECTORY/full" &&
git fsck &&
corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
"incorrect checksum" &&
cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+ git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+ cd "$TRASH_DIRECTORY/full" &&
+ test_when_finished "git config core.commitGraph true" &&
+
+ git fsck &&
+ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+ "incorrect checksum" &&
+ test_unconfig core.commitGraph &&
+ cp commit-graph-pre-write-test $objdir/info/commit-graph &&
test_must_fail git fsck
'
--
2.33.0.1079.g6e70778dc9-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
2021-10-15 20:16 ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-15 20:16 ` Glen Choo
2021-10-15 20:16 ` [PATCH v5 3/3] gc: perform incremental repack " Glen Choo
2021-10-15 21:31 ` [PATCH v5 0/3] Use default values from settings instead of config Junio C Hamano
3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
Ævar Arnfjörð Bjarmason, Glen Choo
Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/fsck.c | 2 +-
t/t5319-multi-pack-index.sh | 5 ++++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
}
}
- if (!git_config_get_bool("core.multipackindex", &i) && i) {
+ if (the_repository->settings.core_multi_pack_index) {
struct child_process midx_verify = CHILD_PROCESS_INIT;
const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index bd17f308b3..38999f115f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -452,7 +452,10 @@ test_expect_success 'verify incorrect offset' '
test_expect_success 'git-fsck incorrect offset' '
corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
"incorrect object offset" \
- "git -c core.multipackindex=true fsck"
+ "git -c core.multiPackIndex=true fsck" &&
+ test_unconfig core.multiPackIndex &&
+ test_must_fail git fsck &&
+ git -c core.multiPackIndex=false fsck
'
test_expect_success 'corrupt MIDX is not reused' '
--
2.33.0.1079.g6e70778dc9-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH v5 3/3] gc: perform incremental repack when implictly enabled
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
2021-10-15 20:16 ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-15 20:16 ` [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-15 20:16 ` Glen Choo
2021-10-15 21:31 ` [PATCH v5 0/3] Use default values from settings instead of config Junio C Hamano
3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
To: git
Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
Ævar Arnfjörð Bjarmason, Glen Choo
builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
maintenance_task_incremental_repack()
The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.
In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.
Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.
Signed-off-by: Glen Choo <chooglen@google.com>
---
builtin/gc.c | 5 ++---
t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 6b3de3dd51..2670931160 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1049,12 +1049,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
static int incremental_repack_auto_condition(void)
{
struct packed_git *p;
- int enabled;
int incremental_repack_auto_limit = 10;
int count = 0;
- if (git_config_get_bool("core.multiPackIndex", &enabled) ||
- !enabled)
+ prepare_repo_settings(the_repository);
+ if (!the_repository->settings.core_multi_pack_index)
return 0;
git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 9b9f11a8e7..74aa638475 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -336,15 +336,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
--no-progress --batch-size=2147483647 <run-2g.txt
'
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+ test_commit A &&
git repack -adk &&
- git config core.multiPackIndex true &&
git multi-pack-index write &&
GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
-c maintenance.incremental-repack.auto=1 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
- test_commit A &&
+ test_commit B &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -353,7 +353,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
- test_commit B &&
+ test_commit C &&
git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
HEAD
^HEAD~1
@@ -362,6 +362,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
-c maintenance.incremental-repack.auto=2 \
maintenance run --auto --task=incremental-repack 2>/dev/null &&
test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+ rm -rf incremental-repack-true &&
+ git init incremental-repack-true &&
+ (
+ cd incremental-repack-true &&
+ git config core.multiPackIndex true &&
+ run_incremental_repack_and_verify
+ )
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+ rm -rf incremental-repack-unset &&
+ git init incremental-repack-unset &&
+ (
+ cd incremental-repack-unset &&
+ test_unconfig core.multiPackIndex &&
+ run_incremental_repack_and_verify
+ )
'
test_expect_success 'pack-refs task' '
--
2.33.0.1079.g6e70778dc9-goog
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH v5 0/3] Use default values from settings instead of config
2021-10-15 20:16 ` [PATCH v5 " Glen Choo
` (2 preceding siblings ...)
2021-10-15 20:16 ` [PATCH v5 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-15 21:31 ` Junio C Hamano
3 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-15 21:31 UTC (permalink / raw)
To: Glen Choo
Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee,
Ævar Arnfjörð Bjarmason
Thanks, all three changes look quite well reasoned.
Will queue.
^ permalink raw reply [flat|nested] 65+ messages in thread