git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] Use default values from settings instead of config
@ 2021-09-13 18:12 Glen Choo
  2021-09-13 18:12 ` [PATCH 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
                   ` (4 more replies)
  0 siblings, 5 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-13 18:12 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, derrickstolee

Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.

Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.

I'm fairly new to the codebase (this is my first patch!), so I have some
questions/concerns that I wasn't able to figure out:
- prepare_repo_settings() may have undesirable side effects or may
  not always be callable
- I can't find tests that cover the changes to gc, so I'm not sure if my
  understanding of gc is correct

Let me know if there is something I'm missing :)

CC-ing Derrick Stolee who has done work on both prepare_repo_settings and gc.

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 +++--
 builtin/gc.c                |  5 ++---
 t/t5318-commit-graph.sh     | 23 ++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh | 15 ++++++++++++++-
 4 files changed, 41 insertions(+), 7 deletions(-)

-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
@ 2021-09-13 18:12 ` Glen Choo
  2021-09-13 19:29   ` Taylor Blau
  2021-09-13 18:12 ` [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-13 18:12 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, derrickstolee

the_repository->settings() is the preferred way to get certain
settings (such as "core.commitGraph") because it gets default values
from prepare_repo_settings(). However, cmd_fsck() reads the config
directly via git_config_get_bool(), which bypasses these default values.
This causes fsck to ignore the commit graph if "core.commitgraph" is not
explicitly set in the config, even though commit graph is enabled by
default.

Replace git_config_get_bool("core.commitgraph") in fsck with the
equivalent call to the_repository->settings.core_commit_graph.

The expected behavior is that fsck respects the config value when it is
set, but uses the default value when it is unset. For example, for
core.commitGraph, there are three cases:

- Config value is set to true -> enabled
- Config value is set to false -> disabled
- Config value is unset -> enabled

As such, tests cover all three cases.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c          |  3 ++-
 t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		fsck_enable_object_names(&fsck_walk_options);
 
 	git_config(git_fsck_config, &fsck_obj_options);
+	prepare_repo_settings(the_repository);
 
 	if (connectivity_only) {
 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	check_connectivity();
 
-	if (!git_config_get_bool("core.commitgraph", &i) && i) {
+	if (the_repository->settings.core_commit_graph) {
 		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
 		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index af88f805aa..48c5096757 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -674,7 +674,7 @@ test_expect_success 'detect incorrect chunk count' '
 		$GRAPH_CHUNK_LOOKUP_OFFSET
 '
 
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
@@ -683,6 +683,27 @@ test_expect_success 'git fsck (checks commit-graph)' '
 	test_must_fail git fsck
 '
 
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	test_when_finished "git config core.commitGraph true" &&
+
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+        git config --unset core.commitGraph &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_must_fail git fsck
+'
+
 test_expect_success 'setup non-the_repository tests' '
 	rm -rf repo &&
 	git init repo &&
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
  2021-09-13 18:12 ` [PATCH 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-09-13 18:12 ` Glen Choo
  2021-09-13 19:35   ` Taylor Blau
  2021-09-13 18:12 ` [PATCH 3/3] gc: perform incremental repack " Glen Choo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-13 18:12 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, derrickstolee

Like the previous commit, the_repository->settings.core_multi_pack_index
is preferable to reading "core.multiPackIndex" from the config because
prepare_repo_settings() sets a default value for
the_repository->settings.

Replace git_config_get_bool("core.multiPackIndex") in fsck with the
equivalent call to the_repository->settings.multi_pack_index.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c              |  2 +-
 t/t5319-multi-pack-index.sh | 15 ++++++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (!git_config_get_bool("core.multipackindex", &i) && i) {
+	if (the_repository->settings.core_multi_pack_index) {
 		struct child_process midx_verify = CHILD_PROCESS_INIT;
 		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c3..1a17446cf0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -404,12 +404,25 @@ test_expect_success 'verify incorrect offset' '
 		"incorrect object offset"
 '
 
-test_expect_success 'git-fsck incorrect offset' '
+test_expect_success 'git-fsck incorrect offset (config set to true)' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" \
 		"git -c core.multipackindex=true fsck"
 '
 
+test_expect_success 'git-fsck incorrect offset (config set to false)' '
+	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
+		"incorrect object offset" \
+		"git -c core.multipackindex=true fsck" &&
+		git -c core.multipackindex=false fsck
+'
+
+test_expect_success 'git-fsck incorrect offset (config unset)' '
+	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
+		"incorrect object offset" \
+		"git fsck"
+'
+
 test_expect_success 'corrupt MIDX is not reused' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" &&
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 3/3] gc: perform incremental repack when implictly enabled
  2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
  2021-09-13 18:12 ` [PATCH 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
  2021-09-13 18:12 ` [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-09-13 18:12 ` Glen Choo
  2021-09-13 19:37   ` Taylor Blau
  2021-09-14  4:00   ` Bagas Sanjaya
  2021-09-13 19:19 ` [PATCH 0/3] Use default values from settings instead of config Taylor Blau
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
  4 siblings, 2 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-13 18:12 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, derrickstolee

builtin/gc.c has two ways of checking of multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

the_repository->settings.core_multi_pack_index should be preferred
because it has default values from prepare_repo_settings().

Standardize on the_repository->settings.core_multi_pack_index to check
if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/gc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index f05d2f0a1a..070b7dccb1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1051,12 +1051,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
 static int incremental_repack_auto_condition(void)
 {
 	struct packed_git *p;
-	int enabled;
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	if (git_config_get_bool("core.multiPackIndex", &enabled) ||
-	    !enabled)
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
 	git_config_get_int("maintenance.incremental-repack.auto",
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 0/3] Use default values from settings instead of config
  2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
                   ` (2 preceding siblings ...)
  2021-09-13 18:12 ` [PATCH 3/3] gc: perform incremental repack " Glen Choo
@ 2021-09-13 19:19 ` Taylor Blau
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
  4 siblings, 0 replies; 65+ messages in thread
From: Taylor Blau @ 2021-09-13 19:19 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, derrickstolee

On Mon, Sep 13, 2021 at 11:12:18AM -0700, Glen Choo wrote:
> I'm fairly new to the codebase (this is my first patch!), so I have some
> questions/concerns that I wasn't able to figure out:

Welcome to the Git community! :-).

> - prepare_repo_settings() may have undesirable side effects or may
>   not always be callable

Calling prepare_repo_settings() is definitely the right thing to do,
because (as you note) it centralizes the default values for settings
that it keeps track of.

You can call prepare_repo_settings() so long as you have a repository to
call it on. Since fsck and gc must run inside of a repository, the
callers you added are safe. And note that prepare_repo_settings() is
idempotent, i.e., that it is a noop when called more than once.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 18:12 ` [PATCH 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-09-13 19:29   ` Taylor Blau
  2021-09-13 19:33     ` Eric Sunshine
  2021-09-13 23:15     ` Glen Choo
  0 siblings, 2 replies; 65+ messages in thread
From: Taylor Blau @ 2021-09-13 19:29 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, derrickstolee

On Mon, Sep 13, 2021 at 11:12:19AM -0700, Glen Choo wrote:
> the_repository->settings() is the preferred way to get certain
> settings (such as "core.commitGraph") because it gets default values
> from prepare_repo_settings(). However, cmd_fsck() reads the config
> directly via git_config_get_bool(), which bypasses these default values.
> This causes fsck to ignore the commit graph if "core.commitgraph" is not
> explicitly set in the config, even though commit graph is enabled by
> default.

Small nit; "the_repository->settings()" should be spelled as
"the_repository->settings", since "settings" is not a function.

It may be worth noting that this was totally fine before
core.commitGraph's default changed to true. That happened in 31b1de6a09
(commit-graph: turn on commit-graph by default, 2019-08-13), which is
the first time this sort of regression would have appeared.

>  	if (connectivity_only) {
>  		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
> @@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>
>  	check_connectivity();
>
> -	if (!git_config_get_bool("core.commitgraph", &i) && i) {
> +	if (the_repository->settings.core_commit_graph) {
>  		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
>  		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };

Here's the main part of your change, which is obviously correct (I'm
glossing over the earlier part where you call prepare_repo_settings();
also required and obviously correct).

> +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
> +	cd "$TRASH_DIRECTORY/full" &&
> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> +	git -c core.commitGraph=false fsck

Nit; I recommend replacing the `-c` style configuration with
`test_config`, which modifies `$GIT_DIR/config` but only for the
duration of the sub-shell.

> +'
> +
> +test_expect_success 'git fsck (checks commit-graph when config unset)' '
> +	test_when_finished "git config core.commitGraph true" &&

... which would allow you to get rid of something like this (since you
can avoid modifying the state visible outside of this test).

> +	cd "$TRASH_DIRECTORY/full" &&
> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +        git config --unset core.commitGraph &&

But I'm not aware of a way to temporarily unset a configuration variable
for the duration of a test, so here I would probably write:

    test_must_fail git -c core.commitGraph= fsck

which Git interprets as "pretend this variable is unset in-core".

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 19:29   ` Taylor Blau
@ 2021-09-13 19:33     ` Eric Sunshine
  2021-09-13 19:36       ` Taylor Blau
  2021-09-13 23:15     ` Glen Choo
  1 sibling, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-13 19:33 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Glen Choo, Git List, Derrick Stolee

On Mon, Sep 13, 2021 at 3:29 PM Taylor Blau <me@ttaylorr.com> wrote:
> On Mon, Sep 13, 2021 at 11:12:19AM -0700, Glen Choo wrote:
> > +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
> > +     cd "$TRASH_DIRECTORY/full" &&
> > +     git fsck &&
> > +     corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> > +             "incorrect checksum" &&
> > +     cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> > +     git -c core.commitGraph=false fsck
>
> Nit; I recommend replacing the `-c` style configuration with
> `test_config`, which modifies `$GIT_DIR/config` but only for the
> duration of the sub-shell.

Small correction: There is no subshell here, so it would be more
accurate to say "... for the duration of the test".

Aside: In fact, test_config() can't be used in subshells due to
implementation limitations which I won't go into.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-09-13 18:12 ` [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-09-13 19:35   ` Taylor Blau
  0 siblings, 0 replies; 65+ messages in thread
From: Taylor Blau @ 2021-09-13 19:35 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, derrickstolee

On Mon, Sep 13, 2021 at 11:12:20AM -0700, Glen Choo wrote:
> Like the previous commit, the_repository->settings.core_multi_pack_index
> is preferable to reading "core.multiPackIndex" from the config because
> prepare_repo_settings() sets a default value for
> the_repository->settings.

My same suggestion about referencing the first commit for which this
would have been broken applies here, too. In this case, that'd be
18e449f86b (midx: enable core.multiPackIndex by default, 2020-09-25).

> Replace git_config_get_bool("core.multiPackIndex") in fsck with the
> equivalent call to the_repository->settings.multi_pack_index.
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  builtin/fsck.c              |  2 +-
>  t/t5319-multi-pack-index.sh | 15 ++++++++++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 1c4e485b66..5bbe8068ec 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  		}
>  	}
>
> -	if (!git_config_get_bool("core.multipackindex", &i) && i) {
> +	if (the_repository->settings.core_multi_pack_index) {
>  		struct child_process midx_verify = CHILD_PROCESS_INIT;
>  		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };

Good.

> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 3d4d9f10c3..1a17446cf0 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -404,12 +404,25 @@ test_expect_success 'verify incorrect offset' '
>  		"incorrect object offset"
>  '
>
> -test_expect_success 'git-fsck incorrect offset' '
> +test_expect_success 'git-fsck incorrect offset (config set to true)' '
>  	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
>  		"incorrect object offset" \
>  		"git -c core.multipackindex=true fsck"
>  '
>
> +test_expect_success 'git-fsck incorrect offset (config set to false)' '
> +	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
> +		"incorrect object offset" \
> +		"git -c core.multipackindex=true fsck" &&
> +		git -c core.multipackindex=false fsck
> +'

This test is a little awkward, since it looks like the first command is
the same as the previous test (making it unnecessary). But it turns out
that it *is* necessary, since it sets up the corrupt MIDX (which is
thrown away at the end of each test).

So I would suggest one of two things:

  - Either replace this with a more flexible version of
    corrupt_midx_and_verify that allows the enclosing test to pass. Note
    that this may be awkward, since the whole point is that we want to
    notice the corruption in the first place.

  - Or, (more favorably) combine this all into a single test, like so:

        corrupt_midx_and_verify "$MIDX_BYTE_OFFSET" "\377" $objdir \
          "incorrect object offset" \
          "git fsck" &&
        test_must_fail git -c core.multiPackIndex=true fsck &&
        git -c core.multiPackIndex=false fsck

Which does all of the setup for the three tests at once, and then
specifies each behavior individually.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 19:33     ` Eric Sunshine
@ 2021-09-13 19:36       ` Taylor Blau
  0 siblings, 0 replies; 65+ messages in thread
From: Taylor Blau @ 2021-09-13 19:36 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Taylor Blau, Glen Choo, Git List, Derrick Stolee

On Mon, Sep 13, 2021 at 03:33:40PM -0400, Eric Sunshine wrote:
> On Mon, Sep 13, 2021 at 3:29 PM Taylor Blau <me@ttaylorr.com> wrote:
> > On Mon, Sep 13, 2021 at 11:12:19AM -0700, Glen Choo wrote:
> > > +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
> > > +     cd "$TRASH_DIRECTORY/full" &&
> > > +     git fsck &&
> > > +     corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> > > +             "incorrect checksum" &&
> > > +     cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> > > +     git -c core.commitGraph=false fsck
> >
> > Nit; I recommend replacing the `-c` style configuration with
> > `test_config`, which modifies `$GIT_DIR/config` but only for the
> > duration of the sub-shell.
>
> Small correction: There is no subshell here, so it would be more
> accurate to say "... for the duration of the test".

Yes, sorry about that: I did not mean to say "subshell" here (Glen:
hopefully you didn't read too far into that, since test_config() does
not work in subshells).

Thanks for catching that, Eric.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 3/3] gc: perform incremental repack when implictly enabled
  2021-09-13 18:12 ` [PATCH 3/3] gc: perform incremental repack " Glen Choo
@ 2021-09-13 19:37   ` Taylor Blau
  2021-09-14 17:41     ` Glen Choo
  2021-09-14  4:00   ` Bagas Sanjaya
  1 sibling, 1 reply; 65+ messages in thread
From: Taylor Blau @ 2021-09-13 19:37 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, derrickstolee

On Mon, Sep 13, 2021 at 11:12:21AM -0700, Glen Choo wrote:
> builtin/gc.c has two ways of checking of multi-pack-index is enabled:
> - git_config_get_bool() in incremental_repack_auto_condition()
> - the_repository->settings.core_multi_pack_index in
>   maintenance_task_incremental_repack()

Is this hinting at why there are no new tests added here? If so, it may
need to be explained more clearly, since I was a tad confused after
reading it. But if not, this patch message deserves an extra sentence or
two saying why tests aren't necessary.

Or if none of that is the case, and tests *are* necessary, then they
should be added ;).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 19:29   ` Taylor Blau
  2021-09-13 19:33     ` Eric Sunshine
@ 2021-09-13 23:15     ` Glen Choo
  2021-09-13 23:32       ` Eric Sunshine
  2021-09-14  1:07       ` Taylor Blau
  1 sibling, 2 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-13 23:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Glen Choo, git, derrickstolee

Thanks for the thorough review! I really appreciate it :)

On Mon, Sep 13, 2021 at 03:29:25PM -0400, Taylor Blau wrote:
> Small nit; "the_repository->settings()" should be spelled as
> "the_repository->settings", since "settings" is not a function.

Oh that's a good catch. Thanks!

> It may be worth noting that this was totally fine before
> core.commitGraph's default changed to true. That happened in 31b1de6a09
> (commit-graph: turn on commit-graph by default, 2019-08-13), which is
> the first time this sort of regression would have appeared.

Sounds good, I'll note that down.

> Nit; I recommend replacing the `-c` style configuration with
> `test_config`, which modifies `$GIT_DIR/config` but only for the
> duration of the sub-shell.

Sounds good. This works great :)

> > +	cd "$TRASH_DIRECTORY/full" &&
> > +	git fsck &&
> > +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> > +		"incorrect checksum" &&
> > +        git config --unset core.commitGraph &&
> 
> But I'm not aware of a way to temporarily unset a configuration variable
> for the duration of a test, so here I would probably write:
> 
>     test_must_fail git -c core.commitGraph= fsck
> 
> which Git interprets as "pretend this variable is unset in-core".

From my testing, I suspect that git does not pretend the variable is
unset, but rather, it pretends that the variable is set to the empty
string. It seems that git behaves as if the variable is set to "false".
This is called out in Documentation/git.txt:

  Including the equals but with an empty value (like `git -c
  foo.bar= ...`) sets `foo.bar` to the empty string which `git config
  --type=bool` will convert to `false`.

If the variable really is set to false, how might we proceed here? Shall
we stick with test_when_finished?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 23:15     ` Glen Choo
@ 2021-09-13 23:32       ` Eric Sunshine
  2021-09-14  1:09         ` Taylor Blau
  2021-09-14  1:07       ` Taylor Blau
  1 sibling, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-13 23:32 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, Git List, Derrick Stolee

On Mon, Sep 13, 2021 at 7:19 PM Glen Choo <chooglen@google.com> wrote:
> On Mon, Sep 13, 2021 at 03:29:25PM -0400, Taylor Blau wrote:
> > > +        git config --unset core.commitGraph &&
> >
> > But I'm not aware of a way to temporarily unset a configuration variable
> > for the duration of a test, so here I would probably write:
> >
> >     test_must_fail git -c core.commitGraph= fsck
> >
> > which Git interprets as "pretend this variable is unset in-core".
>
> From my testing, I suspect that git does not pretend the variable is
> unset, but rather, it pretends that the variable is set to the empty
> string. It seems that git behaves as if the variable is set to "false".
> This is called out in Documentation/git.txt:
>
>   Including the equals but with an empty value (like `git -c
>   foo.bar= ...`) sets `foo.bar` to the empty string which `git config
>   --type=bool` will convert to `false`.
>
> If the variable really is set to false, how might we proceed here? Shall
> we stick with test_when_finished?

That's probably reasonable, however, for robustness, you should
probably use test_unconfig() rather than raw `git config --unset` to
clear the variable.

Aside: This certainly makes one wonder if we should have a new
function in t/test-lib-functions.sh which unsets a variable for the
duration of a test only. However, that's outside the scope of this
submission.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 23:15     ` Glen Choo
  2021-09-13 23:32       ` Eric Sunshine
@ 2021-09-14  1:07       ` Taylor Blau
  1 sibling, 0 replies; 65+ messages in thread
From: Taylor Blau @ 2021-09-14  1:07 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, git, derrickstolee

On Mon, Sep 13, 2021 at 04:15:09PM -0700, Glen Choo wrote:
> > > +	cd "$TRASH_DIRECTORY/full" &&
> > > +	git fsck &&
> > > +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> > > +		"incorrect checksum" &&
> > > +        git config --unset core.commitGraph &&
> >
> > But I'm not aware of a way to temporarily unset a configuration variable
> > for the duration of a test, so here I would probably write:
> >
> >     test_must_fail git -c core.commitGraph= fsck
> >
> > which Git interprets as "pretend this variable is unset in-core".
>
> From my testing, I suspect that git does not pretend the variable is
> unset, but rather, it pretends that the variable is set to the empty
> string. It seems that git behaves as if the variable is set to "false".
> This is called out in Documentation/git.txt:
>
>   Including the equals but with an empty value (like `git -c
>   foo.bar= ...`) sets `foo.bar` to the empty string which `git config
>   --type=bool` will convert to `false`.
>
> If the variable really is set to false, how might we proceed here? Shall
> we stick with test_when_finished?

Hmm, I thought that we did support `git -c foo.bar=` as shorthand to
unset the key `foo.bar`, but I must have been mistaken, because the
documentation there is quite clear.

In that case, I think your original approach to use test_when_finished
makes sense and is good. Thanks for double checking!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-13 23:32       ` Eric Sunshine
@ 2021-09-14  1:09         ` Taylor Blau
  2021-09-14  2:05           ` Eric Sunshine
  0 siblings, 1 reply; 65+ messages in thread
From: Taylor Blau @ 2021-09-14  1:09 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Glen Choo, Taylor Blau, Git List, Derrick Stolee

On Mon, Sep 13, 2021 at 07:32:07PM -0400, Eric Sunshine wrote:
> On Mon, Sep 13, 2021 at 7:19 PM Glen Choo <chooglen@google.com> wrote:
> > On Mon, Sep 13, 2021 at 03:29:25PM -0400, Taylor Blau wrote:
> > > > +        git config --unset core.commitGraph &&
> > >
> > > But I'm not aware of a way to temporarily unset a configuration variable
> > > for the duration of a test, so here I would probably write:
> > >
> > >     test_must_fail git -c core.commitGraph= fsck
> > >
> > > which Git interprets as "pretend this variable is unset in-core".
> >
> > From my testing, I suspect that git does not pretend the variable is
> > unset, but rather, it pretends that the variable is set to the empty
> > string. It seems that git behaves as if the variable is set to "false".
> > This is called out in Documentation/git.txt:
> >
> >   Including the equals but with an empty value (like `git -c
> >   foo.bar= ...`) sets `foo.bar` to the empty string which `git config
> >   --type=bool` will convert to `false`.
> >
> > If the variable really is set to false, how might we proceed here? Shall
> > we stick with test_when_finished?
>
> That's probably reasonable, however, for robustness, you should
> probably use test_unconfig() rather than raw `git config --unset` to
> clear the variable.

Hmm. I'm not so sure, do other tests rely on the value of that variable?
If so, test_unconfig() won't restore them.

Even if we don't have any such tests now, it seems like we should err on
the side of leaving it alone (although I suppose that future tests could
set core.commitGraph to whatever value they need as long as they use the
test_config function).

> Aside: This certainly makes one wonder if we should have a new
> function in t/test-lib-functions.sh which unsets a variable for the
> duration of a test only. However, that's outside the scope of this
> submission.

:-). I thought the same thing to myself when reviewing earlier today.
That's why I recommended using test_when_finished upthread, but either
approach is fine (my comments are definitely cosmetic, and don't matter
to the substance of this thread, so ultimately I am fine with either).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-14  1:09         ` Taylor Blau
@ 2021-09-14  2:05           ` Eric Sunshine
  0 siblings, 0 replies; 65+ messages in thread
From: Eric Sunshine @ 2021-09-14  2:05 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Glen Choo, Git List, Derrick Stolee

On Mon, Sep 13, 2021 at 9:09 PM Taylor Blau <me@ttaylorr.com> wrote:
> On Mon, Sep 13, 2021 at 07:32:07PM -0400, Eric Sunshine wrote:
> > On Mon, Sep 13, 2021 at 7:19 PM Glen Choo <chooglen@google.com> wrote:
> > > If the variable really is set to false, how might we proceed here? Shall
> > > we stick with test_when_finished?
> >
> > That's probably reasonable, however, for robustness, you should
> > probably use test_unconfig() rather than raw `git config --unset` to
> > clear the variable.
>
> Hmm. I'm not so sure, do other tests rely on the value of that variable?
> If so, test_unconfig() won't restore them.

There may be a misunderstanding. I wasn't saying that test_unconfig()
alone would work. My "That's probably reasonable" referred to Glen's
proposal of combining `git config --unset` with test_when_finished()
to restore the variable. In addition to that, I suggested
test_unconfig() as being a more robust choice in that recipe.

> > Aside: This certainly makes one wonder if we should have a new
> > function in t/test-lib-functions.sh which unsets a variable for the
> > duration of a test only. However, that's outside the scope of this
> > submission.
>
> :-). I thought the same thing to myself when reviewing earlier today.
> That's why I recommended using test_when_finished upthread, but either
> approach is fine (my comments are definitely cosmetic, and don't matter
> to the substance of this thread, so ultimately I am fine with either).

Yep.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 3/3] gc: perform incremental repack when implictly enabled
  2021-09-13 18:12 ` [PATCH 3/3] gc: perform incremental repack " Glen Choo
  2021-09-13 19:37   ` Taylor Blau
@ 2021-09-14  4:00   ` Bagas Sanjaya
  2021-09-16 17:15     ` Glen Choo
  1 sibling, 1 reply; 65+ messages in thread
From: Bagas Sanjaya @ 2021-09-14  4:00 UTC (permalink / raw)
  To: Glen Choo, git; +Cc: derrickstolee

On 14/09/21 01.12, Glen Choo wrote:
> builtin/gc.c has two ways of checking of multi-pack-index is enabled:
> - git_config_get_bool() in incremental_repack_auto_condition()
> - the_repository->settings.core_multi_pack_index in
>    maintenance_task_incremental_repack()
> 
> the_repository->settings.core_multi_pack_index should be preferred
> because it has default values from prepare_repo_settings().
> 
> Standardize on the_repository->settings.core_multi_pack_index to check
> if multi-pack-index is enabled.

I was wonder what the correlations between checking if MIDX is enabled 
and performing incremental repack are. What are these?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 3/3] gc: perform incremental repack when implictly enabled
  2021-09-13 19:37   ` Taylor Blau
@ 2021-09-14 17:41     ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-14 17:41 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Glen Choo, git, derrickstolee

On Mon, Sep 13, 2021 at 03:37:26PM -0400, Taylor Blau wrote:
> > builtin/gc.c has two ways of checking of multi-pack-index is enabled:
> > - git_config_get_bool() in incremental_repack_auto_condition()
> > - the_repository->settings.core_multi_pack_index in
> >   maintenance_task_incremental_repack()
> 
> Is this hinting at why there are no new tests added here? If so, it may
> need to be explained more clearly, since I was a tad confused after
> reading it.

Looks like I'll need to be clearer; this wasn't my intention at all. I
was hoping to describe the current state of affairs and to show that
there are two different approaches. Thus, this patch is an attempt to
'standardize' the two approaches.

> But if not, this patch message deserves an extra sentence or
> two saying why tests aren't necessary.
>
> Or if none of that is the case, and tests *are* necessary, then they
> should be added ;).

I initially did not include tests because it *seemed* to me that there
were no tests for this. But on a second pass, it looks like that
assumption was completely wrong (the tests are in t7900-maintenance.sh).

So, tests are necessary, and so I will add them :)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 3/3] gc: perform incremental repack when implictly enabled
  2021-09-14  4:00   ` Bagas Sanjaya
@ 2021-09-16 17:15     ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-16 17:15 UTC (permalink / raw)
  To: Bagas Sanjaya; +Cc: Glen Choo, git, derrickstolee

On Tue, Sep 14, 2021 at 11:00:34AM +0700, Bagas Sanjaya wrote:
> I was wonder what the correlations between checking if MIDX is enabled and
> performing incremental repack are. What are these?

(I'm not an expert on the issue, so don't take what I say as definitive :))

Here's what I was able to find in Documentation/git-maintenance.txt:

  The `incremental-repack` job repacks the object directory
	using the `multi-pack-index` feature.[...]

My guess at the intended behavior is that disabling midx should make
git-maintenance behave as if it were unaware of midx features, such as
incremental repack. This is quite akin to the documented behavior of fsck in
this patch series, where fsck only verifies midx and commit-graphs when the
respective features are enabled.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 0/3] Use default values from settings instead of config
  2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
                   ` (3 preceding siblings ...)
  2021-09-13 19:19 ` [PATCH 0/3] Use default values from settings instead of config Taylor Blau
@ 2021-09-17 22:54 ` Glen Choo
  2021-09-17 22:54   ` [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
                     ` (4 more replies)
  4 siblings, 5 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-17 22:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

Thanks so much for the review, the comments have been really helpful!
v2 doesn't change any business logic, it's mostly focused on
incorporating the feedback around commit messages and tests.

Changes since v1:

* clean up typo in patch 1 commit message 
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
  does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 ++-
 builtin/gc.c                |  5 +--
 t/t5318-commit-graph.sh     | 24 +++++++++-
 t/t5319-multi-pack-index.sh |  4 +-
 t/t7900-maintenance.sh      | 88 ++++++++++++++++++++++++++-----------
 5 files changed, 94 insertions(+), 32 deletions(-)

Range-diff against v1:
1:  6bddc4e158 ! 1:  97ab2bb529 fsck: verify commit graph when implicitly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify commit graph when implicitly enabled
     
    -    the_repository->settings() is the preferred way to get certain
    +    the_repository->settings is the preferred way to get certain
         settings (such as "core.commitGraph") because it gets default values
         from prepare_repo_settings(). However, cmd_fsck() reads the config
         directly via git_config_get_bool(), which bypasses these default values.
         This causes fsck to ignore the commit graph if "core.commitgraph" is not
    -    explicitly set in the config, even though commit graph is enabled by
    -    default.
    +    explicitly set in the config. This worked fine until commit-graph was
    +    enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
    +    default, 2019-08-13).
     
         Replace git_config_get_bool("core.commitgraph") in fsck with the
         equivalent call to the_repository->settings.core_commit_graph.
    @@ t/t5318-commit-graph.sh: test_expect_success 'git fsck (checks commit-graph)' '
     +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
     +		"incorrect checksum" &&
     +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
    -+	git -c core.commitGraph=false fsck
    ++	test_config core.commitGraph false &&
    ++	git fsck
     +'
     +
     +test_expect_success 'git fsck (checks commit-graph when config unset)' '
    @@ t/t5318-commit-graph.sh: test_expect_success 'git fsck (checks commit-graph)' '
     +	git fsck &&
     +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
     +		"incorrect checksum" &&
    -+        git config --unset core.commitGraph &&
    ++	test_unconfig core.commitGraph &&
     +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
     +	test_must_fail git fsck
     +'
2:  3b86afc45b ! 2:  ace92453ca fsck: verify multi-pack-index when implictly enabled
    @@ Commit message
         Like the previous commit, the_repository->settings.core_multi_pack_index
         is preferable to reading "core.multiPackIndex" from the config because
         prepare_repo_settings() sets a default value for
    -    the_repository->settings.
    +    the_repository->settings. This worked fine until core.multiPackIndex was
    +    enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
    +    default, 2020-09-25).
     
         Replace git_config_get_bool("core.multiPackIndex") in fsck with the
         equivalent call to the_repository->settings.multi_pack_index.
    @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
     
      ## t/t5319-multi-pack-index.sh ##
     @@ t/t5319-multi-pack-index.sh: test_expect_success 'verify incorrect offset' '
    - 		"incorrect object offset"
    - '
    - 
    --test_expect_success 'git-fsck incorrect offset' '
    -+test_expect_success 'git-fsck incorrect offset (config set to true)' '
    + test_expect_success 'git-fsck incorrect offset' '
      	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
      		"incorrect object offset" \
    - 		"git -c core.multipackindex=true fsck"
    - '
    - 
    -+test_expect_success 'git-fsck incorrect offset (config set to false)' '
    -+	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
    -+		"incorrect object offset" \
    +-		"git -c core.multipackindex=true fsck"
     +		"git -c core.multipackindex=true fsck" &&
    ++		test_must_fail git fsck &&
     +		git -c core.multipackindex=false fsck
    -+'
    -+
    -+test_expect_success 'git-fsck incorrect offset (config unset)' '
    -+	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
    -+		"incorrect object offset" \
    -+		"git fsck"
    -+'
    -+
    + '
    + 
      test_expect_success 'corrupt MIDX is not reused' '
    - 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
    - 		"incorrect object offset" &&
3:  d122ce3bf8 ! 3:  d6959d61c1 gc: perform incremental repack when implictly enabled
    @@ Metadata
      ## Commit message ##
         gc: perform incremental repack when implictly enabled
     
    -    builtin/gc.c has two ways of checking of multi-pack-index is enabled:
    +    builtin/gc.c has two ways of checking if multi-pack-index is enabled:
         - git_config_get_bool() in incremental_repack_auto_condition()
         - the_repository->settings.core_multi_pack_index in
           maintenance_task_incremental_repack()
     
    -    the_repository->settings.core_multi_pack_index should be preferred
    -    because it has default values from prepare_repo_settings().
    +    The two implementations have existed since the incremental-repack task
    +    was introduced in e841a79a13 (maintenance: add incremental-repack auto
    +    condition, 2020-09-25). These two values can diverge because
    +    prepare_repo_settings() enables the feature in the_repository->settings
    +    by default.
     
    -    Standardize on the_repository->settings.core_multi_pack_index to check
    -    if multi-pack-index is enabled.
    +    In the case where core.multiPackIndex is not set in the config, the auto
    +    condition would fail, causing the incremental-repack task to not be
    +    run. Because we always want to consider the default values, we should
    +    just always just use the_repository->settings.
    +
    +    Standardize on using the_repository->settings.core_multi_pack_index to
    +    check if multi-pack-index is enabled.
     
         Signed-off-by: Glen Choo <chooglen@google.com>
     
    @@ builtin/gc.c: static int maintenance_task_loose_objects(struct maintenance_run_o
      		return 0;
      
      	git_config_get_int("maintenance.incremental-repack.auto",
    +
    + ## t/t7900-maintenance.sh ##
    +@@ t/t7900-maintenance.sh: test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
    + '
    + 
    + test_expect_success 'maintenance.incremental-repack.auto' '
    +-	git repack -adk &&
    +-	git config core.multiPackIndex true &&
    +-	git multi-pack-index write &&
    +-	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
    +-		-c maintenance.incremental-repack.auto=1 \
    +-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
    +-	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
    +-	test_commit A &&
    +-	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    +-	HEAD
    +-	^HEAD~1
    +-	EOF
    +-	GIT_TRACE2_EVENT=$(pwd)/trace-A git \
    +-		-c maintenance.incremental-repack.auto=2 \
    +-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
    +-	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
    +-	test_commit B &&
    +-	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    +-	HEAD
    +-	^HEAD~1
    +-	EOF
    +-	GIT_TRACE2_EVENT=$(pwd)/trace-B git \
    +-		-c maintenance.incremental-repack.auto=2 \
    +-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
    +-	test_subcommand git multi-pack-index write --no-progress <trace-B
    ++	(
    ++		git init incremental-repack-true &&
    ++		cd incremental-repack-true &&
    ++		git config core.multiPackIndex true &&
    ++		test_commit A &&
    ++		git repack -adk &&
    ++		git multi-pack-index write &&
    ++		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
    ++			-c maintenance.incremental-repack.auto=1 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
    ++		test_commit B &&
    ++		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    ++		HEAD
    ++		^HEAD~1
    ++		EOF
    ++		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
    ++			-c maintenance.incremental-repack.auto=2 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
    ++		test_commit C &&
    ++		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    ++		HEAD
    ++		^HEAD~1
    ++		EOF
    ++		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
    ++			-c maintenance.incremental-repack.auto=2 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand git multi-pack-index write --no-progress <trace-B
    ++	)
    ++'
    ++
    ++test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
    ++	(
    ++		git init incremental-repack-unset &&
    ++		cd incremental-repack-unset &&
    ++		test_unconfig core.multiPackIndex &&
    ++		test_commit A &&
    ++		git repack -adk &&
    ++		git multi-pack-index write &&
    ++		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
    ++			-c maintenance.incremental-repack.auto=1 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
    ++		test_commit B &&
    ++		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    ++		HEAD
    ++		^HEAD~1
    ++		EOF
    ++		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
    ++			-c maintenance.incremental-repack.auto=2 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
    ++		test_commit C &&
    ++		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
    ++		HEAD
    ++		^HEAD~1
    ++		EOF
    ++		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
    ++			-c maintenance.incremental-repack.auto=2 \
    ++			maintenance run --auto --task=incremental-repack 2>/dev/null &&
    ++		test_subcommand git multi-pack-index write --no-progress <trace-B
    ++	)
    + '
    + 
    + test_expect_success 'pack-refs task' '
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
@ 2021-09-17 22:54   ` Glen Choo
  2021-09-29  6:09     ` Eric Sunshine
  2021-09-17 22:54   ` [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-17 22:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

the_repository->settings is the preferred way to get certain
settings (such as "core.commitGraph") because it gets default values
from prepare_repo_settings(). However, cmd_fsck() reads the config
directly via git_config_get_bool(), which bypasses these default values.
This causes fsck to ignore the commit graph if "core.commitgraph" is not
explicitly set in the config. This worked fine until commit-graph was
enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
default, 2019-08-13).

Replace git_config_get_bool("core.commitgraph") in fsck with the
equivalent call to the_repository->settings.core_commit_graph.

The expected behavior is that fsck respects the config value when it is
set, but uses the default value when it is unset. For example, for
core.commitGraph, there are three cases:

- Config value is set to true -> enabled
- Config value is set to false -> disabled
- Config value is unset -> enabled

As such, tests cover all three cases.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c          |  3 ++-
 t/t5318-commit-graph.sh | 24 +++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		fsck_enable_object_names(&fsck_walk_options);
 
 	git_config(git_fsck_config, &fsck_obj_options);
+	prepare_repo_settings(the_repository);
 
 	if (connectivity_only) {
 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	check_connectivity();
 
-	if (!git_config_get_bool("core.commitgraph", &i) && i) {
+	if (the_repository->settings.core_commit_graph) {
 		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
 		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index af88f805aa..42e785cb6e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -674,7 +674,7 @@ test_expect_success 'detect incorrect chunk count' '
 		$GRAPH_CHUNK_LOOKUP_OFFSET
 '
 
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
@@ -683,6 +683,28 @@ test_expect_success 'git fsck (checks commit-graph)' '
 	test_must_fail git fsck
 '
 
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_config core.commitGraph false &&
+	git fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	test_when_finished "git config core.commitGraph true" &&
+
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	test_unconfig core.commitGraph &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_must_fail git fsck
+'
+
 test_expect_success 'setup non-the_repository tests' '
 	rm -rf repo &&
 	git init repo &&
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
  2021-09-17 22:54   ` [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-09-17 22:54   ` Glen Choo
  2021-09-29  6:20     ` Eric Sunshine
  2021-09-17 22:54   ` [PATCH v2 3/3] gc: perform incremental repack " Glen Choo
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-17 22:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

Like the previous commit, the_repository->settings.core_multi_pack_index
is preferable to reading "core.multiPackIndex" from the config because
prepare_repo_settings() sets a default value for
the_repository->settings. This worked fine until core.multiPackIndex was
enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25).

Replace git_config_get_bool("core.multiPackIndex") in fsck with the
equivalent call to the_repository->settings.multi_pack_index.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c              | 2 +-
 t/t5319-multi-pack-index.sh | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (!git_config_get_bool("core.multipackindex", &i) && i) {
+	if (the_repository->settings.core_multi_pack_index) {
 		struct child_process midx_verify = CHILD_PROCESS_INIT;
 		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c3..704db122b1 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -407,7 +407,9 @@ test_expect_success 'verify incorrect offset' '
 test_expect_success 'git-fsck incorrect offset' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" \
-		"git -c core.multipackindex=true fsck"
+		"git -c core.multipackindex=true fsck" &&
+		test_must_fail git fsck &&
+		git -c core.multipackindex=false fsck
 '
 
 test_expect_success 'corrupt MIDX is not reused' '
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v2 3/3] gc: perform incremental repack when implictly enabled
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
  2021-09-17 22:54   ` [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
  2021-09-17 22:54   ` [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-09-17 22:54   ` Glen Choo
  2021-09-29  6:39     ` Eric Sunshine
  2021-09-27 17:59   ` [PATCH v2 0/3] Use default values from settings instead of config Glen Choo
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-17 22:54 UTC (permalink / raw)
  To: git; +Cc: Glen Choo

builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.

In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
just always just use the_repository->settings.

Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/gc.c           |  5 +--
 t/t7900-maintenance.sh | 88 ++++++++++++++++++++++++++++++------------
 2 files changed, 65 insertions(+), 28 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index f05d2f0a1a..070b7dccb1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1051,12 +1051,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
 static int incremental_repack_auto_condition(void)
 {
 	struct packed_git *p;
-	int enabled;
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	if (git_config_get_bool("core.multiPackIndex", &enabled) ||
-	    !enabled)
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
 	git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 58f46c77e6..2c77ddded1 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -322,31 +322,69 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
 '
 
 test_expect_success 'maintenance.incremental-repack.auto' '
-	git repack -adk &&
-	git config core.multiPackIndex true &&
-	git multi-pack-index write &&
-	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
-		-c maintenance.incremental-repack.auto=1 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
-	test_commit A &&
-	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
-	HEAD
-	^HEAD~1
-	EOF
-	GIT_TRACE2_EVENT=$(pwd)/trace-A git \
-		-c maintenance.incremental-repack.auto=2 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
-	test_commit B &&
-	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
-	HEAD
-	^HEAD~1
-	EOF
-	GIT_TRACE2_EVENT=$(pwd)/trace-B git \
-		-c maintenance.incremental-repack.auto=2 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand git multi-pack-index write --no-progress <trace-B
+	(
+		git init incremental-repack-true &&
+		cd incremental-repack-true &&
+		git config core.multiPackIndex true &&
+		test_commit A &&
+		git repack -adk &&
+		git multi-pack-index write &&
+		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
+			-c maintenance.incremental-repack.auto=1 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
+		test_commit B &&
+		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
+		HEAD
+		^HEAD~1
+		EOF
+		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
+			-c maintenance.incremental-repack.auto=2 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
+		test_commit C &&
+		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
+		HEAD
+		^HEAD~1
+		EOF
+		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
+			-c maintenance.incremental-repack.auto=2 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand git multi-pack-index write --no-progress <trace-B
+	)
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+	(
+		git init incremental-repack-unset &&
+		cd incremental-repack-unset &&
+		test_unconfig core.multiPackIndex &&
+		test_commit A &&
+		git repack -adk &&
+		git multi-pack-index write &&
+		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
+			-c maintenance.incremental-repack.auto=1 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
+		test_commit B &&
+		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
+		HEAD
+		^HEAD~1
+		EOF
+		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
+			-c maintenance.incremental-repack.auto=2 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
+		test_commit C &&
+		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
+		HEAD
+		^HEAD~1
+		EOF
+		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
+			-c maintenance.incremental-repack.auto=2 \
+			maintenance run --auto --task=incremental-repack 2>/dev/null &&
+		test_subcommand git multi-pack-index write --no-progress <trace-B
+	)
 '
 
 test_expect_success 'pack-refs task' '
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 0/3] Use default values from settings instead of config
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
                     ` (2 preceding siblings ...)
  2021-09-17 22:54   ` [PATCH v2 3/3] gc: perform incremental repack " Glen Choo
@ 2021-09-27 17:59   ` Glen Choo
  2021-09-29  6:43     ` Eric Sunshine
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
  4 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-27 17:59 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Eric Sunshine


I wasn't clear about this in the original message, but I think this is
ready to merge in its current form. I'd love to hear from reviewers who
can poke holes in that :)

Cc Eric and Taylor who have given the most review on v1 (thanks again!).
I believe I've addressed your comments and I'd be interesting in hearing
your thoughts on v2.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-17 22:54   ` [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-09-29  6:09     ` Eric Sunshine
  2021-09-30 18:00       ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-29  6:09 UTC (permalink / raw)
  To: Glen Choo, git; +Cc: Taylor Blau

On 9/17/21 6:54 PM, Glen Choo wrote:
> the_repository->settings is the preferred way to get certain
> settings (such as "core.commitGraph") because it gets default values
> from prepare_repo_settings(). However, cmd_fsck() reads the config
> directly via git_config_get_bool(), which bypasses these default values.
> This causes fsck to ignore the commit graph if "core.commitgraph" is not
> explicitly set in the config. This worked fine until commit-graph was
> enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
> default, 2019-08-13).
> 
> Replace git_config_get_bool("core.commitgraph") in fsck with the
> equivalent call to the_repository->settings.core_commit_graph.
> [...]
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> @@ -674,7 +674,7 @@ test_expect_success 'detect incorrect chunk count' '
> -test_expect_success 'git fsck (checks commit-graph)' '
> +test_expect_success 'git fsck (checks commit-graph when config set to true)' '
>   	cd "$TRASH_DIRECTORY/full" &&
>   	git fsck &&
>   	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> @@ -683,6 +683,28 @@ test_expect_success 'git fsck (checks commit-graph)' '
>   	test_must_fail git fsck
>   '

Because I took the time to scan backward through this test script, I 
understand that `core.commitGraph` is already `true` by the time this 
test is reached, thus the new test title is accurate (for now). However, 
it would be a bit easier to reason about this and make it more robust by 
having the test itself guarantee that it is set to `true` by invoking 
`git config core.commitGraph true`.

> +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
> +	cd "$TRASH_DIRECTORY/full" &&
> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> +	test_config core.commitGraph false &&
> +	git fsck
> +'

test_config() unsets the config variable when the test completes, so I'm 
wondering if its use is in fact desirable here. I ask because, from a 
quick scan through the file, it appears that many tests in this script 
assume that `core.commitGraph` is `true`, so it's not clear that 
unsetting it at the end of this test is a good idea in general.

> +test_expect_success 'git fsck (checks commit-graph when config unset)' '
> +	test_when_finished "git config core.commitGraph true" &&
> +
> +	cd "$TRASH_DIRECTORY/full" &&

I find it somewhat confusing to reason about the behavior of 
test_when_finished() when it is invoked like this before the `cd`. It's 
true that the end-of-test `git config core.commitGraph true` will be 
invoked within `full` as desired (except in the very unusual case of the 
`cd` failing), so it's probably correct, but it requires extra thought 
to come to that conclusion. Switching the order of these two lines might 
lower the cognitive load a bit.

> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	test_unconfig core.commitGraph &&
> +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> +	test_must_fail git fsck
> +'

Taking Taylor's comment[1] on v1 patch [2/3] into account, I wonder if 
it would be simpler for these all to be combined into a single test. 
Something like (untested):

     cd "$TRASH_DIRECTORY/full" &&
     test_when_finished "git config core.commitGraph true" &&
     git config core.commitGraph true &&
     git fsck &&
     corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
         "incorrect checksum" &&
     cp commit-graph-pre-write-test $objdir/info/commit-graph &&
     test_must_fail git fsck &&
     git config core.commitGraph false &&
     git fsck &&
     git config --unset-all core.commitGraph &&
     test_must_fail git fsck

I didn't bother using test_unconfig() since, in this case, we _know_ 
that the config variable is set, thus don't have to worry about `git 
config --unset-all` failing.

A downside of combining the tests like this is that it makes it a bit 
more cumbersome to diagnose a failure (because there is more code and 
more cases to wade through).

Anyhow, I don't have a strong feeling one way or the other about 
combining the test or leaving them separate.

[1]: https://lore.kernel.org/git/YT+n81yGPYEiMXwQ@nand.local/

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-09-17 22:54   ` [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-09-29  6:20     ` Eric Sunshine
  2021-09-29 22:56       ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-29  6:20 UTC (permalink / raw)
  To: Glen Choo, git; +Cc: Taylor Blau

On 9/17/21 6:54 PM, Glen Choo wrote:
> Like the previous commit, the_repository->settings.core_multi_pack_index
> is preferable to reading "core.multiPackIndex" from the config because
> prepare_repo_settings() sets a default value for
> the_repository->settings. This worked fine until core.multiPackIndex was
> enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
> default, 2020-09-25).
> 
> Replace git_config_get_bool("core.multiPackIndex") in fsck with the
> equivalent call to the_repository->settings.multi_pack_index.
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> @@ -407,7 +407,9 @@ test_expect_success 'verify incorrect offset' '
>   test_expect_success 'git-fsck incorrect offset' '
>   	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
>   		"incorrect object offset" \
> -		"git -c core.multipackindex=true fsck"
> +		"git -c core.multipackindex=true fsck" &&
> +		test_must_fail git fsck &&
> +		git -c core.multipackindex=false fsck
>   '

I guess the newly-added `test_must_fail git fsck` line is checking the 
fallback case then `core.multipackindex` is not set. We could make this 
check a bit more robust by _ensuring_ that it is unset rather than 
relying upon whatever state the configuration is in by the time this 
test is reached. Perhaps something like this:

     ...
     "git -c core.multipackindex=true fsck" &&
     test_unconfig core.multipackindex &&
     test_must_fail git fsck &&
     git -c core.multipackindex=false fsck

The indentation is a bit unusual. It aligns nicely and is, in some 
sense, easy to read, but the two new lines are over-indented considering 
that they are siblings of the corrupt_midx_and_verify() call.

Don't know if any of this is worth a re-roll, though.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 3/3] gc: perform incremental repack when implictly enabled
  2021-09-17 22:54   ` [PATCH v2 3/3] gc: perform incremental repack " Glen Choo
@ 2021-09-29  6:39     ` Eric Sunshine
  0 siblings, 0 replies; 65+ messages in thread
From: Eric Sunshine @ 2021-09-29  6:39 UTC (permalink / raw)
  To: Glen Choo, git; +Cc: Taylor Blau

On 9/17/21 6:54 PM, Glen Choo wrote:
> builtin/gc.c has two ways of checking if multi-pack-index is enabled:
> - git_config_get_bool() in incremental_repack_auto_condition()
> - the_repository->settings.core_multi_pack_index in
>    maintenance_task_incremental_repack()
> 
> The two implementations have existed since the incremental-repack task
> was introduced in e841a79a13 (maintenance: add incremental-repack auto
> condition, 2020-09-25). These two values can diverge because
> prepare_repo_settings() enables the feature in the_repository->settings
> by default.
> 
> In the case where core.multiPackIndex is not set in the config, the auto
> condition would fail, causing the incremental-repack task to not be
> run. Because we always want to consider the default values, we should
> just always just use the_repository->settings.
> 
> Standardize on using the_repository->settings.core_multi_pack_index to
> check if multi-pack-index is enabled.
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> @@ -322,31 +322,69 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
>   test_expect_success 'maintenance.incremental-repack.auto' '
> +	(
> +		git init incremental-repack-true &&
> +		cd incremental-repack-true &&
> +		git config core.multiPackIndex true &&
> +		test_commit A &&
> +		git repack -adk &&
> +		git multi-pack-index write &&
> +		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
> +			-c maintenance.incremental-repack.auto=1 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
> +		test_commit B &&
> +		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
> +		HEAD
> +		^HEAD~1
> +		EOF
> +		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
> +			-c maintenance.incremental-repack.auto=2 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
> +		test_commit C &&
> +		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
> +		HEAD
> +		^HEAD~1
> +		EOF
> +		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
> +			-c maintenance.incremental-repack.auto=2 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand git multi-pack-index write --no-progress <trace-B
> +	)
> +'
> +
> +test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
> +	(
> +		git init incremental-repack-unset &&
> +		cd incremental-repack-unset &&
> +		test_unconfig core.multiPackIndex &&
> +		test_commit A &&
> +		git repack -adk &&
> +		git multi-pack-index write &&
> +		GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
> +			-c maintenance.incremental-repack.auto=1 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
> +		test_commit B &&
> +		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
> +		HEAD
> +		^HEAD~1
> +		EOF
> +		GIT_TRACE2_EVENT=$(pwd)/trace-A git \
> +			-c maintenance.incremental-repack.auto=2 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
> +		test_commit C &&
> +		git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
> +		HEAD
> +		^HEAD~1
> +		EOF
> +		GIT_TRACE2_EVENT=$(pwd)/trace-B git \
> +			-c maintenance.incremental-repack.auto=2 \
> +			maintenance run --auto --task=incremental-repack 2>/dev/null &&
> +		test_subcommand git multi-pack-index write --no-progress <trace-B
> +	)
>   '

The amount of code common between the two tests is significant. In 
simple cases, such duplication may not be worth worrying about, but it 
feels like we can do better here. There are a number of ways to re-use 
the code between tests. One such way would be to do something like this:

     run_incremental_repack () {
         test_commit A &&
         ...
         test_subcommand ...
     }

     test_expect_success 'maintenance.incremental-repack.auto' '
         rm -rf incremental-repack &&
         git init incremental-repack &&
         (
             cd incremental-repack &&
             git config core.multiPackIndex true &&
             run_incremental_repack
         )
     '

     test_expect_success 'maintenance.incremental-repack.auto (config 
unset)' '
         rm -rf incremental-repack &&
         git init incremental-repack &&
         (
             cd incremental-repack &&
             test_unconfig core.multiPackIndex &&
             run_incremental_repack
         )
     '

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 0/3] Use default values from settings instead of config
  2021-09-27 17:59   ` [PATCH v2 0/3] Use default values from settings instead of config Glen Choo
@ 2021-09-29  6:43     ` Eric Sunshine
  2021-09-29 22:53       ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-29  6:43 UTC (permalink / raw)
  To: Glen Choo; +Cc: Git List, Taylor Blau

On Mon, Sep 27, 2021 at 1:59 PM Glen Choo <chooglen@google.com> wrote:
> I wasn't clear about this in the original message, but I think this is
> ready to merge in its current form. I'd love to hear from reviewers who
> can poke holes in that :)
>
> Cc Eric and Taylor who have given the most review on v1 (thanks again!).
> I believe I've addressed your comments and I'd be interesting in hearing
> your thoughts on v2.

I re-read the entire thread and left a few comments on the v2 patches.
One or two of the comments might be actionable (i.e. deserve a
re-roll), but most are probably subjective.

Oh, I forgot to mention in my review of [3/3] that in:

    In the case where core.multiPackIndex is not set in the config, the auto
    condition would fail, causing the incremental-repack task to not be
    run. Because we always want to consider the default values, we should
    just always just use the_repository->settings.

you can probably drop one or both "just"s.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 0/3] Use default values from settings instead of config
  2021-09-29  6:43     ` Eric Sunshine
@ 2021-09-29 22:53       ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-29 22:53 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Taylor Blau

Eric Sunshine <sunshine@sunshineco.com> writes:

> I re-read the entire thread and left a few comments on the v2 patches.
> One or two of the comments might be actionable (i.e. deserve a
> re-roll), but most are probably subjective.

Thanks so much again!

> Oh, I forgot to mention in my review of [3/3] that in:
>
>     In the case where core.multiPackIndex is not set in the config, the auto
>     condition would fail, causing the incremental-repack task to not be
>     run. Because we always want to consider the default values, we should
>     just always just use the_repository->settings.
>
> you can probably drop one or both "just"s.

Yes, this is definitely a typo from me.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-09-29  6:20     ` Eric Sunshine
@ 2021-09-29 22:56       ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-09-29 22:56 UTC (permalink / raw)
  To: Eric Sunshine, git; +Cc: Taylor Blau

Eric Sunshine <sunshine@sunshineco.com> writes:

>>   test_expect_success 'git-fsck incorrect offset' '
>>   	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
>>   		"incorrect object offset" \
>> -		"git -c core.multipackindex=true fsck"
>> +		"git -c core.multipackindex=true fsck" &&
>> +		test_must_fail git fsck &&
>> +		git -c core.multipackindex=false fsck
>>   '
>
> I guess the newly-added `test_must_fail git fsck` line is checking the 
> fallback case then `core.multipackindex` is not set. We could make this 
> check a bit more robust by _ensuring_ that it is unset rather than 
> relying upon whatever state the configuration is in by the time this 
> test is reached. Perhaps something like this:
>
>      ...
>      "git -c core.multipackindex=true fsck" &&
>      test_unconfig core.multipackindex &&
>      test_must_fail git fsck &&
>      git -c core.multipackindex=false fsck

I think this extra robustness is worth it. I sometimes find that the
tests are a bit too interdependent to read on their own, so this is a
good step forward.

> The indentation is a bit unusual. It aligns nicely and is, in some 
> sense, easy to read, but the two new lines are over-indented considering 
> that they are siblings of the corrupt_midx_and_verify() call.

I agree. This was a typo from me, not a conscious choice.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-29  6:09     ` Eric Sunshine
@ 2021-09-30 18:00       ` Glen Choo
  2021-09-30 18:35         ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-30 18:00 UTC (permalink / raw)
  To: Eric Sunshine, git; +Cc: Taylor Blau

Eric Sunshine <sunshine@sunshineco.com> writes:

> Because I took the time to scan backward through this test script, I 
> understand that `core.commitGraph` is already `true` by the time this 
> test is reached, thus the new test title is accurate (for now). However, 
> it would be a bit easier to reason about this and make it more robust by 
> having the test itself guarantee that it is set to `true` by invoking 
> `git config core.commitGraph true`.

Reading this and..

> I find it somewhat confusing to reason about the behavior of 
> test_when_finished() when it is invoked like this before the `cd`. It's 
> true that the end-of-test `git config core.commitGraph true` will be 
> invoked within `full` as desired (except in the very unusual case of the 
> `cd` failing), so it's probably correct, but it requires extra thought 
> to come to that conclusion. Switching the order of these two lines might 
> lower the cognitive load a bit.

I'll make both of these changes. Test readability is important and
people shouldn't be wasting time thinking about test correctness.

> test_config() unsets the config variable when the test completes, so I'm 
> wondering if its use is in fact desirable here. I ask because, from a 
> quick scan through the file, it appears that many tests in this script 
> assume that `core.commitGraph` is `true`, so it's not clear that 
> unsetting it at the end of this test is a good idea in general.

This is a good point. I made the original change in response to Taylor's
comment, but I think we both didn't consider that this script assumes
`core.commitGraph` is `true`. The rest of the tests pass, but only
because the default value is true and I'd rather not have tests rely on
a happy accident.

> Taking Taylor's comment[1] on v1 patch [2/3] into account, I wonder if 
> it would be simpler for these all to be combined into a single test. 

Interesting thought. The marginal benefit here is much lower than in
patch [2/3]. The difference is that corrupt_midx_and_verify() uses an
additional $COMMAND to perform verification, but
corrupt_graph_and_verify() does not. The result is that
corrupt_midx_and_verify() is subtle and confusing when used in separate
tests, but corrupt_graph_and_verify() is not so bad.

> A downside of combining the tests like this is that it makes it a bit 
> more cumbersome to diagnose a failure (because there is more code and 
> more cases to wade through).

Yes, and we also lose the test labels that would guide the reader. It
might not be that obvious why the tests for a boolean value has 3
cases {true,false,unset}.

Taking churn into account, I'm inclined to keep the tests separate.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-30 18:00       ` Glen Choo
@ 2021-09-30 18:35         ` Glen Choo
  2021-09-30 18:39           ` Eric Sunshine
  0 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-09-30 18:35 UTC (permalink / raw)
  To: Eric Sunshine, git; +Cc: Taylor Blau


>> Because I took the time to scan backward through this test script, I 
>> understand that `core.commitGraph` is already `true` by the time this 
>> test is reached, thus the new test title is accurate (for now). However, 
>> it would be a bit easier to reason about this and make it more robust by 
>> having the test itself guarantee that it is set to `true` by invoking 
>> `git config core.commitGraph true`.

>> test_config() unsets the config variable when the test completes, so I'm 
>> wondering if its use is in fact desirable here. I ask because, from a 
>> quick scan through the file, it appears that many tests in this script 
>> assume that `core.commitGraph` is `true`, so it's not clear that 
>> unsetting it at the end of this test is a good idea in general.
>
> This is a good point. I made the original change in response to Taylor's
> comment, but I think we both didn't consider that this script assumes
> `core.commitGraph` is `true`. The rest of the tests pass, but only
> because the default value is true and I'd rather not have tests rely on
> a happy accident.

I said I would incorporate these suggestions, but I didn't propose what
changes I would actually make.

Given that each test depends on a global config value in the setup
phase, it might be simplest to read if we try to avoid anything that
touches that value. The easiest way to achieve this is to use git -c
core.commitGraph {true,false} for the {true,false} cases. Since there is
no -c equivalent for the unset case, I'll continue to use
test_unconfig() + test_when_finished() to temporarily unset the value.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-30 18:35         ` Glen Choo
@ 2021-09-30 18:39           ` Eric Sunshine
  2021-10-01 17:28             ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Eric Sunshine @ 2021-09-30 18:39 UTC (permalink / raw)
  To: Glen Choo; +Cc: Git List, Taylor Blau

On Thu, Sep 30, 2021 at 2:35 PM Glen Choo <chooglen@google.com> wrote:
> >> test_config() unsets the config variable when the test completes, so I'm
> >> wondering if its use is in fact desirable here. I ask because, from a
> >> quick scan through the file, it appears that many tests in this script
> >> assume that `core.commitGraph` is `true`, so it's not clear that
> >> unsetting it at the end of this test is a good idea in general.
> >
> > This is a good point. I made the original change in response to Taylor's
> > comment, but I think we both didn't consider that this script assumes
> > `core.commitGraph` is `true`. The rest of the tests pass, but only
> > because the default value is true and I'd rather not have tests rely on
> > a happy accident.
>
> I said I would incorporate these suggestions, but I didn't propose what
> changes I would actually make.
>
> Given that each test depends on a global config value in the setup
> phase, it might be simplest to read if we try to avoid anything that
> touches that value. The easiest way to achieve this is to use git -c
> core.commitGraph {true,false} for the {true,false} cases. Since there is
> no -c equivalent for the unset case, I'll continue to use
> test_unconfig() + test_when_finished() to temporarily unset the value.

That was my thought, as well. (I wasn't quite sure why Taylor
recommended test_config() over `-c` which you used originally. It may
just be his personal preference. Perhaps he can chime in?)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled
  2021-09-30 18:39           ` Eric Sunshine
@ 2021-10-01 17:28             ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-01 17:28 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Taylor Blau

Eric Sunshine <sunshine@sunshineco.com> writes:

> That was my thought, as well. (I wasn't quite sure why Taylor
> recommended test_config() over `-c` which you used originally. It may
> just be his personal preference. Perhaps he can chime in?)

I'd also appreciate thoughts on test_config() over `-c` :). I won't
re-roll this series yet in case there's more to the test_config() story,
but I've included a patch that shows the `-c` work that we discussed.

---
 t/t5318-commit-graph.sh | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 42e785cb6e..4fcfc2ebbc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -680,7 +680,7 @@ test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
-	test_must_fail git fsck
+	test_must_fail git -c core.commitGraph=true fsck
 '
 
 test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
@@ -689,14 +689,13 @@ test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
-	test_config core.commitGraph false &&
-	git fsck
+	git -c core.commitGraph=false fsck
 '
 
 test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	cd "$TRASH_DIRECTORY/full" &&
 	test_when_finished "git config core.commitGraph true" &&
 
-	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 0/3] Use default values from settings instead of config
  2021-09-17 22:54 ` [PATCH v2 " Glen Choo
                     ` (3 preceding siblings ...)
  2021-09-27 17:59   ` [PATCH v2 0/3] Use default values from settings instead of config Glen Choo
@ 2021-10-05  0:19   ` Glen Choo
  2021-10-05  0:19     ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
                       ` (5 more replies)
  4 siblings, 6 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05  0:19 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo

Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.

Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.

I don't anticipate any more business logic changes and the previous comments
were focused on testing style, so hopefully this should be good to merge.

Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
  test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
  copy-pasted code

Changes since v1:
* clean up typo in patch 1 commit message 
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
  does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 +++--
 builtin/gc.c                |  5 ++---
 t/t5318-commit-graph.sh     | 23 ++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh |  5 ++++-
 t/t7900-maintenance.sh      | 28 ++++++++++++++++++++++++----
 5 files changed, 55 insertions(+), 11 deletions(-)

Range-diff against v2:
1:  97ab2bb529 ! 1:  2f9ff949e6 fsck: verify commit graph when implicitly enabled
    @@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
      	cd "$TRASH_DIRECTORY/full" &&
      	git fsck &&
      	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
    -@@ t/t5318-commit-graph.sh: test_expect_success 'git fsck (checks commit-graph)' '
    - 	test_must_fail git fsck
    - '
    - 
    + 		"incorrect checksum" &&
    + 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
    ++	test_must_fail git -c core.commitGraph=true fsck
    ++'
    ++
     +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
     +	cd "$TRASH_DIRECTORY/full" &&
     +	git fsck &&
     +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
     +		"incorrect checksum" &&
     +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
    -+	test_config core.commitGraph false &&
    -+	git fsck
    ++	git -c core.commitGraph=false fsck
     +'
     +
     +test_expect_success 'git fsck (checks commit-graph when config unset)' '
    ++	cd "$TRASH_DIRECTORY/full" &&
     +	test_when_finished "git config core.commitGraph true" &&
     +
    -+	cd "$TRASH_DIRECTORY/full" &&
     +	git fsck &&
     +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
     +		"incorrect checksum" &&
     +	test_unconfig core.commitGraph &&
     +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
    -+	test_must_fail git fsck
    -+'
    -+
    - test_expect_success 'setup non-the_repository tests' '
    - 	rm -rf repo &&
    - 	git init repo &&
    + 	test_must_fail git fsck
    + '
    + 
2:  ace92453ca ! 2:  b13ca2a695 fsck: verify multi-pack-index when implictly enabled
    @@ t/t5319-multi-pack-index.sh: test_expect_success 'verify incorrect offset' '
      	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
      		"incorrect object offset" \
     -		"git -c core.multipackindex=true fsck"
    -+		"git -c core.multipackindex=true fsck" &&
    -+		test_must_fail git fsck &&
    -+		git -c core.multipackindex=false fsck
    ++		"git -c core.multiPackIndex=true fsck" &&
    ++	test_unconfig core.multiPackIndex &&
    ++	test_must_fail git fsck &&
    ++	git -c core.multiPackIndex=false fsck
      '
      
      test_expect_success 'corrupt MIDX is not reused' '
3:  d6959d61c1 < -:  ---------- gc: perform incremental repack when implictly enabled
-:  ---------- > 3:  76943aff80 gc: perform incremental repack when implictly enabled
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
@ 2021-10-05  0:19     ` Glen Choo
  2021-10-05  0:19     ` [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05  0:19 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo

the_repository->settings is the preferred way to get certain
settings (such as "core.commitGraph") because it gets default values
from prepare_repo_settings(). However, cmd_fsck() reads the config
directly via git_config_get_bool(), which bypasses these default values.
This causes fsck to ignore the commit graph if "core.commitgraph" is not
explicitly set in the config. This worked fine until commit-graph was
enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
default, 2019-08-13).

Replace git_config_get_bool("core.commitgraph") in fsck with the
equivalent call to the_repository->settings.core_commit_graph.

The expected behavior is that fsck respects the config value when it is
set, but uses the default value when it is unset. For example, for
core.commitGraph, there are three cases:

- Config value is set to true -> enabled
- Config value is set to false -> disabled
- Config value is unset -> enabled

As such, tests cover all three cases.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c          |  3 ++-
 t/t5318-commit-graph.sh | 23 ++++++++++++++++++++++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		fsck_enable_object_names(&fsck_walk_options);
 
 	git_config(git_fsck_config, &fsck_obj_options);
+	prepare_repo_settings(the_repository);
 
 	if (connectivity_only) {
 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	check_connectivity();
 
-	if (!git_config_get_bool("core.commitgraph", &i) && i) {
+	if (the_repository->settings.core_commit_graph) {
 		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
 		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index af88f805aa..4fcfc2ebbc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -674,12 +674,33 @@ test_expect_success 'detect incorrect chunk count' '
 		$GRAPH_CHUNK_LOOKUP_OFFSET
 '
 
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	test_when_finished "git config core.commitGraph true" &&
+
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	test_unconfig core.commitGraph &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
 	test_must_fail git fsck
 '
 
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
  2021-10-05  0:19     ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-05  0:19     ` Glen Choo
  2021-10-05  0:19     ` [PATCH v3 3/3] gc: perform incremental repack " Glen Choo
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05  0:19 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo

Like the previous commit, the_repository->settings.core_multi_pack_index
is preferable to reading "core.multiPackIndex" from the config because
prepare_repo_settings() sets a default value for
the_repository->settings. This worked fine until core.multiPackIndex was
enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25).

Replace git_config_get_bool("core.multiPackIndex") in fsck with the
equivalent call to the_repository->settings.multi_pack_index.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c              | 2 +-
 t/t5319-multi-pack-index.sh | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (!git_config_get_bool("core.multipackindex", &i) && i) {
+	if (the_repository->settings.core_multi_pack_index) {
 		struct child_process midx_verify = CHILD_PROCESS_INIT;
 		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c3..53958bfd1f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -407,7 +407,10 @@ test_expect_success 'verify incorrect offset' '
 test_expect_success 'git-fsck incorrect offset' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" \
-		"git -c core.multipackindex=true fsck"
+		"git -c core.multiPackIndex=true fsck" &&
+	test_unconfig core.multiPackIndex &&
+	test_must_fail git fsck &&
+	git -c core.multiPackIndex=false fsck
 '
 
 test_expect_success 'corrupt MIDX is not reused' '
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v3 3/3] gc: perform incremental repack when implictly enabled
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
  2021-10-05  0:19     ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
  2021-10-05  0:19     ` [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-05  0:19     ` Glen Choo
  2021-10-05 11:57     ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05  0:19 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Taylor Blau, Glen Choo

builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.

In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.

Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/gc.c           |  5 ++---
 t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index f05d2f0a1a..070b7dccb1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1051,12 +1051,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
 static int incremental_repack_auto_condition(void)
 {
 	struct packed_git *p;
-	int enabled;
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	if (git_config_get_bool("core.multiPackIndex", &enabled) ||
-	    !enabled)
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
 	git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 58f46c77e6..7b1b2581af 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -321,15 +321,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
 		 --no-progress --batch-size=2147483647 <run-2g.txt
 '
 
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+	test_commit A &&
 	git repack -adk &&
-	git config core.multiPackIndex true &&
 	git multi-pack-index write &&
 	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
 		-c maintenance.incremental-repack.auto=1 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
-	test_commit A &&
+	test_commit B &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -338,7 +338,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
-	test_commit B &&
+	test_commit C &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -347,6 +347,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+	rm -rf incremental-repack-true &&
+	git init incremental-repack-true &&
+	(
+		cd incremental-repack-true &&
+		git config core.multiPackIndex true &&
+		run_incremental_repack_and_verify
+	)
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+	rm -rf incremental-repack-unset &&
+	git init incremental-repack-unset &&
+	(
+		cd incremental-repack-unset &&
+		test_unconfig core.multiPackIndex &&
+		run_incremental_repack_and_verify
+	)
 '
 
 test_expect_success 'pack-refs task' '
-- 
2.33.0.800.g4c38ced690-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
                       ` (2 preceding siblings ...)
  2021-10-05  0:19     ` [PATCH v3 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-05 11:57     ` Ævar Arnfjörð Bjarmason
  2021-10-05 17:43       ` Derrick Stolee
  2021-10-05 22:25       ` Glen Choo
  2021-10-09  7:24     ` Junio C Hamano
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
  5 siblings, 2 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-05 11:57 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee


On Mon, Oct 04 2021, Glen Choo wrote:

> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
>
> Instead of using the config, let's use repository settings where
> available. Replace core.commitGraph and core.multiPackIndex with their
> equivalent repository settings in fsck and gc.
>
> I don't anticipate any more business logic changes and the previous comments
> were focused on testing style, so hopefully this should be good to merge.
>
> Changes since v2:
> * Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
>   test_config because test_config can affect the values in subsequent tests.
> * Rewording fix in patch 3 commit message
> * Refactor tests in patch 3 so that we use a single helper function instead of
>   copy-pasted code
>
> Changes since v1:
> * clean up typo in patch 1 commit message 
> * document the commits that patches 1 and 2 address
> * use test helpers in patch 1
> * rewrite patch 2's tests so that it's easier to tell that each test
>   does something different
> * reword patch 3 commit message to explain the bug
> * add tests to patch 3
>
> Glen Choo (3):
>   fsck: verify commit graph when implicitly enabled
>   fsck: verify multi-pack-index when implictly enabled
>   gc: perform incremental repack when implictly enabled

I've looked this over carefully, it LGTM. I noticed some bugs in this
area, but they pre-date your commits. I've pushed a
"gc/use-repo-settings" to https://github.com/avar/git.git that you can
check out, consider that branch partially as commentary, but feel free
to pick/squash anything you find there.

The range-diff for that is quoted below. Notes:

 * This is somewhat of a case of throwing rocks from a glass house, but
   I think the commit messages could really be shorter/get to the
   point. I.e. say right away what's being changed and why, instead of
   giving the reader an overview of the whole control flow.

   I just adjusted adusted 1/3 in that direction below, I think 2/3
   could do with being squashed into it & adjusted, it's really the
   exact same bug/improvement, just with a different config key that
   behaves the same.

 * The "test_unconfig" in your 2/3 is redundant, so is the "rm -rf" of
   directories that can't exist by that point in 3/3.

 * I wondered if we could just call prepare_repo_settings() in one place
   earlier in gc/maintenance instead of adding yet another call deeper
   in the control flow. I've got a 4/3 below that tries to do that, and
   (as noted in its commit message) intentionally breaks "gc.c" in a way
   that our tests don't catch, which is perhaps fallout from
   31b1de6a09b?

   So that's very much on the topic of extra improvements, i.e. your
   patches don't break that anymore than it is now, but I for one would
   very much like to see an end-state where we're confident that
   gc/maintenance do the right thing on each of the tri-states of these
   boolean variables.

   It's also a point about how to write reliable tests (again,
   pre-dating your series). I.e. instead of asserting "we do X when Y is
   true" have the test assert all of Y=[unset|false|true] in a way that
   breaks if any are different than expected.

   One of the tests you're changing has a "graph_git_two_modes" helper
   does 2/3 of that, maybe it would be easy to extend it?

 * You use corrupting the graph as a shorthand for "did we run this
   command?". See test_region for an approach that might be better,
   i.e. just log with trace2 and grep that to see if we started the
   relevant command. The test_region helper can't be used as-is for that
   (you could manually grep the emitted PERF/EVENT output), but perhaps
   that's easier/more reliable?

 * This is again, something that pre-dates your patches, but I think
   following the "is enabled?" behavior here for these variables in "git
   fsck" is highly questionable. So just some thoughts after having
   looked at that:

   Just because your config doen't say "use the graph" doesn't mean
   another process won't do so, and if the graph is broken they might
   probably encounter an error.

   Although I think we *should* have cleaned up those being fatal a
   while ago). See 43d35618055 (commit-graph write: don't die if the
   existing graph is corrupt, 2019-03-25).

   But in general for these auxiliary files (commit-graph, midix, *.rev
   etc.) I think there's a good argument to be made that fsck should
   *always* check them, and at most use the config to say something
   like:

       Hey, you've got a 100% broken commit-graph/midx/rev in your .git,
       looks like the core.XYZ enabling it is off *right now* though, so
       maybe we won't use it.  I hope you're feeling lucky...

   :)


1:  fd9a58e2c7b ! 1:  c6041f3b633 fsck: verify commit graph when implicitly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify commit graph when implicitly enabled
     
    -    the_repository->settings is the preferred way to get certain
    -    settings (such as "core.commitGraph") because it gets default values
    -    from prepare_repo_settings(). However, cmd_fsck() reads the config
    -    directly via git_config_get_bool(), which bypasses these default values.
    -    This causes fsck to ignore the commit graph if "core.commitgraph" is not
    -    explicitly set in the config. This worked fine until commit-graph was
    -    enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
    -    default, 2019-08-13).
    +    Change fsck to check the "core_commit_graph" variable set in
    +    "repo-settings.c" instead of reading the "core.commitGraph" variable,
    +    this fixes a bug where we wouldn't start "commit-graph verify" if the
    +    config key was missing, which since 31b1de6a09 (commit-graph: turn on
    +    commit-graph by default, 2019-08-13) has meant that it's on by
    +    default.
     
    -    Replace git_config_get_bool("core.commitgraph") in fsck with the
    -    equivalent call to the_repository->settings.core_commit_graph.
    -
    -    The expected behavior is that fsck respects the config value when it is
    -    set, but uses the default value when it is unset. For example, for
    -    core.commitGraph, there are three cases:
    -
    -    - Config value is set to true -> enabled
    -    - Config value is set to false -> disabled
    -    - Config value is unset -> enabled
    -
    -    As such, tests cover all three cases.
    +    Add tests to "t5318-commit-graph.sh" to check that this works, using
    +    the detection of a corrupt commit-graph as a method of seeing if
    +    "commit-graph verify" was invoked.
     
         Signed-off-by: Glen Choo <chooglen@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/fsck.c ##
     @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
2:  9d6813bc280 ! 2:  d11209ebfe1 fsck: verify multi-pack-index when implictly enabled
    @@ Commit message
     
         Signed-off-by: Glen Choo <chooglen@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/fsck.c ##
     @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
    @@ t/t5319-multi-pack-index.sh: test_expect_success 'verify incorrect offset' '
      		"incorrect object offset" \
     -		"git -c core.multipackindex=true fsck"
     +		"git -c core.multiPackIndex=true fsck" &&
    -+	test_unconfig core.multiPackIndex &&
     +	test_must_fail git fsck &&
     +	git -c core.multiPackIndex=false fsck
      '
3:  260f46568c6 ! 3:  017e2003e85 gc: perform incremental repack when implictly enabled
    @@ Commit message
     
         Signed-off-by: Glen Choo <chooglen@google.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/gc.c ##
     @@ builtin/gc.c: static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
    @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto
     +}
     +
     +test_expect_success 'maintenance.incremental-repack.auto' '
    -+	rm -rf incremental-repack-true &&
     +	git init incremental-repack-true &&
     +	(
     +		cd incremental-repack-true &&
    @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.incremental-repack.auto
     +'
     +
     +test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
    -+	rm -rf incremental-repack-unset &&
     +	git init incremental-repack-unset &&
     +	(
     +		cd incremental-repack-unset &&
-:  ----------- > 4:  04d1527f180 WIP gc/maintenance: just call prepare_repo_settings() earlier

-- >8 --
Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier

Consolidate the various calls to prepare_repo_settings() to happen at
the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
intentionally breaks things, we seem to be lacking test coverage for
cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
commit-graph by default, 2019-08-13)?

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/gc.c            | 5 +----
 t/t5318-commit-graph.sh | 2 +-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 26709311601..f59dbedc1fe 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		clean_pack_garbage();
 	}
 
-	prepare_repo_settings(the_repository);
 	if (the_repository->settings.gc_write_commit_graph == 1)
 		write_commit_graph_reachable(the_repository->objects->odb,
 					     !quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
@@ -860,7 +859,6 @@ static int run_write_commit_graph(struct maintenance_run_opts *opts)
 
 static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
 {
-	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_commit_graph)
 		return 0;
 
@@ -1052,7 +1050,6 @@ static int incremental_repack_auto_condition(void)
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
@@ -1167,7 +1164,6 @@ static int multi_pack_index_repack(struct maintenance_run_opts *opts)
 
 static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts)
 {
-	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_multi_pack_index) {
 		warning(_("skipping incremental-repack task because core.multiPackIndex is disabled"));
 		return 0;
@@ -2492,6 +2488,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 	    (argc == 2 && !strcmp(argv[1], "-h")))
 		usage(builtin_maintenance_usage);
 
+	prepare_repo_settings(the_repository);
 	if (!strcmp(argv[1], "run"))
 		return maintenance_run(argc - 1, argv + 1, prefix);
 	if (!strcmp(argv[1], "start"))
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 4e4450af1ef..98123191e1e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -368,7 +368,7 @@ test_expect_success 'check that gc computes commit-graph' '
 	git commit-graph write --reachable &&
 	cp $objdir/info/commit-graph commit-graph-before-gc &&
 	git reset --hard HEAD~1 &&
-	git config gc.writeCommitGraph true &&
+	# BUG: Due to 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13)?
 	git gc &&
 	cp $objdir/info/commit-graph commit-graph-after-gc &&
 	! test_cmp_bin commit-graph-before-gc commit-graph-after-gc &&
-- 
2.33.0.1441.gbbcdb4c3c66






^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-05 11:57     ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
@ 2021-10-05 17:43       ` Derrick Stolee
  2021-10-05 19:10         ` Ævar Arnfjörð Bjarmason
  2021-10-05 22:25       ` Glen Choo
  1 sibling, 1 reply; 65+ messages in thread
From: Derrick Stolee @ 2021-10-05 17:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Glen Choo
  Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee

On 10/5/2021 7:57 AM, Ævar Arnfjörð Bjarmason wrote:

> Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier
> 
> Consolidate the various calls to prepare_repo_settings() to happen at
> the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
> intentionally breaks things, we seem to be lacking test coverage for
> cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
> commit-graph by default, 2019-08-13)?
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/gc.c            | 5 +----
>  t/t5318-commit-graph.sh | 2 +-
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 26709311601..f59dbedc1fe 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  		clean_pack_garbage();
>  	}
>  
> -	prepare_repo_settings(the_repository);
>  	if (the_repository->settings.gc_write_commit_graph == 1)

I think that removing these calls is dangerous. prepare_repo_settings()
already returns immediately if the repository already has its settings
populated. The pattern of "call prepare before using a setting" is a
safe way to future-proof the check from a movement of the call.

Putting that potential-future-problem aside, I don't see how this
change is a _benefit_ other than fewer lines of code, which is not a
quality measure in itself.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-05 17:43       ` Derrick Stolee
@ 2021-10-05 19:10         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-05 19:10 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Derrick Stolee


On Tue, Oct 05 2021, Derrick Stolee wrote:

> On 10/5/2021 7:57 AM, Ævar Arnfjörð Bjarmason wrote:
>
>> Subject: [PATCH] WIP gc/maintenance: just call prepare_repo_settings() earlier
>> 
>> Consolidate the various calls to prepare_repo_settings() to happen at
>> the start of cmd_maintenance() (TODO: and cmd_gc()). This WIP commit
>> intentionally breaks things, we seem to be lacking test coverage for
>> cmd_gc(), perhaps since 31b1de6a09b (commit-graph: turn on
>> commit-graph by default, 2019-08-13)?
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/gc.c            | 5 +----
>>  t/t5318-commit-graph.sh | 2 +-
>>  2 files changed, 2 insertions(+), 5 deletions(-)
>> 
>> diff --git a/builtin/gc.c b/builtin/gc.c
>> index 26709311601..f59dbedc1fe 100644
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -695,7 +695,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>>  		clean_pack_garbage();
>>  	}
>>  
>> -	prepare_repo_settings(the_repository);
>>  	if (the_repository->settings.gc_write_commit_graph == 1)
>
> I think that removing these calls is dangerous. prepare_repo_settings()
> already returns immediately if the repository already has its settings
> populated. The pattern of "call prepare before using a setting" is a
> safe way to future-proof the check from a movement of the call.
>
> Putting that potential-future-problem aside, I don't see how this
> change is a _benefit_ other than fewer lines of code, which is not a
> quality measure in itself.

Very dangerous, yes :) As I noted this hunk intentionaly breaks the "git
gc" command. Search for "I wondered if we could just call
prepare_repo_settings[...]" upthread.

I.e. I was commenting on how Glen's patch makes an attempt to test this
tri-state config for some cases, but not others, and that in this case
we can break "git gc" without any existing test catching it.

Which as noted might have started happening when you flipped the
core.commitGraph default, but I didn't test or confirm that, it just
seemed the most likely place to start digging from a quick "git log
-G...".

Anyway, as also noted I think this series is fine as-is, just some
feedback in case Glen or anyone would be interested in either doing a v4
or a follow-up, i.e. when I tested this I found that some code/tests in
this area were (probably) either fragile or already broken.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-05 11:57     ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
  2021-10-05 17:43       ` Derrick Stolee
@ 2021-10-05 22:25       ` Glen Choo
  1 sibling, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-05 22:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I've looked this over carefully, it LGTM. I noticed some bugs in this
> area, but they pre-date your commits. I've pushed a
> "gc/use-repo-settings" to https://github.com/avar/git.git that you can
> check out, consider that branch partially as commentary, but feel free
> to pick/squash anything you find there.

Your review is very thoughtful, thank you :)

>  * This is somewhat of a case of throwing rocks from a glass house, but
>    I think the commit messages could really be shorter/get to the
>    point. I.e. say right away what's being changed and why, instead of
>    giving the reader an overview of the whole control flow.
>
>    I just adjusted adusted 1/3 in that direction below, I think 2/3
>    could do with being squashed into it & adjusted, it's really the
>    exact same bug/improvement, just with a different config key that
>    behaves the same.

In hindsight, I think this is how I should have initially structured my
commits because this improves the signal-noise ratio. I'll keep it in
mind in the future for sure. I'm hesitant to send a v4, but I can be
persuaded to do so :)

>  * The "test_unconfig" in your 2/3 is redundant, so is the "rm -rf" of
>    directories that can't exist by that point in 3/3.

These are redundant, but they serve a defensive purpose. By being extra
defensive, I was hoping to make it clear to the reader that the test is
*guaranteed* not to conflict with/rely on previous state and reduce the
number of greps needed.

>  * You use corrupting the graph as a shorthand for "did we run this
>    command?". See test_region for an approach that might be better,
>    i.e. just log with trace2 and grep that to see if we started the
>    relevant command. The test_region helper can't be used as-is for that
>    (you could manually grep the emitted PERF/EVENT output), but perhaps
>    that's easier/more reliable?

My intention here was not to assert "did git fsck call the function I
expected?", but something more along the lines of "did git fsck catch the
corrupted commit-graph?". Calling the function seems like an
implementation detail, but the actual desired behavior is that git fsck
catches the breakage.

Perhaps the test title should be reflect what I meant, like

  - git fsck (checks commit-graph when config set to true)
  + git fsck detects corrupted commit-graph

>  * This is again, something that pre-dates your patches, but I think
>    following the "is enabled?" behavior here for these variables in "git
>    fsck" is highly questionable. So just some thoughts after having
>    looked at that:

I think it depends on how we think about core.commitGraph (and others).
Do we want Git to be able to pretend that commit-graph doesn't exist at
all? If so, then we should skip the check in git fsck. Alternatively, is
commit-graph actually part of Git's core offering? If so,
core.commitGraph is more like a way to opt *out* of some commit-graph
behavior, but fsck should still verify our core data structures. Given
that we default to true, it sounds like more of the latter, but I'm not
sure what use case this serves.

>    Just because your config doen't say "use the graph" doesn't mean
>    another process won't do so, and if the graph is broken they might
>    probably encounter an error.

While that's true, I think that the user assumes a certain degree of
risk if they are using different commands with different config values.
I don't think it's unreasonable to say that "git -c
core.commitGraph=false fsck" won't catch issues that will arise in "git
-c core.commitGraph=true foo".

>    But in general for these auxiliary files (commit-graph, midix, *.rev
>    etc.) I think there's a good argument to be made that fsck should
>    *always* check them, and at most use the config to say something
>    like:
>
>        Hey, you've got a 100% broken commit-graph/midx/rev in your .git,
>        looks like the core.XYZ enabling it is off *right now* though, so
>        maybe we won't use it.  I hope you're feeling lucky...

I think "always check" sounds like the best way to maximize visibility
for users, though I don't think core.commitGraph should control the
severity of the warning/error. If we pursue "always check", I think we
could adopt fsck.<msg-id> instead.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
                       ` (3 preceding siblings ...)
  2021-10-05 11:57     ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
@ 2021-10-09  7:24     ` Junio C Hamano
  2021-10-11 19:58       ` Glen Choo
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
  5 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-09  7:24 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau

Glen Choo <chooglen@google.com> writes:

> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.

With this merged to 'seen', the CI job with the extra set of
GIT_TEST_X settings fail.  When this topic is excluded, with
all the other topics in flight in 'seen', everything seems to
be OK.

For which GIT_TEST_X environment variables to set and export while
testing to trigger the problem, see [*1*]

For a successful test run of 'seen' without this topic, see [*2*]

For the test log of the failing run with this topic, see [*3*];
you'd need to be logged into GitHub to see the details of the errors
(e.g. click on "regular (linux-gcc...)" with red X sign on the left
hand side, then open "Run ci/print-test-failures.sh" and look for
"not ok").


[References]

*1* https://github.com/git/git/runs/3843549095?check_suite_focus=true#step:4:1677

    export GIT_TEST_SPLIT_INDEX=yes
    export GIT_TEST_MERGE_ALGORITHM=recursive
    export GIT_TEST_FULL_IN_PACK_ARRAY=true
    export GIT_TEST_OE_SIZE=10
    export GIT_TEST_OE_DELTA_SIZE=5
    export GIT_TEST_COMMIT_GRAPH=1
    export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
    export GIT_TEST_MULTI_PACK_INDEX=1
    export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1
    export GIT_TEST_ADD_I_USE_BUILTIN=1
    export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
    export GIT_TEST_WRITE_REV_INDEX=1
    export GIT_TEST_CHECKOUT_WORKERS=2

*2* https://github.com/git/git/actions/runs/1322907901 (feff65d)

    A successful CI run of 'seen' without gc/use-repo-settings.

*3* https://github.com/git/git/actions/runs/1322842689 (54a31af)

    CI run of 'seen' with gc/use-repo-settings that fails.

    The commits that is in the failing 'seen' but not in the
    succeeding tree are those from this topic, as can be seen here:

    $ git shortlog --no-merges 54a31af ^feff65d
    Glen Choo (3):
          fsck: verify commit graph when implicitly enabled
          fsck: verify multi-pack-index when implictly enabled
          gc: perform incremental repack when implictly enabled


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-09  7:24     ` Junio C Hamano
@ 2021-10-11 19:58       ` Glen Choo
  2021-10-11 20:08         ` Junio C Hamano
  0 siblings, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-11 19:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Eric Sunshine, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> With this merged to 'seen', the CI job with the extra set of
> GIT_TEST_X settings fail.  When this topic is excluded, with
> all the other topics in flight in 'seen', everything seems to
> be OK.

Thanks for the feedback, I did two CI runs on my fork in response:

* [1] uses the same tree as seen. Looks like osx-clang passes but
  linux-gcc fails.
* [2] is rebased onto master. Again, osx-clang passes but linux-gcc
  fails.

This should be enough for me to hunt down the issue, though I would
appreciate any hints if anyone has any :)

[1] https://github.com/chooglen/git/runs/3862097340?check_suite_focus=true
[2] https://github.com/chooglen/git/runs/3862670035?check_suite_focus=true

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-11 19:58       ` Glen Choo
@ 2021-10-11 20:08         ` Junio C Hamano
  2021-10-11 20:48           ` Glen Choo
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-11 20:08 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau

Glen Choo <chooglen@google.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> With this merged to 'seen', the CI job with the extra set of
>> GIT_TEST_X settings fail.  When this topic is excluded, with
>> all the other topics in flight in 'seen', everything seems to
>> be OK.
>
> Thanks for the feedback, I did two CI runs on my fork in response:
>
> * [1] uses the same tree as seen. Looks like osx-clang passes but
>   linux-gcc fails.
> * [2] is rebased onto master. Again, osx-clang passes but linux-gcc
>   fails.
>
> This should be enough for me to hunt down the issue, though I would
> appreciate any hints if anyone has any :)

Using the set of GIT_TEST_FOO... settings I cited in the
Reference#1, in the message you are responding to, will let you
reproduce the issue locally on glinux boxes.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v3 0/3] Use default values from settings instead of config
  2021-10-11 20:08         ` Junio C Hamano
@ 2021-10-11 20:48           ` Glen Choo
  0 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-11 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Eric Sunshine, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> Using the set of GIT_TEST_FOO... settings I cited in the
> Reference#1, in the message you are responding to, will let you
> reproduce the issue locally on glinux boxes.

Ah, I didn't notice that ci/run-build-and-tests.sh uses different
settings for different environments. With those settings, I can
reproduce the failure on my OS X machine.

Thanks so much!

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-05  0:19   ` [PATCH v3 " Glen Choo
                       ` (4 preceding siblings ...)
  2021-10-09  7:24     ` Junio C Hamano
@ 2021-10-12 17:42     ` Glen Choo
  2021-10-12 17:42       ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
                         ` (5 more replies)
  5 siblings, 6 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau,
	Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo

Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.

Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.

This re-roll is primarily motivated by the CI failures noted by Junio in [1].
The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
job) causes commands like "git commit" to write commit-graphs, but certain tests
like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
fixes a bug where fsck did not check commit-graphs by default, which means that
these tests will now fail because they have invalid commit-graphs. The easiest
solution I found is to disable this confusing and noisy behavior with
GIT_TEST_COMMIT_GRAPH=0.

And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
separate.

This series has also seen a healthy amount of interest in test style and
coverage. We haven't converged on a path for the future, but I'd like to think
that the tests here are still a step in (approximately) the right direction.
Hopefully this version LGT us while we figure out what to do with tests :)

[1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/

Changes since v3
* Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
  way that is incompatible with commit-graphs.
* Make patch 1 and 2's commit messages more concise (thanks Ævar!).

Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
  test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
  copy-pasted code

Changes since v1:
* clean up typo in patch 1 commit message 
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
  does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 +++--
 builtin/gc.c                |  5 ++---
 t/t0410-partial-clone.sh    |  6 +++++-
 t/t3800-mktag.sh            |  5 +++++
 t/t5318-commit-graph.sh     | 23 ++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh |  5 ++++-
 t/t7900-maintenance.sh      | 28 ++++++++++++++++++++++++----
 7 files changed, 65 insertions(+), 12 deletions(-)

Range-diff against v3:
1:  2f9ff949e6 ! 1:  aac1253e7b fsck: verify commit graph when implicitly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify commit graph when implicitly enabled
     
    -    the_repository->settings is the preferred way to get certain
    -    settings (such as "core.commitGraph") because it gets default values
    -    from prepare_repo_settings(). However, cmd_fsck() reads the config
    -    directly via git_config_get_bool(), which bypasses these default values.
    -    This causes fsck to ignore the commit graph if "core.commitgraph" is not
    -    explicitly set in the config. This worked fine until commit-graph was
    -    enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
    -    default, 2019-08-13).
    +    Change fsck to check the "core_commit_graph" variable set in
    +    "repo-settings.c" instead of reading the "core.commitGraph" variable.
    +    This fixes a bug where we wouldn't verify the commit-graph if the
    +    config key was missing. This bug was introduced in
    +    31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
    +    where core.commitGraph was turned on by default.
     
    -    Replace git_config_get_bool("core.commitgraph") in fsck with the
    -    equivalent call to the_repository->settings.core_commit_graph.
    -
    -    The expected behavior is that fsck respects the config value when it is
    -    set, but uses the default value when it is unset. For example, for
    -    core.commitGraph, there are three cases:
    -
    -    - Config value is set to true -> enabled
    -    - Config value is set to false -> disabled
    -    - Config value is unset -> enabled
    -
    -    As such, tests cover all three cases.
    +    Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
    +    commit-graph as expected for the 3 values of core.commitGraph. Also,
    +    disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
    +    assume that commit-graph checking is disabled (t/t3800-mktag.sh,
    +    t/t0410-partial-clone.sh).
     
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Glen Choo <chooglen@google.com>
     
      ## builtin/fsck.c ##
    @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
      		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
      
     
    + ## t/t0410-partial-clone.sh ##
    +@@ t/t0410-partial-clone.sh: test_description='partial clone'
    + 
    + # missing promisor objects cause repacks which write bitmaps to fail
    + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
    ++# When enabled, some commands will write commit-graphs. This causes fsck
    ++# to fail when delete_object() is called because fsck will attempt to
    ++# verify the out-of-sync commit graph.
    ++GIT_TEST_COMMIT_GRAPH=0
    + 
    + delete_object () {
    + 	rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
    +@@ t/t0410-partial-clone.sh: test_expect_success 'rev-list stops traversal at missing and promised commit' '
    + 
    + 	git -C repo config core.repositoryformatversion 1 &&
    + 	git -C repo config extensions.partialclone "arbitrary string" &&
    +-	GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
    ++	git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
    + 	grep $(git -C repo rev-parse bar) out &&
    + 	! grep $FOO out
    + '
    +
    + ## t/t3800-mktag.sh ##
    +@@ t/t3800-mktag.sh: test_description='git mktag: tag object verify test'
    + 
    + . ./test-lib.sh
    + 
    ++# When enabled, some commands will automatically write commit-graphs.
    ++# This will cause the mktag tests to fail because fsck will attempt to
    ++# verify the out-of-sync commit graph.
    ++GIT_TEST_COMMIT_GRAPH=0
    ++
    + ###########################################################
    + # check the tag.sig file, expecting verify_tag() to fail,
    + # and checking that the error message matches the pattern
    +
      ## t/t5318-commit-graph.sh ##
     @@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
      		$GRAPH_CHUNK_LOOKUP_OFFSET
2:  b13ca2a695 ! 2:  ed64983430 fsck: verify multi-pack-index when implictly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify multi-pack-index when implictly enabled
     
    -    Like the previous commit, the_repository->settings.core_multi_pack_index
    -    is preferable to reading "core.multiPackIndex" from the config because
    -    prepare_repo_settings() sets a default value for
    -    the_repository->settings. This worked fine until core.multiPackIndex was
    -    enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
    -    default, 2020-09-25).
    -
    -    Replace git_config_get_bool("core.multiPackIndex") in fsck with the
    -    equivalent call to the_repository->settings.multi_pack_index.
    +    Like the previous commit, change fsck to check the
    +    "core_multi_pack_index" variable set in "repo-settings.c" instead of
    +    reading the "core.multiPackIndex" config variable. This fixes a bug
    +    where we wouldn't verify midx if the config key was missing. This bug
    +    was introduced in 18e449f86b (midx: enable core.multiPackIndex by
    +    default, 2020-09-25) where core.multiPackIndex was turned on by default.
     
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Glen Choo <chooglen@google.com>
     
      ## builtin/fsck.c ##
3:  76943aff80 = 3:  821b492d8b gc: perform incremental repack when implictly enabled
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
@ 2021-10-12 17:42       ` Glen Choo
  2021-10-12 17:42       ` [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau,
	Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo

Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.

Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
assume that commit-graph checking is disabled (t/t3800-mktag.sh,
t/t0410-partial-clone.sh).

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c           |  3 ++-
 t/t0410-partial-clone.sh |  6 +++++-
 t/t3800-mktag.sh         |  5 +++++
 t/t5318-commit-graph.sh  | 23 ++++++++++++++++++++++-
 4 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		fsck_enable_object_names(&fsck_walk_options);
 
 	git_config(git_fsck_config, &fsck_obj_options);
+	prepare_repo_settings(the_repository);
 
 	if (connectivity_only) {
 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	check_connectivity();
 
-	if (!git_config_get_bool("core.commitgraph", &i) && i) {
+	if (the_repository->settings.core_commit_graph) {
 		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
 		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bba679685f..c76485b1b6 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -6,6 +6,10 @@ test_description='partial clone'
 
 # missing promisor objects cause repacks which write bitmaps to fail
 GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+# When enabled, some commands will write commit-graphs. This causes fsck
+# to fail when delete_object() is called because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
 
 delete_object () {
 	rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
@@ -322,7 +326,7 @@ test_expect_success 'rev-list stops traversal at missing and promised commit' '
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
+	git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
 	grep $(git -C repo rev-parse bar) out &&
 	! grep $FOO out
 '
diff --git a/t/t3800-mktag.sh b/t/t3800-mktag.sh
index 0544d58a6e..27fb15c83b 100755
--- a/t/t3800-mktag.sh
+++ b/t/t3800-mktag.sh
@@ -6,6 +6,11 @@ test_description='git mktag: tag object verify test'
 
 . ./test-lib.sh
 
+# When enabled, some commands will automatically write commit-graphs.
+# This will cause the mktag tests to fail because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
+
 ###########################################################
 # check the tag.sig file, expecting verify_tag() to fail,
 # and checking that the error message matches the pattern
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 295c5bd94d..4e4450af1e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -693,12 +693,33 @@ test_expect_success 'detect incorrect chunk count' '
 		$GRAPH_CHUNK_LOOKUP_OFFSET
 '
 
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	test_when_finished "git config core.commitGraph true" &&
+
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	test_unconfig core.commitGraph &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
 	test_must_fail git fsck
 '
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
  2021-10-12 17:42       ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-12 17:42       ` Glen Choo
  2021-10-12 17:42       ` [PATCH v4 3/3] gc: perform incremental repack " Glen Choo
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau,
	Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo

Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c              | 2 +-
 t/t5319-multi-pack-index.sh | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (!git_config_get_bool("core.multipackindex", &i) && i) {
+	if (the_repository->settings.core_multi_pack_index) {
 		struct child_process midx_verify = CHILD_PROCESS_INIT;
 		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index bd17f308b3..38999f115f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -452,7 +452,10 @@ test_expect_success 'verify incorrect offset' '
 test_expect_success 'git-fsck incorrect offset' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" \
-		"git -c core.multipackindex=true fsck"
+		"git -c core.multiPackIndex=true fsck" &&
+	test_unconfig core.multiPackIndex &&
+	test_must_fail git fsck &&
+	git -c core.multiPackIndex=false fsck
 '
 
 test_expect_success 'corrupt MIDX is not reused' '
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v4 3/3] gc: perform incremental repack when implictly enabled
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
  2021-10-12 17:42       ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
  2021-10-12 17:42       ` [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-12 17:42       ` Glen Choo
  2021-10-12 20:23       ` [PATCH v4 0/3] Use default values from settings instead of config Junio C Hamano
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-12 17:42 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau,
	Ævar Arnfjörð Bjarmason, Junio C Hamano, Glen Choo

builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.

In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.

Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/gc.c           |  5 ++---
 t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 6b3de3dd51..2670931160 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1049,12 +1049,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
 static int incremental_repack_auto_condition(void)
 {
 	struct packed_git *p;
-	int enabled;
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	if (git_config_get_bool("core.multiPackIndex", &enabled) ||
-	    !enabled)
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
 	git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 9b9f11a8e7..74aa638475 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -336,15 +336,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
 		 --no-progress --batch-size=2147483647 <run-2g.txt
 '
 
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+	test_commit A &&
 	git repack -adk &&
-	git config core.multiPackIndex true &&
 	git multi-pack-index write &&
 	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
 		-c maintenance.incremental-repack.auto=1 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
-	test_commit A &&
+	test_commit B &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -353,7 +353,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
-	test_commit B &&
+	test_commit C &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -362,6 +362,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+	rm -rf incremental-repack-true &&
+	git init incremental-repack-true &&
+	(
+		cd incremental-repack-true &&
+		git config core.multiPackIndex true &&
+		run_incremental_repack_and_verify
+	)
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+	rm -rf incremental-repack-unset &&
+	git init incremental-repack-unset &&
+	(
+		cd incremental-repack-unset &&
+		test_unconfig core.multiPackIndex &&
+		run_incremental_repack_and_verify
+	)
 '
 
 test_expect_success 'pack-refs task' '
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
                         ` (2 preceding siblings ...)
  2021-10-12 17:42       ` [PATCH v4 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-12 20:23       ` Junio C Hamano
  2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
  5 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-12 20:23 UTC (permalink / raw)
  To: Glen Choo
  Cc: git, Eric Sunshine, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Glen Choo <chooglen@google.com> writes:

> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.

OK.  Will replace.  Thanks.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
                         ` (3 preceding siblings ...)
  2021-10-12 20:23       ` [PATCH v4 0/3] Use default values from settings instead of config Junio C Hamano
@ 2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
  2021-10-12 22:29         ` Glen Choo
  2021-10-13 13:12         ` Derrick Stolee
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
  5 siblings, 2 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-12 20:34 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee


On Tue, Oct 12 2021, Glen Choo wrote:

> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
>
> Instead of using the config, let's use repository settings where
> available. Replace core.commitGraph and core.multiPackIndex with their
> equivalent repository settings in fsck and gc.
>
> This re-roll is primarily motivated by the CI failures noted by Junio in [1].
> The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
> job) causes commands like "git commit" to write commit-graphs, but certain tests
> like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
> corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
> fixes a bug where fsck did not check commit-graphs by default, which means that
> these tests will now fail because they have invalid commit-graphs. The easiest
> solution I found is to disable this confusing and noisy behavior with
> GIT_TEST_COMMIT_GRAPH=0.

That's easy, but...

> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.
>
> This series has also seen a healthy amount of interest in test style and
> coverage. We haven't converged on a path for the future, but I'd like to think
> that the tests here are still a step in (approximately) the right direction.
> Hopefully this version LGT us while we figure out what to do with tests :)
>
> [1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/
>
> Changes since v3
> * Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
>   way that is incompatible with commit-graphs.
> * Make patch 1 and 2's commit messages more concise (thanks Ævar!).

...how isn't disabling those t3800-mktag.sh tests just plasting over
corruption that we're noticing because of your changes to (rightly) fix
the bug where "fsck" wasn't checking the graph at all?

IOW haven't we just found exactly the sort of bug that
"GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
of fixing it we're hiding it?

If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
I see that we fail N number of tests, but all of them are actually
fallout of just this test:

        git replace $head_parent $head && 
        git replace -f $tree $blob 

I.e. we've created a replacement object replacing a tree with a blob, as
part of tests I added to test how mktag handles those sorts of weird
edge cases.

This then causes the graph verify code to throw a hissy fit with:

    root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
    commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
    0fbca9850869684085d654f9e1380c9780802570

I.e. when we wrote the graph we somehow didn't notice that the root tree
node we wrote is to an object that's not actually a tree? Isn't this a
bug where some part of the commit graph writing should be doing its own
extended OID lookup that's replacement-object aware, it didn't, and we
wrote a corrupt graph as a result?

If there is a legitimate reason why we're not just hiding a bug we've
turned up with these fixes let's disable that one test, not the entire
test file.

If you don't run the one test that fails (which is split up into 3
individual pieces) there's still 143 other tests that are run, all of
those presumably benefit from finding future bugs with
GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
just have turned up one just now...

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
@ 2021-10-12 22:29         ` Glen Choo
  2021-10-14 15:53           ` Ævar Arnfjörð Bjarmason
  2021-10-13 13:12         ` Derrick Stolee
  1 sibling, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-12 22:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> ...how isn't disabling those t3800-mktag.sh tests just plasting over
> corruption that we're noticing because of your changes to (rightly) fix
> the bug where "fsck" wasn't checking the graph at all?
>
> IOW haven't we just found exactly the sort of bug that
> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
> of fixing it we're hiding it?
>
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
>
>         git replace $head_parent $head && 
>         git replace -f $tree $blob 
>
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
>
> This then causes the graph verify code to throw a hissy fit with:
>
>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>     0fbca9850869684085d654f9e1380c9780802570
>
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
>
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.
>
> If you don't run the one test that fails (which is split up into 3
> individual pieces) there's still 143 other tests that are run, all of
> those presumably benefit from finding future bugs with
> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
> just have turned up one just now...

I think this falls on my shoulders. I assumed that the failures were
expected behavior, not bugs. You are right that we shouldn't be
plastering over bugs.

I'll have to ask for help here because I don't know enough about mktag
to distinguish between 'expected' and 'unexpected' failures. The best I
can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
tests. But if that's good enough for now, I'll just do that :)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
  2021-10-12 22:29         ` Glen Choo
@ 2021-10-13 13:12         ` Derrick Stolee
  2021-10-13 15:57           ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 65+ messages in thread
From: Derrick Stolee @ 2021-10-13 13:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Glen Choo
  Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee

On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
> 
>         git replace $head_parent $head && 
>         git replace -f $tree $blob 
> 
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
> 
> This then causes the graph verify code to throw a hissy fit with:
> 
>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>     0fbca9850869684085d654f9e1380c9780802570
> 
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
> 
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.

The commit-graph should be disabled if replace-objects are enabled. If
there is a bug being introduced here it is because the commit-graph is
being checked during fsck even though it would never be read when the
replace-objects exist.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-13 13:12         ` Derrick Stolee
@ 2021-10-13 15:57           ` Ævar Arnfjörð Bjarmason
  2021-10-14 16:53             ` Derrick Stolee
  0 siblings, 1 reply; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 15:57 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
	Derrick Stolee


On Wed, Oct 13 2021, Derrick Stolee wrote:

> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>> 
>>         git replace $head_parent $head && 
>>         git replace -f $tree $blob 
>> 
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>> 
>> This then causes the graph verify code to throw a hissy fit with:
>> 
>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>     0fbca9850869684085d654f9e1380c9780802570
>> 
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>> 
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>
> The commit-graph should be disabled if replace-objects are enabled. If
> there is a bug being introduced here it is because the commit-graph is
> being checked during fsck even though it would never be read when the
> replace-objects exist.
>
> Thanks,
> -Stolee

Thanks, isn't the obvious fix for this to extend your d6538246d3d
(commit-graph: not compatible with replace objects, 2018-08-20) to do
"read_replace_refs = 0;" in graph_verify()? That works for me on this
case.

I.e. if we ignore replace refs when writing the graph, and we don't use
it if there are any due to commit_graph_compatible(), why would we
consider them when verifying it?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-12 22:29         ` Glen Choo
@ 2021-10-14 15:53           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-14 15:53 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee


On Tue, Oct 12 2021, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> ...how isn't disabling those t3800-mktag.sh tests just plasting over
>> corruption that we're noticing because of your changes to (rightly) fix
>> the bug where "fsck" wasn't checking the graph at all?
>>
>> IOW haven't we just found exactly the sort of bug that
>> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
>> of fixing it we're hiding it?
>>
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>>
>>         git replace $head_parent $head && 
>>         git replace -f $tree $blob 
>>
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>>
>> This then causes the graph verify code to throw a hissy fit with:
>>
>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>     0fbca9850869684085d654f9e1380c9780802570
>>
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>>
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>>
>> If you don't run the one test that fails (which is split up into 3
>> individual pieces) there's still 143 other tests that are run, all of
>> those presumably benefit from finding future bugs with
>> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
>> just have turned up one just now...
>
> I think this falls on my shoulders. I assumed that the failures were
> expected behavior, not bugs. You are right that we shouldn't be
> plastering over bugs.
>
> I'll have to ask for help here because I don't know enough about mktag
> to distinguish between 'expected' and 'unexpected' failures. The best I
> can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
> tests. But if that's good enough for now, I'll just do that :)

I don't think your series needs to be tasked with fixing a generic issue
in the commit-graph you stumbled across.

So for your series I'd think the only required fix would be to narrowlry
skip *just* the broken tests, not the entire test file.

But in this case (see
http://lore.kernel.org/git/87pms8hq4s.fsf@evledraar.gmail.com) the fix
seems rather trivial to me.

I.e. just adding one line to builtin/commit-graph.c, so perhaps you can
see if that works for you and such a "this bug was missed because this
mode didn't work" fix could be part of a re-roll of yours?


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-13 15:57           ` Ævar Arnfjörð Bjarmason
@ 2021-10-14 16:53             ` Derrick Stolee
  2021-10-14 22:21               ` Glen Choo
  2021-10-14 22:25               ` Junio C Hamano
  0 siblings, 2 replies; 65+ messages in thread
From: Derrick Stolee @ 2021-10-14 16:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Glen Choo, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
	Derrick Stolee

On 10/13/2021 11:57 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Oct 13 2021, Derrick Stolee wrote:
> 
>> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>>> I see that we fail N number of tests, but all of them are actually
>>> fallout of just this test:
>>>
>>>         git replace $head_parent $head && 
>>>         git replace -f $tree $blob 
>>>
>>> I.e. we've created a replacement object replacing a tree with a blob, as
>>> part of tests I added to test how mktag handles those sorts of weird
>>> edge cases.
>>>
>>> This then causes the graph verify code to throw a hissy fit with:
>>>
>>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>>     0fbca9850869684085d654f9e1380c9780802570
>>>
>>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>>> node we wrote is to an object that's not actually a tree? Isn't this a
>>> bug where some part of the commit graph writing should be doing its own
>>> extended OID lookup that's replacement-object aware, it didn't, and we
>>> wrote a corrupt graph as a result?
>>>
>>> If there is a legitimate reason why we're not just hiding a bug we've
>>> turned up with these fixes let's disable that one test, not the entire
>>> test file.
>>
>> The commit-graph should be disabled if replace-objects are enabled. If
>> there is a bug being introduced here it is because the commit-graph is
>> being checked during fsck even though it would never be read when the
>> replace-objects exist.
>>
>> Thanks,
>> -Stolee
> 
> Thanks, isn't the obvious fix for this to extend your d6538246d3d
> (commit-graph: not compatible with replace objects, 2018-08-20) to do
> "read_replace_refs = 0;" in graph_verify()? That works for me on this
> case.

Ignoring the replace refs while verifying will allow you to verify the
on-disk commit-graph file without issue.
 
> I.e. if we ignore replace refs when writing the graph, and we don't use
> it if there are any due to commit_graph_compatible(), why would we
> consider them when verifying it?

I generally consider any interaction with the commit-graph while
replace refs are enabled to be the bug. We don't read the commit-graph
when replace refs are on, so why are we verifying it?

But, I understand the desire to check the on-disk content even though
it is not being used, so disabling replace refs to do the verify should
be sufficient.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-14 16:53             ` Derrick Stolee
@ 2021-10-14 22:21               ` Glen Choo
  2021-10-14 23:38                 ` Ævar Arnfjörð Bjarmason
  2021-10-14 22:25               ` Junio C Hamano
  1 sibling, 1 reply; 65+ messages in thread
From: Glen Choo @ 2021-10-14 22:21 UTC (permalink / raw)
  To: Derrick Stolee, Ævar Arnfjörð Bjarmason
  Cc: git, Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee

Derrick Stolee <stolee@gmail.com> writes:

>>> The commit-graph should be disabled if replace-objects are enabled. If
>>> there is a bug being introduced here it is because the commit-graph is
>>> being checked during fsck even though it would never be read when the
>>> replace-objects exist.
>>>
>>> Thanks,
>>> -Stolee
>> 
>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>> case.
>
> Ignoring the replace refs while verifying will allow you to verify the
> on-disk commit-graph file without issue.

It seems like we've converged on doing read_replace_refs = 0 \o/

If we are going to do this twice in graph_verify() and graph_write(), is
there any reason why I shouldn't just do "read_replace_refs = 0" once in
cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
should just ignore replace refs because they're incompatible.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-14 16:53             ` Derrick Stolee
  2021-10-14 22:21               ` Glen Choo
@ 2021-10-14 22:25               ` Junio C Hamano
  2021-10-15 15:57                 ` Junio C Hamano
  1 sibling, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2021-10-14 22:25 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Ævar Arnfjörð Bjarmason, Glen Choo, git,
	Eric Sunshine, Taylor Blau, Derrick Stolee

Derrick Stolee <stolee@gmail.com> writes:

> I generally consider any interaction with the commit-graph while
> replace refs are enabled to be the bug. We don't read the commit-graph
> when replace refs are on, so why are we verifying it?
>
> But, I understand the desire to check the on-disk content even though
> it is not being used, so disabling replace refs to do the verify should
> be sufficient.

That's quite a sensible argument.  Sounds like we have a way to
solve this that everybody agrees with.

Thanks.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-14 22:21               ` Glen Choo
@ 2021-10-14 23:38                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 65+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-14 23:38 UTC (permalink / raw)
  To: Glen Choo
  Cc: Derrick Stolee, git, Eric Sunshine, Taylor Blau, Junio C Hamano,
	Derrick Stolee


On Thu, Oct 14 2021, Glen Choo wrote:

> Derrick Stolee <stolee@gmail.com> writes:
>
>>>> The commit-graph should be disabled if replace-objects are enabled. If
>>>> there is a bug being introduced here it is because the commit-graph is
>>>> being checked during fsck even though it would never be read when the
>>>> replace-objects exist.
>>>>
>>>> Thanks,
>>>> -Stolee
>>> 
>>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>>> case.
>>
>> Ignoring the replace refs while verifying will allow you to verify the
>> on-disk commit-graph file without issue.
>
> It seems like we've converged on doing read_replace_refs = 0 \o/
>
> If we are going to do this twice in graph_verify() and graph_write(), is
> there any reason why I shouldn't just do "read_replace_refs = 0" once in
> cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
> should just ignore replace refs because they're incompatible.

No reason, I think that's the best way to do this.

I've submitted a series to fix that verify issue, as noted in the CL
these patches on top of it without that disabling of the mktag tests
will pass with GIT_TEST_COMMIT_GRAPH=true):
https://lore.kernel.org/git/cover-0.3-00000000000-20211014T233343Z-avarab@gmail.com

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH v4 0/3] Use default values from settings instead of config
  2021-10-14 22:25               ` Junio C Hamano
@ 2021-10-15 15:57                 ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-15 15:57 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Ævar Arnfjörð Bjarmason, Glen Choo, git,
	Eric Sunshine, Taylor Blau, Derrick Stolee

Junio C Hamano <gitster@pobox.com> writes:

> Derrick Stolee <stolee@gmail.com> writes:
>
>> I generally consider any interaction with the commit-graph while
>> replace refs are enabled to be the bug. We don't read the commit-graph
>> when replace refs are on, so why are we verifying it?
>>
>> But, I understand the desire to check the on-disk content even though
>> it is not being used, so disabling replace refs to do the verify should
>> be sufficient.
>
> That's quite a sensible argument.  Sounds like we have a way to
> solve this that everybody agrees with.

Would it be sensible to silently ignore, though?  Would a warning,
when we see replacement is in use and an explicit request by an end
user is made to make use of or update the commit-graph, be too loud?

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v5 0/3] Use default values from settings instead of config
  2021-10-12 17:42     ` [PATCH v4 " Glen Choo
                         ` (4 preceding siblings ...)
  2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
@ 2021-10-15 20:16       ` Glen Choo
  2021-10-15 20:16         ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
                           ` (3 more replies)
  5 siblings, 4 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Glen Choo

  Hi everyone! This patch was created in response to something we observed in
  Google, where fsck failed to detect that the commit graph was invalid. We
  initially assumed that fsck never checked the commit graph, but it turns out
  that it does so only when core.commitgraph is set, even though we set defaults
  for "whether to use the commit graph" in the repository settings.

  Instead of using the config, let's use repository settings where
  available. Replace core.commitGraph and core.multiPackIndex with their
  equivalent repository settings in fsck and gc.

v5 is based off Ævar's fix (commit-graph: make "verify" work with replace refs,
[1]). As Ævar noted in v4, the mktag test failures indicated buggy behavior
"git commit-graph verify" when replace refs are enabled, so we should fix the
bug instead of just disabling commit-graph.

Ævar, Derrick, thanks for the productive discussion, I feel like I understand
this area a little better now.

Barring any unseen bugs, this should be ready to merge.

[1] https://lore.kernel.org/git/cover-0.3-00000000000-20211014T233343Z-avarab@gmail.com/

Changes since v4
* Rebase onto [1], which fixes a bug with commit-graph verify with replace refs
  enabled.
* Remove GIT_TEST_COMMIT_GRAPH=0 in t3800-mktag.sh because the bug that it was
  covering up is now fixed.

Changes since v3
* Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
  way that is incompatible with commit-graphs.
* Make patch 1 and 2's commit messages more concise (thanks Ævar!).

Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
  test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
  copy-pasted code

Changes since v1:
* clean up typo in patch 1 commit message 
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
  does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 +++--
 builtin/gc.c                |  5 ++---
 t/t0410-partial-clone.sh    |  6 +++++-
 t/t5318-commit-graph.sh     | 23 ++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh |  5 ++++-
 t/t7900-maintenance.sh      | 28 ++++++++++++++++++++++++----
 6 files changed, 60 insertions(+), 12 deletions(-)

Range-diff against v4:
1:  aac1253e7b ! 1:  567d40849a fsck: verify commit graph when implicitly enabled
    @@ Commit message
     
         Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
         commit-graph as expected for the 3 values of core.commitGraph. Also,
    -    disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
    -    assume that commit-graph checking is disabled (t/t3800-mktag.sh,
    -    t/t0410-partial-clone.sh).
    +    disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
    +    test cases use fsck in ways that assume that commit-graph checking is
    +    disabled.
     
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Glen Choo <chooglen@google.com>
    @@ t/t0410-partial-clone.sh: test_expect_success 'rev-list stops traversal at missi
      	! grep $FOO out
      '
     
    - ## t/t3800-mktag.sh ##
    -@@ t/t3800-mktag.sh: test_description='git mktag: tag object verify test'
    - 
    - . ./test-lib.sh
    - 
    -+# When enabled, some commands will automatically write commit-graphs.
    -+# This will cause the mktag tests to fail because fsck will attempt to
    -+# verify the out-of-sync commit graph.
    -+GIT_TEST_COMMIT_GRAPH=0
    -+
    - ###########################################################
    - # check the tag.sig file, expecting verify_tag() to fail,
    - # and checking that the error message matches the pattern
    -
      ## t/t5318-commit-graph.sh ##
     @@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
      		$GRAPH_CHUNK_LOOKUP_OFFSET
2:  ed64983430 = 2:  7cb44a7aeb fsck: verify multi-pack-index when implictly enabled
3:  821b492d8b = 3:  a436c45af7 gc: perform incremental repack when implictly enabled
-- 
2.33.0.1079.g6e70778dc9-goog


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
@ 2021-10-15 20:16         ` Glen Choo
  2021-10-15 20:16         ` [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Glen Choo

Change fsck to check the "core_commit_graph" variable set in
"repo-settings.c" instead of reading the "core.commitGraph" variable.
This fixes a bug where we wouldn't verify the commit-graph if the
config key was missing. This bug was introduced in
31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
where core.commitGraph was turned on by default.

Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
commit-graph as expected for the 3 values of core.commitGraph. Also,
disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some
test cases use fsck in ways that assume that commit-graph checking is
disabled.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c           |  3 ++-
 t/t0410-partial-clone.sh |  6 +++++-
 t/t5318-commit-graph.sh  | 23 ++++++++++++++++++++++-
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..1c4e485b66 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -803,6 +803,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		fsck_enable_object_names(&fsck_walk_options);
 
 	git_config(git_fsck_config, &fsck_obj_options);
+	prepare_repo_settings(the_repository);
 
 	if (connectivity_only) {
 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
@@ -908,7 +909,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	check_connectivity();
 
-	if (!git_config_get_bool("core.commitgraph", &i) && i) {
+	if (the_repository->settings.core_commit_graph) {
 		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
 		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bba679685f..c76485b1b6 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -6,6 +6,10 @@ test_description='partial clone'
 
 # missing promisor objects cause repacks which write bitmaps to fail
 GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+# When enabled, some commands will write commit-graphs. This causes fsck
+# to fail when delete_object() is called because fsck will attempt to
+# verify the out-of-sync commit graph.
+GIT_TEST_COMMIT_GRAPH=0
 
 delete_object () {
 	rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
@@ -322,7 +326,7 @@ test_expect_success 'rev-list stops traversal at missing and promised commit' '
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
+	git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
 	grep $(git -C repo rev-parse bar) out &&
 	! grep $FOO out
 '
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 84d122a7ae..f516fda7cc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -694,12 +694,33 @@ test_expect_success 'detect incorrect chunk count' '
 		$GRAPH_CHUNK_LOOKUP_OFFSET
 '
 
-test_expect_success 'git fsck (checks commit-graph)' '
+test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	test_must_fail git -c core.commitGraph=true fsck
+'
+
+test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
+	git -c core.commitGraph=false fsck
+'
+
+test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	cd "$TRASH_DIRECTORY/full" &&
+	test_when_finished "git config core.commitGraph true" &&
+
+	git fsck &&
+	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
+		"incorrect checksum" &&
+	test_unconfig core.commitGraph &&
+	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
 	test_must_fail git fsck
 '
 
-- 
2.33.0.1079.g6e70778dc9-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
  2021-10-15 20:16         ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
@ 2021-10-15 20:16         ` Glen Choo
  2021-10-15 20:16         ` [PATCH v5 3/3] gc: perform incremental repack " Glen Choo
  2021-10-15 21:31         ` [PATCH v5 0/3] Use default values from settings instead of config Junio C Hamano
  3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Glen Choo

Like the previous commit, change fsck to check the
"core_multi_pack_index" variable set in "repo-settings.c" instead of
reading the "core.multiPackIndex" config variable. This fixes a bug
where we wouldn't verify midx if the config key was missing. This bug
was introduced in 18e449f86b (midx: enable core.multiPackIndex by
default, 2020-09-25) where core.multiPackIndex was turned on by default.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c              | 2 +-
 t/t5319-multi-pack-index.sh | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1c4e485b66..5bbe8068ec 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -925,7 +925,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (!git_config_get_bool("core.multipackindex", &i) && i) {
+	if (the_repository->settings.core_multi_pack_index) {
 		struct child_process midx_verify = CHILD_PROCESS_INIT;
 		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index bd17f308b3..38999f115f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -452,7 +452,10 @@ test_expect_success 'verify incorrect offset' '
 test_expect_success 'git-fsck incorrect offset' '
 	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
 		"incorrect object offset" \
-		"git -c core.multipackindex=true fsck"
+		"git -c core.multiPackIndex=true fsck" &&
+	test_unconfig core.multiPackIndex &&
+	test_must_fail git fsck &&
+	git -c core.multiPackIndex=false fsck
 '
 
 test_expect_success 'corrupt MIDX is not reused' '
-- 
2.33.0.1079.g6e70778dc9-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH v5 3/3] gc: perform incremental repack when implictly enabled
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
  2021-10-15 20:16         ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
  2021-10-15 20:16         ` [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
@ 2021-10-15 20:16         ` Glen Choo
  2021-10-15 21:31         ` [PATCH v5 0/3] Use default values from settings instead of config Junio C Hamano
  3 siblings, 0 replies; 65+ messages in thread
From: Glen Choo @ 2021-10-15 20:16 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Taylor Blau, Junio C Hamano, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Glen Choo

builtin/gc.c has two ways of checking if multi-pack-index is enabled:
- git_config_get_bool() in incremental_repack_auto_condition()
- the_repository->settings.core_multi_pack_index in
  maintenance_task_incremental_repack()

The two implementations have existed since the incremental-repack task
was introduced in e841a79a13 (maintenance: add incremental-repack auto
condition, 2020-09-25). These two values can diverge because
prepare_repo_settings() enables the feature in the_repository->settings
by default.

In the case where core.multiPackIndex is not set in the config, the auto
condition would fail, causing the incremental-repack task to not be
run. Because we always want to consider the default values, we should
always use the_repository->settings.

Standardize on using the_repository->settings.core_multi_pack_index to
check if multi-pack-index is enabled.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/gc.c           |  5 ++---
 t/t7900-maintenance.sh | 28 ++++++++++++++++++++++++----
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 6b3de3dd51..2670931160 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1049,12 +1049,11 @@ static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
 static int incremental_repack_auto_condition(void)
 {
 	struct packed_git *p;
-	int enabled;
 	int incremental_repack_auto_limit = 10;
 	int count = 0;
 
-	if (git_config_get_bool("core.multiPackIndex", &enabled) ||
-	    !enabled)
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_multi_pack_index)
 		return 0;
 
 	git_config_get_int("maintenance.incremental-repack.auto",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 9b9f11a8e7..74aa638475 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -336,15 +336,15 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
 		 --no-progress --batch-size=2147483647 <run-2g.txt
 '
 
-test_expect_success 'maintenance.incremental-repack.auto' '
+run_incremental_repack_and_verify () {
+	test_commit A &&
 	git repack -adk &&
-	git config core.multiPackIndex true &&
 	git multi-pack-index write &&
 	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
 		-c maintenance.incremental-repack.auto=1 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
-	test_commit A &&
+	test_commit B &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -353,7 +353,7 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
-	test_commit B &&
+	test_commit C &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
@@ -362,6 +362,26 @@ test_expect_success 'maintenance.incremental-repack.auto' '
 		-c maintenance.incremental-repack.auto=2 \
 		maintenance run --auto --task=incremental-repack 2>/dev/null &&
 	test_subcommand git multi-pack-index write --no-progress <trace-B
+}
+
+test_expect_success 'maintenance.incremental-repack.auto' '
+	rm -rf incremental-repack-true &&
+	git init incremental-repack-true &&
+	(
+		cd incremental-repack-true &&
+		git config core.multiPackIndex true &&
+		run_incremental_repack_and_verify
+	)
+'
+
+test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' '
+	rm -rf incremental-repack-unset &&
+	git init incremental-repack-unset &&
+	(
+		cd incremental-repack-unset &&
+		test_unconfig core.multiPackIndex &&
+		run_incremental_repack_and_verify
+	)
 '
 
 test_expect_success 'pack-refs task' '
-- 
2.33.0.1079.g6e70778dc9-goog


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH v5 0/3] Use default values from settings instead of config
  2021-10-15 20:16       ` [PATCH v5 " Glen Choo
                           ` (2 preceding siblings ...)
  2021-10-15 20:16         ` [PATCH v5 3/3] gc: perform incremental repack " Glen Choo
@ 2021-10-15 21:31         ` Junio C Hamano
  3 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2021-10-15 21:31 UTC (permalink / raw)
  To: Glen Choo
  Cc: git, Eric Sunshine, Taylor Blau, Derrick Stolee,
	Ævar Arnfjörð Bjarmason

Thanks, all three changes look quite well reasoned.

Will queue.

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2021-10-15 21:31 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-13 18:12 [PATCH 0/3] Use default values from settings instead of config Glen Choo
2021-09-13 18:12 ` [PATCH 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-09-13 19:29   ` Taylor Blau
2021-09-13 19:33     ` Eric Sunshine
2021-09-13 19:36       ` Taylor Blau
2021-09-13 23:15     ` Glen Choo
2021-09-13 23:32       ` Eric Sunshine
2021-09-14  1:09         ` Taylor Blau
2021-09-14  2:05           ` Eric Sunshine
2021-09-14  1:07       ` Taylor Blau
2021-09-13 18:12 ` [PATCH 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
2021-09-13 19:35   ` Taylor Blau
2021-09-13 18:12 ` [PATCH 3/3] gc: perform incremental repack " Glen Choo
2021-09-13 19:37   ` Taylor Blau
2021-09-14 17:41     ` Glen Choo
2021-09-14  4:00   ` Bagas Sanjaya
2021-09-16 17:15     ` Glen Choo
2021-09-13 19:19 ` [PATCH 0/3] Use default values from settings instead of config Taylor Blau
2021-09-17 22:54 ` [PATCH v2 " Glen Choo
2021-09-17 22:54   ` [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-09-29  6:09     ` Eric Sunshine
2021-09-30 18:00       ` Glen Choo
2021-09-30 18:35         ` Glen Choo
2021-09-30 18:39           ` Eric Sunshine
2021-10-01 17:28             ` Glen Choo
2021-09-17 22:54   ` [PATCH v2 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
2021-09-29  6:20     ` Eric Sunshine
2021-09-29 22:56       ` Glen Choo
2021-09-17 22:54   ` [PATCH v2 3/3] gc: perform incremental repack " Glen Choo
2021-09-29  6:39     ` Eric Sunshine
2021-09-27 17:59   ` [PATCH v2 0/3] Use default values from settings instead of config Glen Choo
2021-09-29  6:43     ` Eric Sunshine
2021-09-29 22:53       ` Glen Choo
2021-10-05  0:19   ` [PATCH v3 " Glen Choo
2021-10-05  0:19     ` [PATCH v3 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-05  0:19     ` [PATCH v3 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
2021-10-05  0:19     ` [PATCH v3 3/3] gc: perform incremental repack " Glen Choo
2021-10-05 11:57     ` [PATCH v3 0/3] Use default values from settings instead of config Ævar Arnfjörð Bjarmason
2021-10-05 17:43       ` Derrick Stolee
2021-10-05 19:10         ` Ævar Arnfjörð Bjarmason
2021-10-05 22:25       ` Glen Choo
2021-10-09  7:24     ` Junio C Hamano
2021-10-11 19:58       ` Glen Choo
2021-10-11 20:08         ` Junio C Hamano
2021-10-11 20:48           ` Glen Choo
2021-10-12 17:42     ` [PATCH v4 " Glen Choo
2021-10-12 17:42       ` [PATCH v4 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-12 17:42       ` [PATCH v4 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
2021-10-12 17:42       ` [PATCH v4 3/3] gc: perform incremental repack " Glen Choo
2021-10-12 20:23       ` [PATCH v4 0/3] Use default values from settings instead of config Junio C Hamano
2021-10-12 20:34       ` Ævar Arnfjörð Bjarmason
2021-10-12 22:29         ` Glen Choo
2021-10-14 15:53           ` Ævar Arnfjörð Bjarmason
2021-10-13 13:12         ` Derrick Stolee
2021-10-13 15:57           ` Ævar Arnfjörð Bjarmason
2021-10-14 16:53             ` Derrick Stolee
2021-10-14 22:21               ` Glen Choo
2021-10-14 23:38                 ` Ævar Arnfjörð Bjarmason
2021-10-14 22:25               ` Junio C Hamano
2021-10-15 15:57                 ` Junio C Hamano
2021-10-15 20:16       ` [PATCH v5 " Glen Choo
2021-10-15 20:16         ` [PATCH v5 1/3] fsck: verify commit graph when implicitly enabled Glen Choo
2021-10-15 20:16         ` [PATCH v5 2/3] fsck: verify multi-pack-index when implictly enabled Glen Choo
2021-10-15 20:16         ` [PATCH v5 3/3] gc: perform incremental repack " Glen Choo
2021-10-15 21:31         ` [PATCH v5 0/3] Use default values from settings instead of config Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).