git@vger.kernel.org mailing list mirror (one of many)
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, avarab@gmail.com,
	newren@gmail.com, Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 4/4] features: feature.manyFiles implies fast index writes
Date: Mon, 12 Dec 2022 16:31:17 +0000
Message-ID: <e640dff53dd3f0aa48bd40fcf0e261292bbded66.1670862677.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.1439.v2.git.1670862677.gitgitgadget@gmail.com>

From: Derrick Stolee <derrickstolee@github.com>

The recent addition of the index.skipHash config option allows index
writes to speed up by skipping the hash computation for the trailing
checksum. This is particularly critical for repositories with many files
at HEAD, so enable this config option in two cases where users in that
scenario may opt in to such behavior:

 1. The feature.manyFiles config option enables some options that are
    helpful for repositories with many files at HEAD.

 2. 'scalar register' and 'scalar reconfigure' set config options that
    optimize for large repositories.

In both of these cases, set index.skipHash=true to gain this
speedup. Add tests that demonstrate that an explicit
index.skipHash=false overrides feature.manyFiles=true.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
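A note for readers on the precedence described in the commit message:
feature.manyFiles only changes the default, and an explicitly configured
index.skipHash still wins. The following is a minimal standalone sketch
of that resolution order, not the code in repo-settings.c. The names
sketch_parse_bool() and sketch_index_skip_hash() are hypothetical, a NULL
argument models an unset config key, and only the literal string "true"
is treated as true, which is narrower than Git's real boolean parsing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns -1 when the key is unset (NULL),
 * 1 for the literal string "true", and 0 for anything else.
 * Git's own boolean parsing accepts more spellings than this.
 */
static int sketch_parse_bool(const char *value)
{
	if (!value)
		return -1;
	return !strcmp(value, "true");
}

/*
 * Hypothetical model of how the effective index.skipHash value is
 * resolved. 'many_files' and 'skip_hash' stand for the raw values of
 * feature.manyFiles and index.skipHash.
 */
static int sketch_index_skip_hash(const char *many_files,
				  const char *skip_hash)
{
	int value = 0; /* built-in default: compute the trailing checksum */
	int explicit_value;

	/* Step 1: feature.manyFiles=true changes the default. */
	if (sketch_parse_bool(many_files) == 1)
		value = 1;

	/* Step 2: an explicit index.skipHash overrides that default. */
	explicit_value = sketch_parse_bool(skip_hash);
	if (explicit_value >= 0)
		value = explicit_value;

	return value;
}
```

With these definitions, sketch_index_skip_hash("true", NULL) yields 1 and
sketch_index_skip_hash("true", "false") yields 0, which corresponds to
the two new cases added to t/t1600-index.sh.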
 Documentation/config/feature.txt |  5 +++++
 read-cache.c                     |  5 ++++-
 repo-settings.c                  |  2 ++
 repository.h                     |  1 +
 scalar.c                         |  1 +
 t/t1600-index.sh                 | 13 ++++++++++++-
 6 files changed, 25 insertions(+), 2 deletions(-)
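
The test changes below compare the index's trailing hash against an
all-zero object ID. As a rough illustration of what that check amounts
to, here is a small C sketch that reads the last 20 bytes of a file and
reports whether they are all zero. It assumes a SHA-1 repository, where
the trailer is 20 bytes long. sketch_trailing_hash_is_zero() and
SKETCH_HASH_RAWSZ are hypothetical names, and this is not how the
test_trailing_hash helper is implemented.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: SHA-1 repository, so the trailing checksum is 20 bytes. */
#define SKETCH_HASH_RAWSZ 20

/*
 * Returns 1 if the last SKETCH_HASH_RAWSZ bytes of 'path' are all zero,
 * 0 if they are not, and -1 if the file cannot be opened or is too
 * short to contain a trailer.
 */
static int sketch_trailing_hash_is_zero(const char *path)
{
	static const unsigned char zero[SKETCH_HASH_RAWSZ];
	unsigned char tail[SKETCH_HASH_RAWSZ];
	int result = -1;
	FILE *fp = fopen(path, "rb");

	if (!fp)
		return -1;

	/* Seek to the start of the trailer, then read it in full. */
	if (!fseek(fp, -(long)SKETCH_HASH_RAWSZ, SEEK_END) &&
	    fread(tail, 1, sizeof(tail), fp) == sizeof(tail))
		result = !memcmp(tail, zero, sizeof(zero));

	fclose(fp);
	return result;
}
```

Under these assumptions, an index written with index.skipHash=true would
be expected to make this function return 1, and one written with
index.skipHash=false would be expected to make it return 0.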

diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
index 95975e50912..e52bc6b8584 100644
--- a/Documentation/config/feature.txt
+++ b/Documentation/config/feature.txt
@@ -23,6 +23,11 @@ feature.manyFiles::
 	working directory. With many files, commands such as `git status` and
 	`git checkout` may be slow and these new defaults improve performance:
 +
+* `index.skipHash=true` speeds up index writes by not computing a trailing
+  checksum. Note that this will cause Git versions earlier than 2.13.0 to
+  refuse to parse the index and Git versions earlier than 2.40.0 to report
+  a corrupted index during `git fsck`.
++
 * `index.version=4` enables path-prefix compression in the index.
 +
 * `core.untrackedCache=true` enables the untracked cache. This setting assumes
diff --git a/read-cache.c b/read-cache.c
index 3f7de8b2e20..1844953fba7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2926,7 +2926,10 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
-	git_config_get_maybe_bool("index.skiphash", (int *)&f->skip_hash);
+	if (istate->repo) {
+		prepare_repo_settings(istate->repo);
+		f->skip_hash = istate->repo->settings.index_skip_hash;
+	}
 
 	for (i = removed = extended = 0; i < entries; i++) {
 		if (cache[i]->ce_flags & CE_REMOVE)
diff --git a/repo-settings.c b/repo-settings.c
index 3021921c53d..3dbd3f0e2ec 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -47,6 +47,7 @@ void prepare_repo_settings(struct repository *r)
 	}
 	if (manyfiles) {
 		r->settings.index_version = 4;
+		r->settings.index_skip_hash = 1;
 		r->settings.core_untracked_cache = UNTRACKED_CACHE_WRITE;
 	}
 
@@ -61,6 +62,7 @@ void prepare_repo_settings(struct repository *r)
 	repo_cfg_bool(r, "pack.usesparse", &r->settings.pack_use_sparse, 1);
 	repo_cfg_bool(r, "core.multipackindex", &r->settings.core_multi_pack_index, 1);
 	repo_cfg_bool(r, "index.sparse", &r->settings.sparse_index, 0);
+	repo_cfg_bool(r, "index.skiphash", &r->settings.index_skip_hash, r->settings.index_skip_hash);
 
 	/*
 	 * The GIT_TEST_MULTI_PACK_INDEX variable is special in that
diff --git a/repository.h b/repository.h
index 6c461c5b9de..e8c67ffe165 100644
--- a/repository.h
+++ b/repository.h
@@ -42,6 +42,7 @@ struct repo_settings {
 	struct fsmonitor_settings *fsmonitor; /* lazily loaded */
 
 	int index_version;
+	int index_skip_hash;
 	enum untracked_cache_setting core_untracked_cache;
 
 	int pack_use_sparse;
diff --git a/scalar.c b/scalar.c
index 6c52243cdf1..b49bb8c24ec 100644
--- a/scalar.c
+++ b/scalar.c
@@ -143,6 +143,7 @@ static int set_recommended_config(int reconfigure)
 		{ "credential.validate", "false", 1 }, /* GCM4W-only */
 		{ "gc.auto", "0", 1 },
 		{ "gui.GCWarning", "false", 1 },
+		{ "index.skipHash", "true", 1 },
 		{ "index.threads", "true", 1 },
 		{ "index.version", "4", 1 },
 		{ "merge.stat", "false", 1 },
diff --git a/t/t1600-index.sh b/t/t1600-index.sh
index 55914bc3506..103743a1c7d 100755
--- a/t/t1600-index.sh
+++ b/t/t1600-index.sh
@@ -71,7 +71,18 @@ test_expect_success 'index.skipHash config option' '
 	test_trailing_hash .git/index >hash &&
 	echo $(test_oid zero) >expect &&
 	test_cmp expect hash &&
-	git fsck
+	git fsck &&
+
+	rm -f .git/index &&
+	git -c feature.manyFiles=true add a &&
+	test_trailing_hash .git/index >hash &&
+	test_cmp expect hash &&
+
+	rm -f .git/index &&
+	git -c feature.manyFiles=true \
+		-c index.skipHash=false add a &&
+	test_trailing_hash .git/index >hash &&
+	! cmp expect hash
 '
 
 test_index_version () {
-- 
gitgitgadget

