From: "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Neeraj K. Singh" <neerajsi@microsoft.com>,
Neeraj Singh <neerajsi@ntdev.microsoft.com>
Subject: [PATCH] read-cache: make the index write buffer size 128K
Date: Thu, 18 Feb 2021 02:48:26 +0000
Message-ID: <pull.877.git.1613616506949.gitgitgadget@gmail.com>
From: Neeraj Singh <neerajsi@ntdev.microsoft.com>
Writing an index 8K at a time invokes the OS filesystem and caching code
very frequently, introducing noticeable overhead while writing large
indexes. When experimenting with different write buffer sizes on Windows,
writing the Windows OS repo index (260MB), most of the benefit came from
bumping the index write buffer size to 64K. I picked 128K to ensure that
we're past the knee of the curve.
With this change, the time under do_write_index for an index with 3M
files goes from ~1.02s to ~0.72s.
Signed-off-by: Neeraj Singh <neerajsi@ntdev.microsoft.com>
---
Note: This was previously discussed on the mailing list in 2016 at:
https://lore.kernel.org/git/1458350341-12276-1-git-send-email-dturner@twopensource.com/.
Since then, I believe a few things have changed:
* 'Small' development platforms like the Raspberry Pi have gotten larger
(4GB RAM).
* Spectre and Meltdown make individual system calls more expensive when
mitigations are enabled.
* There have been many investments to make very large repos scale well
in git, so huge repos are more common now.
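For illustration only (this is not part of the patch and is not git's
actual code): a minimal sketch of why the buffer size matters. It models a
fixed-size write buffer that is flushed whenever it fills, and counts how
many flushes a given amount of data needs. The names buffered_writer,
bw_write, bw_flush and count_flushes are hypothetical, and the flush only
increments a counter where a real writer would call into the OS.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BUFFER_SIZE (128 * 1024)

/* Hypothetical model of a fixed-size write buffer. */
struct buffered_writer {
	unsigned char buf[MAX_BUFFER_SIZE];
	size_t cap;     /* buffer size in use, e.g. 8192 or 128 * 1024 */
	size_t len;     /* bytes currently buffered */
	size_t flushes; /* number of simulated write calls */
};

/* Simulated flush: a real writer would hand buf[0..len) to the OS here. */
static void bw_flush(struct buffered_writer *w)
{
	if (!w->len)
		return;
	w->flushes++;
	w->len = 0;
}

/* Copy n bytes into the buffer, flushing each time it becomes full. */
static void bw_write(struct buffered_writer *w, const void *data, size_t n)
{
	const unsigned char *p = data;

	while (n) {
		size_t room = w->cap - w->len;
		size_t chunk = n < room ? n : room;

		memcpy(w->buf + w->len, p, chunk);
		w->len += chunk;
		p += chunk;
		n -= chunk;
		if (w->len == w->cap)
			bw_flush(w);
	}
}

/*
 * Count the flushes needed to emit `total` bytes, fed to the buffer in
 * pieces of `record_size` bytes. Returns 0 for unsupported arguments.
 */
static size_t count_flushes(size_t cap, size_t total, size_t record_size)
{
	static struct buffered_writer w;
	static const unsigned char record[256]; /* zero-filled dummy data */

	if (!cap || cap > MAX_BUFFER_SIZE ||
	    !record_size || record_size > sizeof(record))
		return 0;

	w.cap = cap;
	w.len = 0;
	w.flushes = 0;
	while (total) {
		size_t n = total < record_size ? total : record_size;

		bw_write(&w, record, n);
		total -= n;
	}
	bw_flush(&w); /* emit any partial final buffer */
	return w.flushes;
}
```

Assuming 260MB means 260 * 1024 * 1024 bytes, count_flushes() gives
33280 flushes with an 8192-byte buffer and 2080 with a 128K buffer, a 16x
reduction in calls into the OS. The model says nothing about the cost of
each call, so it does not by itself predict the measured timings.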
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-877%2Fneerajsi-msft%2Fneerajsi%2Findex-buffer-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-877/neerajsi-msft/neerajsi/index-buffer-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/877
read-cache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/read-cache.c b/read-cache.c
index 29144cf879e7..a5b2779b9586 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2447,7 +2447,7 @@ int repo_index_has_changes(struct repository *repo,
 	}
 }
-#define WRITE_BUFFER_SIZE 8192
+#define WRITE_BUFFER_SIZE (128 * 1024)
 static unsigned char write_buffer[WRITE_BUFFER_SIZE];
 static unsigned long write_buffer_len;
base-commit: 45526154a57d15947cad7262230d0b935cedb9d3
--
gitgitgadget
Thread overview: 11+ messages
2021-02-18 2:48 Neeraj K. Singh via GitGitGadget [this message]
2021-02-19 19:12 ` [PATCH] read-cache: make the index write buffer size 128K Jeff Hostetler
2021-02-20 3:28 ` Junio C Hamano
2021-02-20 7:56 ` Neeraj Singh
2021-02-21 12:51 ` Junio C Hamano
2021-02-24 20:56 ` Neeraj Singh
2021-02-25 5:41 ` Junio C Hamano
2021-02-25 6:58 ` Chris Torek
2021-02-25 7:16 ` Junio C Hamano
2021-02-25 7:36 ` Neeraj Singh
2021-02-25 7:57 ` Chris Torek