From: Jeff Hostetler <git@jeffhostetler.com>
To: "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: "Neeraj K. Singh" <neerajsi@microsoft.com>,
	Neeraj Singh <neerajsi@ntdev.microsoft.com>
Subject: Re: [PATCH] read-cache: make the index write buffer size 128K
Date: Fri, 19 Feb 2021 14:12:42 -0500
Message-ID: <f52df30b-4ab0-fd6f-17f8-70daed81df39@jeffhostetler.com>
In-Reply-To: <pull.877.git.1613616506949.gitgitgadget@gmail.com>



On 2/17/21 9:48 PM, Neeraj K. Singh via GitGitGadget wrote:
> From: Neeraj Singh <neerajsi@ntdev.microsoft.com>
> 
> Writing an index 8K at a time invokes the OS filesystem and caching code
> very frequently, introducing noticeable overhead while writing large
> indexes. When experimenting with different write buffer sizes on Windows
> writing the Windows OS repo index (260MB), most of the benefit came by
> bumping the index write buffer size to 64K. I picked 128K to ensure that
> we're past the knee of the curve.
> 
> With this change, the time under do_write_index for an index with 3M
> files goes from ~1.02s to ~0.72s.

[...]

>   
> -#define WRITE_BUFFER_SIZE 8192
> +#define WRITE_BUFFER_SIZE (128 * 1024)
>   static unsigned char write_buffer[WRITE_BUFFER_SIZE];
>   static unsigned long write_buffer_len;

[...]

Very nice.

I can confirm that this gives nice gains on Windows.  (I'm using
the Office repo, which has a 188MB index file and 2.1M files at HEAD.)
Running "git status" shows a gain of about 200ms.

We get a smaller gain on Mac of about 50ms (again, using the Office
repo).

So, you may add my sign-off or ACK to this.
     Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>



FWIW, you might take a look at `t/perf/p0007-write-cache.sh`.
Update it as follows:

```
diff --git a/t/perf/p0007-write-cache.sh b/t/perf/p0007-write-cache.sh
index 09595264f0..337280ff1c 100755
--- a/t/perf/p0007-write-cache.sh
+++ b/t/perf/p0007-write-cache.sh
@@ -4,7 +4,7 @@ test_description="Tests performance of writing the index"

  . ./perf-lib.sh

-test_perf_default_repo
+test_perf_large_repo

  test_expect_success "setup repo" '
         if git rev-parse --verify refs/heads/p0006-ballast^{commit}
```


Then you can run it like this:

     $ cd t/perf
     $ GIT_PERF_LARGE_REPO=/path/to/your/enlistment ./p0007-write-cache.sh

Run it once with the small buffer and once with the large one to
get times for essentially just the index write in isolation.

Hope this helps,
Jeff
