From: "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Neeraj-Personal <nksingh85@gmail.com>,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Jeff King" <peff@peff.net>,
"Jeff Hostetler" <jeffhost@microsoft.com>,
"Christoph Hellwig" <hch@lst.de>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Neeraj K. Singh" <neerajsi@microsoft.com>
Subject: [PATCH 0/2] [RFC] Implement a bulk-checkin option for core.fsyncObjectFiles
Date: Wed, 25 Aug 2021 01:51:30 +0000 [thread overview]
Message-ID: <pull.1076.git.git.1629856292.gitgitgadget@gmail.com> (raw)
Git for Windows has had fsyncing of object files enabled since "409cae91eb
(mingw: change core.fsyncObjectFiles = 1 by default, 2017-09-04)".
There have been requests to make core.fsyncObjectFiles the default
everywhere, but there are concerns about its performance cost (perf results
below). There's a long and gory thread here:
https://lore.kernel.org/git/87a7xcw8sa.fsf@linux-m68k.org/t/.
My change introduces the new 'core.fsyncobjectFiles = 2' setting, which
batches the data-integrity FLUSH command sent to the disk across multiple
loose object files added to the object database.
We take advantage of the bulk-checkin hooks already in the add command and
add some hooks to the update-index (which is used internally by stash).
Details are in the last patch of the series.
Here's a simple performance test script:
#!/bin/sh
git clone https://github.com/nodejs/node.git node-repo-cache
git clone node-repo-cache node-repo
cd node-repo
git --version
find . -name "*.c" -exec sh -c 'echo foo1 >> $1' -- {} \;
echo "----GIT stash fsync"
time git -c core.fsyncObjectFiles=true stash push
find . -name "*.c" -exec sh -c 'echo foo2 >> $1' -- {} \;
echo "----GIT stash fsync_defer"
time git -c core.fsyncObjectFiles=2 stash push
find . -name "*.c" -exec sh -c 'echo foo3 >> $1' -- {} \;
echo "----GIT stash no_fsync"
time git -c core.fsyncObjectFiles=false stash push
cd ..
rm -r -f node-repo
Hardware:
* Mac - Mac Mini 2018 running MacOS 11.5.1, APFS with a 1TB Apple NMVE SSD,
* Linux - Ubuntu 20.04 - ext4 running on a Hyper-V VM with a fixed VHDX
backed by a Samsung PM981.
* Win - Windows NTFS - Same Hyper-V host as Linux. Operation | Mac | Linux
| Windows
---------------- |---------|-------|---------- git fsync | 40.6 s | 7.8 s |
6.9s git fsync_defer | 6.5 s | 2.1 s | 3.8s git no_fsync | 1.7 s | 1.0 s |
2.6s
The windows version of git is slightly different:
https://github.com/git-for-windows/git/pull/3391. I also used a
Windows-specific test script.
I hope I'm CC'ing a reasonable set of people on this patch, based on the
last discussion.
Thanks, Neeraj Singh Windows Core File Systems.
Neeraj Singh (2):
object-file: use futimes rather than utime
core.fsyncobjectfiles: batch disk flushes
Documentation/config/core.txt | 17 ++++--
Makefile | 4 ++
builtin/add.c | 3 +-
builtin/update-index.c | 3 +
bulk-checkin.c | 105 +++++++++++++++++++++++++++++++---
bulk-checkin.h | 4 +-
compat/mingw.c | 42 +++++++++-----
compat/mingw.h | 2 +
config.c | 4 +-
config.mak.uname | 2 +
configure.ac | 8 +++
git-compat-util.h | 7 +++
object-file.c | 23 ++------
wrapper.c | 36 ++++++++++++
write-or-die.c | 2 +-
15 files changed, 213 insertions(+), 49 deletions(-)
base-commit: 225bc32a989d7a22fa6addafd4ce7dcd04675dbf
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1076%2Fneerajsi-msft%2Fneerajsi%2Fbulk-fsync-object-files-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1076/neerajsi-msft/neerajsi/bulk-fsync-object-files-v1
Pull-Request: https://github.com/git/git/pull/1076
--
gitgitgadget
next reply other threads:[~2021-08-25 1:51 UTC|newest]
Thread overview: 160+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-25 1:51 Neeraj K. Singh via GitGitGadget [this message]
2021-08-25 1:51 ` [PATCH 1/2] object-file: use futimes rather than utime Neeraj Singh via GitGitGadget
2021-08-25 13:51 ` Johannes Schindelin
2021-08-25 22:08 ` Neeraj Singh
2021-08-25 1:51 ` [PATCH 2/2] core.fsyncobjectfiles: batch disk flushes Neeraj Singh via GitGitGadget
2021-08-25 5:38 ` Christoph Hellwig
2021-08-25 17:40 ` Neeraj Singh
2021-08-26 5:54 ` Christoph Hellwig
2021-08-25 16:11 ` Ævar Arnfjörð Bjarmason
2021-08-26 0:49 ` Neeraj Singh
2021-08-26 5:50 ` Christoph Hellwig
2021-08-28 0:20 ` Neeraj Singh
2021-08-28 6:57 ` Christoph Hellwig
2021-08-31 19:59 ` Neeraj Singh
2021-09-01 5:09 ` Christoph Hellwig
2021-08-26 5:57 ` Christoph Hellwig
2021-08-25 18:52 ` Johannes Schindelin
2021-08-25 21:26 ` Junio C Hamano
2021-08-26 1:19 ` Neeraj Singh
2021-08-25 16:58 ` [PATCH 0/2] [RFC] Implement a bulk-checkin option for core.fsyncObjectFiles Neeraj Singh
2021-08-27 23:49 ` [PATCH v2 0/6] Implement a batched fsync " Neeraj K. Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 1/6] object-file: use futimens rather than utime Neeraj Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 2/6] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 3/6] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 4/6] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 5/6] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-08-27 23:49 ` [PATCH v2 6/6] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-09-07 19:44 ` [PATCH v2 0/6] Implement a batched fsync option for core.fsyncObjectFiles Neeraj Singh
2021-09-07 19:50 ` Ævar Arnfjörð Bjarmason
2021-09-07 19:54 ` Randall S. Becker
2021-09-08 0:54 ` Neeraj Singh
2021-09-08 1:22 ` Ævar Arnfjörð Bjarmason
2021-09-08 14:04 ` Randall S. Becker
2021-09-08 19:01 ` Neeraj Singh
2021-09-08 0:55 ` Neeraj Singh
2021-09-08 6:44 ` Junio C Hamano
2021-09-08 6:49 ` Christoph Hellwig
2021-09-08 13:57 ` Randall S. Becker
2021-09-08 14:13 ` 'Christoph Hellwig'
2021-09-08 14:25 ` Randall S. Becker
2021-09-08 16:34 ` Neeraj Singh
2021-09-08 19:12 ` Junio C Hamano
2021-09-08 19:20 ` Neeraj Singh
2021-09-08 19:23 ` Ævar Arnfjörð Bjarmason
2021-09-14 3:38 ` [PATCH v3 " Neeraj K. Singh via GitGitGadget
2021-09-14 3:38 ` [PATCH v3 1/6] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-09-14 3:38 ` [PATCH v3 2/6] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-09-14 10:39 ` Bagas Sanjaya
2021-09-14 19:05 ` Neeraj Singh
2021-09-14 19:34 ` Junio C Hamano
2021-09-14 20:33 ` Junio C Hamano
2021-09-15 4:55 ` Neeraj Singh
2021-09-14 3:38 ` [PATCH v3 3/6] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-09-14 3:38 ` [PATCH v3 4/6] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-09-14 19:35 ` Junio C Hamano
2021-09-14 3:38 ` [PATCH v3 5/6] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-09-14 3:38 ` [PATCH v3 6/6] core.fsyncobjectfiles: enable batch mode for testing Neeraj Singh via GitGitGadget
2021-09-15 16:21 ` Junio C Hamano
2021-09-15 22:43 ` Neeraj Singh
2021-09-15 23:12 ` Junio C Hamano
2021-09-16 6:19 ` Junio C Hamano
2021-09-14 5:49 ` [PATCH v3 0/6] Implement a batched fsync option for core.fsyncObjectFiles Christoph Hellwig
2021-09-20 22:15 ` [PATCH v4 " Neeraj K. Singh via GitGitGadget
2021-09-20 22:15 ` [PATCH v4 1/6] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-09-20 22:15 ` [PATCH v4 2/6] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-09-21 23:16 ` Ævar Arnfjörð Bjarmason
2021-09-22 1:23 ` Neeraj Singh
2021-09-22 2:02 ` Ævar Arnfjörð Bjarmason
2021-09-22 19:46 ` Neeraj Singh
2021-09-20 22:15 ` [PATCH v4 3/6] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-09-21 23:42 ` Ævar Arnfjörð Bjarmason
2021-09-22 1:23 ` Neeraj Singh
2021-09-20 22:15 ` [PATCH v4 4/6] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-09-21 23:46 ` Ævar Arnfjörð Bjarmason
2021-09-22 1:27 ` Neeraj Singh
2021-09-23 22:32 ` Neeraj Singh
2021-09-20 22:15 ` [PATCH v4 5/6] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-09-21 23:54 ` Ævar Arnfjörð Bjarmason
2021-09-22 1:30 ` Neeraj Singh
2021-09-22 1:58 ` Ævar Arnfjörð Bjarmason
2021-09-22 17:55 ` Neeraj Singh
2021-09-22 20:01 ` Ævar Arnfjörð Bjarmason
2021-09-20 22:15 ` [PATCH v4 6/6] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 0/7] Implement a batched fsync option for core.fsyncObjectFiles Neeraj K. Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 1/7] object-file.c: do not rename in a temp odb Neeraj Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 2/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 3/7] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-09-24 21:47 ` Neeraj Singh
2021-09-24 20:12 ` [PATCH v5 4/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-09-24 21:49 ` Neeraj Singh
2021-09-24 20:12 ` [PATCH v5 5/7] unpack-objects: " Neeraj Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 6/7] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-09-24 20:12 ` [PATCH v5 7/7] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-09-24 23:31 ` [PATCH v5 0/7] Implement a batched fsync option for core.fsyncObjectFiles Neeraj Singh
2021-09-24 23:53 ` [PATCH v6 0/8] " Neeraj K. Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 1/8] object-file.c: do not rename in a temp odb Neeraj Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 2/8] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 3/8] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-09-25 3:15 ` Bagas Sanjaya
2021-09-27 0:27 ` Neeraj Singh
2021-09-24 23:53 ` [PATCH v6 4/8] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-09-27 20:07 ` Junio C Hamano
2021-09-27 20:55 ` Neeraj Singh
2021-09-27 21:03 ` Neeraj Singh
2021-09-27 23:53 ` Junio C Hamano
2021-09-24 23:53 ` [PATCH v6 5/8] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 6/8] unpack-objects: " Neeraj Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 7/8] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-09-24 23:53 ` [PATCH v6 8/8] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 0/9] Implement a batched fsync option for core.fsyncObjectFiles Neeraj K. Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 1/9] object-file.c: do not rename in a temp odb Neeraj Singh via GitGitGadget
2021-09-28 23:55 ` Jeff King
2021-09-29 0:10 ` Neeraj Singh
2021-09-28 23:32 ` [PATCH v7 2/9] tmp-objdir: new API for creating temporary writable databases Neeraj Singh via GitGitGadget
2021-09-29 8:41 ` Elijah Newren
2021-09-29 16:40 ` Neeraj Singh
2021-09-28 23:32 ` [PATCH v7 3/9] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 4/9] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 5/9] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 6/9] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 7/9] unpack-objects: " Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 8/9] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-09-28 23:32 ` [PATCH v7 9/9] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 0/9] Implement a batched fsync option for core.fsyncObjectFiles Neeraj K. Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 1/9] tmp-objdir: new API for creating temporary writable databases Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 2/9] tmp-objdir: disable ref updates when replacing the primary odb Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 3/9] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 4/9] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 5/9] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 6/9] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 7/9] unpack-objects: " Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 8/9] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-10-04 16:57 ` [PATCH v8 9/9] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-11-15 23:50 ` [PATCH v9 0/9] Implement a batched fsync option for core.fsyncObjectFiles Neeraj K. Singh via GitGitGadget
2021-11-15 23:50 ` [PATCH v9 1/9] tmp-objdir: new API for creating temporary writable databases Neeraj Singh via GitGitGadget
2021-11-30 21:27 ` Elijah Newren
2021-11-30 21:52 ` Neeraj Singh
2021-11-30 22:36 ` Elijah Newren
2021-11-15 23:50 ` [PATCH v9 2/9] tmp-objdir: disable ref updates when replacing the primary odb Neeraj Singh via GitGitGadget
2021-11-16 7:23 ` Ævar Arnfjörð Bjarmason
2021-11-16 20:38 ` Neeraj Singh
2021-11-15 23:50 ` [PATCH v9 3/9] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2021-11-15 23:50 ` [PATCH v9 4/9] core.fsyncobjectfiles: batched disk flushes Neeraj Singh via GitGitGadget
2021-11-15 23:50 ` [PATCH v9 5/9] core.fsyncobjectfiles: add windows support for batch mode Neeraj Singh via GitGitGadget
2021-11-15 23:51 ` [PATCH v9 6/9] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2021-11-15 23:51 ` [PATCH v9 7/9] unpack-objects: " Neeraj Singh via GitGitGadget
2021-11-15 23:51 ` [PATCH v9 8/9] core.fsyncobjectfiles: tests for batch mode Neeraj Singh via GitGitGadget
2021-11-15 23:51 ` [PATCH v9 9/9] core.fsyncobjectfiles: performance tests for add and stash Neeraj Singh via GitGitGadget
2021-11-16 8:02 ` [PATCH v9 0/9] Implement a batched fsync option for core.fsyncObjectFiles Ævar Arnfjörð Bjarmason
2021-11-17 7:06 ` Neeraj Singh
2021-11-17 7:24 ` Ævar Arnfjörð Bjarmason
2021-11-18 5:03 ` Neeraj Singh
2021-12-01 14:15 ` Ævar Arnfjörð Bjarmason
2022-03-09 23:02 ` Ævar Arnfjörð Bjarmason
2022-03-10 1:16 ` Neeraj Singh
2022-03-10 14:01 ` Ævar Arnfjörð Bjarmason
2022-03-10 17:52 ` Neeraj Singh
2022-03-10 18:08 ` rsbecker
2022-03-10 18:43 ` Neeraj Singh
2022-03-10 18:48 ` rsbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1076.git.git.1629856292.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=hch@lst.de \
--cc=jeffhost@microsoft.com \
--cc=neerajsi@microsoft.com \
--cc=nksingh85@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).