git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
@ 2022-03-15 21:30 Neeraj K. Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
                   ` (7 more replies)
  0 siblings, 8 replies; 175+ messages in thread
From: Neeraj K. Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh

When core.fsync includes loose-object, we issue an fsync after every written
object. For a 'git-add' or similar command that adds a lot of files to the
repo, the costs of these fsyncs adds up. One major factor in this cost is
the time it takes for the physical storage controller to flush its caches to
durable media.

This series takes advantage of the writeout-only mode of git_fsync to issue
OS cache writebacks for all of the objects being added to the repository
followed by a single fsync to a dummy file, which should trigger a
filesystem log flush and storage controller cache flush. This mechanism is
known to be safe on common Windows filesystems and expected to be safe on
macOS. Some linux filesystems, such as XFS, will probably do the right thing
as well. See [1] for previous discussion on the predecessor of this patch
series.

This series is important on Windows, where loose-objects are included in the
fsync set by default in Git-For-Windows. In this series, I'm also setting
the default mode for Windows to turn on loose object fsyncing with batch
mode, so that we can get CI coverage of the actual git-for-windows
configuration upstream. We still don't actually issue fsyncs for the test
suite since GIT_TEST_FSYNC is set to 0, but we exercise all of the
surrounding batch mode code.

This work is based on 'seen' at 367f447f0f0cf39e9830c865e8373e42a3c45303.
It's dependent on ns/core-fsyncmethod.

[1]
https://lore.kernel.org/git/2c1ddef6057157d85da74a7274e03eacf0374e45.1629856293.git.gitgitgadget@gmail.com/

Neeraj Singh (7):
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  core.fsyncmethod: batched disk flushes for loose-objects
  update-index: use the bulk-checkin infrastructure
  unpack-objects: use the bulk-checkin infrastructure
  core.fsync: use batch mode and sync loose objects by default on
    Windows
  core.fsyncmethod: tests for batch mode
  core.fsyncmethod: performance tests for add and stash

 Documentation/config/core.txt |  5 ++
 builtin/unpack-objects.c      |  3 ++
 builtin/update-index.c        |  6 +++
 bulk-checkin.c                | 89 +++++++++++++++++++++++++++++++----
 bulk-checkin.h                |  2 +
 cache.h                       | 12 ++++-
 compat/mingw.h                |  3 ++
 config.c                      |  4 +-
 git-compat-util.h             |  2 +
 object-file.c                 |  2 +
 t/lib-unique-files.sh         | 36 ++++++++++++++
 t/perf/p3700-add.sh           | 59 +++++++++++++++++++++++
 t/perf/p3900-stash.sh         | 62 ++++++++++++++++++++++++
 t/perf/perf-lib.sh            |  4 +-
 t/t3700-add.sh                | 22 +++++++++
 t/t3903-stash.sh              | 17 +++++++
 t/t5300-pack-object.sh        | 32 ++++++++-----
 17 files changed, 335 insertions(+), 25 deletions(-)
 create mode 100644 t/lib-unique-files.sh
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh


base-commit: 367f447f0f0cf39e9830c865e8373e42a3c45303
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1134%2Fneerajsi-msft%2Fns%2Fbatched-fsync-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1134/neerajsi-msft/ns/batched-fsync-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1134
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 175+ messages in thread

end of thread, other threads:[~2022-05-24 12:31 UTC | newest]

Thread overview: 175+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-16  5:33   ` Junio C Hamano
2022-03-16  7:33     ` Neeraj Singh
2022-03-16 16:14       ` Junio C Hamano
2022-03-16 17:59         ` Neeraj Singh
2022-03-16 18:10           ` Junio C Hamano
2022-03-16 19:50             ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-16  7:31   ` Patrick Steinhardt
2022-03-16 18:21     ` Neeraj Singh
2022-03-17  5:48       ` Patrick Steinhardt
2022-03-16 11:50   ` Bagas Sanjaya
2022-03-16 19:59     ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
2022-03-21 18:28       ` Neeraj Singh
2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
2022-03-21 20:14       ` Neeraj Singh
2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
2022-03-22  0:13           ` Neeraj Singh
2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
2022-03-22 20:05               ` Neeraj Singh
2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
2022-03-23  5:27                     ` Neeraj Singh
2022-03-23  3:47                   ` [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 4/7] update-index: use a utility function for stdin consumption Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 5/7] update-index: pass down an "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 6/7] update-index: rename "buf" to "line" Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  5:51                     ` Neeraj Singh
2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
2022-03-23 20:19                         ` Neeraj Singh
2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23 20:23                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23 20:25                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
2022-03-23 20:30                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync() Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
2022-03-23 21:08                       ` Neeraj Singh
2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
2022-03-21 20:23       ` Neeraj Singh
2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
2022-03-24  2:04       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
2022-03-21 22:09       ` Neeraj Singh
2022-03-21 23:16         ` Ævar Arnfjörð Bjarmason
2022-03-21 17:50     ` Junio C Hamano
2022-03-21 22:18       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-21 17:55     ` Junio C Hamano
2022-03-21 23:02       ` Neeraj Singh
2022-03-22 20:54         ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-21 18:34     ` Junio C Hamano
2022-03-22  5:54       ` Neeraj Singh
2022-03-20  7:16   ` [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-21 18:14     ` Neeraj Singh
2022-03-21 20:49       ` Junio C Hamano
2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
2022-03-24 17:52         ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 03/11] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-24 18:18       ` Junio C Hamano
2022-03-24 20:25         ` Neeraj Singh
2022-03-24 21:34           ` Junio C Hamano
2022-03-24 22:21             ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 06/11] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
2022-03-24 18:23         ` Neeraj Singh
2022-03-26 15:35           ` Ævar Arnfjörð Bjarmason
2022-03-24  4:58     ` [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-24 19:21       ` Neeraj Singh
2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 03/13] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 07/13] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 10/13] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-29 17:14         ` Neeraj Singh
2022-03-29 18:50           ` Junio C Hamano
2022-03-29  0:42       ` [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-29 17:38         ` Neeraj Singh
2022-03-29  0:42       ` [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-29 10:47       ` [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Ævar Arnfjörð Bjarmason
2022-03-29 17:09         ` Neeraj Singh
2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
2022-03-29 16:51         ` Neeraj Singh
2022-03-30  5:05       ` [PATCH v5 00/14] " Neeraj K. Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 01/14] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-30 17:11           ` Junio C Hamano
2022-03-30 18:34             ` Neeraj Singh
2022-03-30 20:24               ` Junio C Hamano
2022-03-31  4:17                 ` Neeraj Singh
2022-03-31 17:50                   ` Junio C Hamano
2022-03-31 19:08                     ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 02/14] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-30 17:17           ` Junio C Hamano
2022-03-31  5:51             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 03/14] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-30 17:18           ` Junio C Hamano
2022-03-30 17:54             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 04/14] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-30 17:37           ` Junio C Hamano
2022-03-31  6:28             ` Neeraj Singh
2022-03-31 18:05               ` Junio C Hamano
2022-03-31 19:18                 ` Neeraj Singh
2022-04-01 15:56                   ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 05/14] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-30 17:46           ` Junio C Hamano
2022-03-30 19:04             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 06/14] builtin/add: add ODB transaction around add_files_to_cache Neeraj Singh via GitGitGadget
2022-03-30 17:47           ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 07/14] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-30 17:52           ` Junio C Hamano
2022-03-30 19:09             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 08/14] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 09/14] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 10/14] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 11/14] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-30 18:13           ` Junio C Hamano
2022-03-31  3:55             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 12/14] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 13/14] core.fsyncmethod: performance tests for batch mode Neeraj Singh via GitGitGadget
2022-03-31  4:09           ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 14/14] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-04-05  5:20         ` [PATCH v6 00/12] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects nksingh85
2022-04-06 20:32           ` Junio C Hamano
2022-05-19 21:47             ` Junio C Hamano
2022-05-19 21:54               ` Neeraj Singh
2022-05-24 12:31                 ` Johannes Schindelin
2022-04-05  5:20         ` [PATCH v6 01/12] bulk-checkin: rename 'state' variable and separate 'plugged' boolean nksingh85
2022-04-05  5:20         ` [PATCH v6 02/12] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' nksingh85
2022-04-05  5:20         ` [PATCH v6 03/12] core.fsyncmethod: batched disk flushes for loose-objects nksingh85
2022-04-05  5:20         ` [PATCH v6 04/12] cache-tree: use ODB transaction around writing a tree nksingh85
2022-04-05  5:20         ` [PATCH v6 05/12] builtin/add: add ODB transaction around add_files_to_cache nksingh85
2022-04-05  5:20         ` [PATCH v6 06/12] update-index: use the bulk-checkin infrastructure nksingh85
2022-04-05  5:20         ` [PATCH v6 07/12] unpack-objects: " nksingh85
2022-04-05  5:20         ` [PATCH v6 08/12] core.fsync: use batch mode and sync loose objects by default on Windows nksingh85
2022-04-05  5:20         ` [PATCH v6 09/12] test-lib-functions: add parsing helpers for ls-files and ls-tree nksingh85
2022-04-05  5:20         ` [PATCH v6 10/12] core.fsyncmethod: tests for batch mode nksingh85
2022-04-05  5:20         ` [PATCH v6 11/12] t/perf: add iteration setup mechanism to perf-lib nksingh85
2022-04-05  5:20         ` [PATCH v6 12/12] core.fsyncmethod: performance tests for batch mode nksingh85

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).