git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
@ 2022-03-15 21:30 Neeraj K. Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
                   ` (7 more replies)
  0 siblings, 8 replies; 175+ messages in thread
From: Neeraj K. Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh

When core.fsync includes loose-object, we issue an fsync after every written
object. For a 'git-add' or similar command that adds many files to the
repo, the cost of these fsyncs adds up. One major factor in this cost is
the time it takes the physical storage controller to flush its caches to
durable media.

This series takes advantage of the writeout-only mode of git_fsync to issue
OS cache writebacks for all of the objects being added to the repository
followed by a single fsync to a dummy file, which should trigger a
filesystem log flush and storage controller cache flush. This mechanism is
known to be safe on common Windows filesystems and expected to be safe on
macOS. Some Linux filesystems, such as XFS, will probably do the right thing
as well. See [1] for previous discussion on the predecessor of this patch
series.

This series is important on Windows, where loose-objects are included in the
fsync set by default in Git-For-Windows. In this series, I'm also setting
the default mode for Windows to turn on loose object fsyncing with batch
mode, so that we can get CI coverage of the actual git-for-windows
configuration upstream. We still don't actually issue fsyncs for the test
suite since GIT_TEST_FSYNC is set to 0, but we exercise all of the
surrounding batch mode code.

This work is based on 'seen' at 367f447f0f0cf39e9830c865e8373e42a3c45303.
It's dependent on ns/core-fsyncmethod.

[1]
https://lore.kernel.org/git/2c1ddef6057157d85da74a7274e03eacf0374e45.1629856293.git.gitgitgadget@gmail.com/

Neeraj Singh (7):
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  core.fsyncmethod: batched disk flushes for loose-objects
  update-index: use the bulk-checkin infrastructure
  unpack-objects: use the bulk-checkin infrastructure
  core.fsync: use batch mode and sync loose objects by default on
    Windows
  core.fsyncmethod: tests for batch mode
  core.fsyncmethod: performance tests for add and stash

 Documentation/config/core.txt |  5 ++
 builtin/unpack-objects.c      |  3 ++
 builtin/update-index.c        |  6 +++
 bulk-checkin.c                | 89 +++++++++++++++++++++++++++++++----
 bulk-checkin.h                |  2 +
 cache.h                       | 12 ++++-
 compat/mingw.h                |  3 ++
 config.c                      |  4 +-
 git-compat-util.h             |  2 +
 object-file.c                 |  2 +
 t/lib-unique-files.sh         | 36 ++++++++++++++
 t/perf/p3700-add.sh           | 59 +++++++++++++++++++++++
 t/perf/p3900-stash.sh         | 62 ++++++++++++++++++++++++
 t/perf/perf-lib.sh            |  4 +-
 t/t3700-add.sh                | 22 +++++++++
 t/t3903-stash.sh              | 17 +++++++
 t/t5300-pack-object.sh        | 32 ++++++++-----
 17 files changed, 335 insertions(+), 25 deletions(-)
 create mode 100644 t/lib-unique-files.sh
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh


base-commit: 367f447f0f0cf39e9830c865e8373e42a3c45303
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1134%2Fneerajsi-msft%2Fns%2Fbatched-fsync-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1134/neerajsi-msft/ns/batched-fsync-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1134
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-16  5:33   ` Junio C Hamano
  2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.

* Rename the 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_objdir'.  The longer name also makes the variable
  easier to find in the debugger, since it is more distinctive.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 bulk-checkin.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index e988a388b65..93b1dc5138a 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,9 +10,9 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static struct bulk_checkin_state {
-	unsigned plugged:1;
+static int bulk_checkin_plugged;
 
+static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
 	off_t offset;
@@ -21,7 +21,7 @@ static struct bulk_checkin_state {
 	struct pack_idx_entry **written;
 	uint32_t alloc_written;
 	uint32_t nr_written;
-} state;
+} bulk_checkin_state;
 
 static void finish_tmp_packfile(struct strbuf *basename,
 				const char *pack_tmp_name,
@@ -278,21 +278,23 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&state, oid, fd, size, type,
+	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!state.plugged)
-		finish_bulk_checkin(&state);
+	if (!bulk_checkin_plugged)
+		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
 void plug_bulk_checkin(void)
 {
-	state.plugged = 1;
+	assert(!bulk_checkin_plugged);
+	bulk_checkin_plugged = 1;
 }
 
 void unplug_bulk_checkin(void)
 {
-	state.plugged = 0;
-	if (state.f)
-		finish_bulk_checkin(&state);
+	assert(bulk_checkin_plugged);
+	bulk_checkin_plugged = 0;
+	if (bulk_checkin_state.f)
+		finish_bulk_checkin(&bulk_checkin_state);
 }
-- 
gitgitgadget



* [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-16  7:31   ` Patrick Steinhardt
  2022-03-16 11:50   ` Bagas Sanjaya
  2022-03-15 21:30 ` [PATCH 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin plugging and unplugging functionality,
takes advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction, when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback
   cache, which should by now have seen the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.

On a filesystem with a single journal that is updated during name
operations (e.g. create, link, rename), such as NTFS, HFS+, or XFS,
we would expect the fsync to trigger a journal writeout, so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns.

Batch mode is only enabled if core.fsyncObjectFiles is false or unset.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add'. Times are reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 Documentation/config/core.txt |  5 +++
 bulk-checkin.c                | 67 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  2 ++
 cache.h                       |  8 ++++-
 config.c                      |  2 ++
 object-file.c                 |  2 ++
 6 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 062e5259905..c041ed33801 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -628,6 +628,11 @@ core.fsyncMethod::
 * `writeout-only` issues pagecache writeback requests, but depending on the
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
+* `batch` enables a mode that uses writeout-only flushes to stage multiple
+  updates in the disk writeback cache and then a single full fsync to trigger
+  the disk cache flush at the end of the operation. This mode is expected to
+  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
+  and on Windows for repos stored on NTFS or ReFS filesystems.
 
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 93b1dc5138a..5c13fe17802 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,14 +3,20 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
+#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "string-list.h"
+#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
+static int needs_batch_fsync;
+
+static struct tmp_objdir *bulk_fsync_objdir;
 
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
@@ -80,6 +86,34 @@ clear_exit:
 	reprepare_packed_git(the_repository);
 }
 
+/*
+ * Cleanup after batch-mode fsync_object_files.
+ */
+static void do_batch_fsync(void)
+{
+	/*
+	 * Issue a full hardware flush against a temporary file to ensure
+	 * that all objects are durable before any renames occur.  The code in
+	 * fsync_loose_object_bulk_checkin has already issued a writeout
+	 * request, but it has not flushed any writeback cache in the storage
+	 * hardware.
+	 */
+
+	if (needs_batch_fsync) {
+		struct strbuf temp_path = STRBUF_INIT;
+		struct tempfile *temp;
+
+		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
+		temp = xmks_tempfile(temp_path.buf);
+		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
+		delete_tempfile(&temp);
+		strbuf_release(&temp_path);
+	}
+
+	if (bulk_fsync_objdir)
+		tmp_objdir_migrate(bulk_fsync_objdir);
+}
+
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -274,6 +308,24 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
+void fsync_loose_object_bulk_checkin(int fd)
+{
+	/*
+	 * If we have a plugged bulk checkin, we issue a call that
+	 * cleans the filesystem page cache but avoids a hardware flush
+	 * command. Later on we will issue a single hardware flush
+	 * as part of do_batch_fsync.
+	 */
+	if (bulk_checkin_plugged &&
+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
+		assert(bulk_fsync_objdir);
+		if (!needs_batch_fsync)
+			needs_batch_fsync = 1;
+	} else {
+		fsync_or_die(fd, "loose object file");
+	}
+}
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -288,6 +340,19 @@ int index_bulk_checkin(struct object_id *oid,
 void plug_bulk_checkin(void)
 {
 	assert(!bulk_checkin_plugged);
+
+	/*
+	 * A temporary object directory is used to hold the files
+	 * while they are not fsynced.
+	 */
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
+		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
+		if (!bulk_fsync_objdir)
+			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
+
+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
+	}
+
 	bulk_checkin_plugged = 1;
 }
 
@@ -297,4 +362,6 @@ void unplug_bulk_checkin(void)
 	bulk_checkin_plugged = 0;
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
+
+	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index b26f3dc3b74..08f292379b6 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,6 +6,8 @@
 
 #include "cache.h"
 
+void fsync_loose_object_bulk_checkin(int fd);
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index d347d0757f7..4d07691e791 100644
--- a/cache.h
+++ b/cache.h
@@ -1040,7 +1040,8 @@ extern int use_fsync;
 
 enum fsync_method {
 	FSYNC_METHOD_FSYNC,
-	FSYNC_METHOD_WRITEOUT_ONLY
+	FSYNC_METHOD_WRITEOUT_ONLY,
+	FSYNC_METHOD_BATCH
 };
 
 extern enum fsync_method fsync_method;
@@ -1766,6 +1767,11 @@ void fsync_or_die(int fd, const char *);
 int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
+static inline int batch_fsync_enabled(enum fsync_component component)
+{
+	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/config.c b/config.c
index 261ee7436e0..0b28f90de8b 100644
--- a/config.c
+++ b/config.c
@@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 			fsync_method = FSYNC_METHOD_FSYNC;
 		else if (!strcmp(value, "writeout-only"))
 			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
+		else if (!strcmp(value, "batch"))
+			fsync_method = FSYNC_METHOD_BATCH;
 		else
 			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
 
diff --git a/object-file.c b/object-file.c
index 295cb899e22..ef6621ffe56 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1894,6 +1894,8 @@ static void close_loose_object(int fd)
 
 	if (fsync_object_files > 0)
 		fsync_or_die(fd, "loose object file");
+	else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		fsync_loose_object_bulk_checkin(fd);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
 				       "loose object file");
-- 
gitgitgadget



* [PATCH 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.

This change enables bulk-checkin for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
batch fsync functionality.

There is some risk with this change, since under batch fsync, the object
files will sit in a tmp-objdir until update-index completes.  Breakage is
unlikely, however, since any tool that invokes update-index and expects
to see the new objects would already have to synchronize with the
update-index process after passing it a file path.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/update-index.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 75d646377cc..38e9d7e88cb 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -5,6 +5,7 @@
  */
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "lockfile.h"
 #include "quote.h"
@@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 	the_index.updated_skipworktree = 1;
 
+	/* we might be adding many objects to the object database */
+	plug_bulk_checkin();
+
 	/*
 	 * Custom copy of parse_options() because we want to handle
 	 * filename arguments as they come.
@@ -1190,6 +1194,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	/* by now we must have added all of the new objects */
+	unplug_bulk_checkin();
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
-- 
gitgitgadget



* [PATCH 4/7] unpack-objects: use the bulk-checkin infrastructure
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-03-15 21:30 ` [PATCH 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling bulk-checkin when unpacking objects, we can take advantage
of batched fsyncs.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dbeb0680a58..c55b6616aed 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	plug_bulk_checkin();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	unplug_bulk_checkin();
 	stop_progress(&progress);
 
 	if (delta_list)
-- 
gitgitgadget



* [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-03-15 21:30 ` [PATCH 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. Turn on syncing of loose object files with batch mode in
upstream Git so that the new code gets broad coverage upstream.

We don't actually do fsyncs in the test suite, since GIT_TEST_FSYNC is
set to 0. However, we do exercise all of the surrounding batch mode code
since GIT_TEST_FSYNC merely makes the maybe_fsync wrapper always appear
to succeed.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>

---
 cache.h           | 4 ++++
 compat/mingw.h    | 3 +++
 config.c          | 2 +-
 git-compat-util.h | 2 ++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 4d07691e791..04193d87246 100644
--- a/cache.h
+++ b/cache.h
@@ -1031,6 +1031,10 @@ enum fsync_component {
 			      FSYNC_COMPONENT_INDEX | \
 			      FSYNC_COMPONENT_REFERENCE)
 
+#ifndef FSYNC_COMPONENTS_PLATFORM_DEFAULT
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT FSYNC_COMPONENTS_DEFAULT
+#endif
+
 /*
  * A bitmask indicating which components of the repo should be fsynced.
  */
diff --git a/compat/mingw.h b/compat/mingw.h
index 6074a3d3ced..afe30868c04 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -332,6 +332,9 @@ int mingw_getpagesize(void);
 int win32_fsync_no_flush(int fd);
 #define fsync_no_flush win32_fsync_no_flush
 
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT (FSYNC_COMPONENTS_DEFAULT | FSYNC_COMPONENT_LOOSE_OBJECT)
+#define FSYNC_METHOD_DEFAULT (FSYNC_METHOD_BATCH)
+
 struct rlimit {
 	unsigned int rlim_cur;
 };
diff --git a/config.c b/config.c
index 0b28f90de8b..c76443dc556 100644
--- a/config.c
+++ b/config.c
@@ -1342,7 +1342,7 @@ static const struct fsync_component_name {
 
 static enum fsync_component parse_fsync_components(const char *var, const char *string)
 {
-	enum fsync_component current = FSYNC_COMPONENTS_DEFAULT;
+	enum fsync_component current = FSYNC_COMPONENTS_PLATFORM_DEFAULT;
 	enum fsync_component positive = 0, negative = 0;
 
 	while (string) {
diff --git a/git-compat-util.h b/git-compat-util.h
index 0892e209a2f..fffe42ce7c1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1257,11 +1257,13 @@ __attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...);
 #define BUG(...) BUG_fl(__FILE__, __LINE__, __VA_ARGS__)
 
+#ifndef FSYNC_METHOD_DEFAULT
 #ifdef __APPLE__
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_WRITEOUT_ONLY
 #else
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_FSYNC
 #endif
+#endif
 
 enum fsync_action {
 	FSYNC_WRITEOUT_ONLY,
-- 
gitgitgadget



* [PATCH 6/7] core.fsyncmethod: tests for batch mode
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-03-15 21:30 ` [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-15 21:30 ` [PATCH 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  7 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

This change introduces a new test helper, lib-unique-files.sh. The goal
of this library is to create a tree of files whose oids differ from those
of any other files previously created in the current test repo. This
ensures we do not skip validating an object merely because it was
already present in the repo.

We aren't actually issuing any fsyncs in these tests, since
GIT_TEST_FSYNC is 0, but we still exercise all of the tmp_objdir logic
in bulk-checkin.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/lib-unique-files.sh  | 36 ++++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 22 ++++++++++++++++++++++
 t/t3903-stash.sh       | 17 +++++++++++++++++
 t/t5300-pack-object.sh | 32 +++++++++++++++++++++-----------
 4 files changed, 96 insertions(+), 11 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..a7de4ca8512
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,36 @@
+# Helper to create files with unique contents
+
+
+# Create multiple files with unique contents. Takes the number of
+# directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with unique
+#					 contents.
+
+test_create_unique_files() {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs=$1
+	local files=$2
+	local basedir=$3
+	local counter=0
+	test_tick
+	local basedata=$test_tick
+
+
+	rm -rf $basedir
+
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i
+
+		mkdir -p "$dir"
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1))
+			echo "$basedata.$counter"  >"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..1f349f52ad3 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,26 @@ test_expect_success \
     'Test that "git add -- -q" works' \
     'touch -- -q && git add -- -q'
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files &&
+	git $BATCH_CONFIGURATION add -- ./fsync-files/ &&
+	rm -f fsynced_files &&
+	git ls-files --stage fsync-files/ > fsynced_files &&
+	test_line_count = 8 fsynced_files &&
+	awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files2 &&
+	find fsync-files2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	rm -f fsynced_files2 &&
+	git ls-files --stage fsync-files2/ > fsynced_files2 &&
+	test_line_count = 8 fsynced_files2 &&
+	awk -- '{print \$2}' fsynced_files2 | xargs -n1 git cat-file -e
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 4abbc8fccae..877276c1ca3 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh
 
 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1410,6 +1411,22 @@ test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '
 
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files &&
+	git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
+	rm -f fsynced_files &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./fsync-files/ > fsynced_files &&
+	test_line_count = 8 fsynced_files &&
+	awk -- '{print \$3}' fsynced_files | xargs -n1 git cat-file -e
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..8e2f73cc68f 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -162,23 +162,25 @@ test_expect_success 'pack-objects with bogus arguments' '
 
 check_unpack () {
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $2 init --bare git2 &&
+	(
+		git $2 -C git2 unpack-objects -n <"$1".pack &&
+		git $2 -C git2 unpack-objects <"$1".pack &&
+		git $2 -C git2 cat-file --batch-check="%(objectname)"
+	) <obj-list >current &&
+	cmp obj-list current
 }
 
 test_expect_success 'unpack without delta' '
 	check_unpack test-1-${packname_1}
 '
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'pack with REF_DELTA' '
 	packname_2=$(git pack-objects --progress test-2 <obj-list 2>stderr) &&
 	check_deltas stderr -gt 0
@@ -188,6 +190,10 @@ test_expect_success 'unpack with REF_DELTA' '
 	check_unpack test-2-${packname_2}
 '
 
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-2-${packname_2} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'pack with OFS_DELTA' '
 	packname_3=$(git pack-objects --progress --delta-base-offset test-3 \
 			<obj-list 2>stderr) &&
@@ -198,6 +204,10 @@ test_expect_success 'unpack with OFS_DELTA' '
 	check_unpack test-3-${packname_3}
 '
 
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-3-${packname_3} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'compare delta flavors' '
 	perl -e '\''
 		defined($_ = -s $_) or die for @ARGV;
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread
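The `awk`/`xargs` pipeline in the stash test above extracts object IDs from
`git ls-tree -r` output before feeding them to `git cat-file -e`. A standalone
sketch of just the extraction step, using fabricated `ls-tree`-style lines
(the object IDs below are placeholders, not real objects):

```shell
# Fabricated 'git ls-tree -r' output: <mode> <type> <oid><TAB><path>
cat >fsynced_files <<'EOF'
100644 blob 1111111111111111111111111111111111111111	fsync-files/a
100644 blob 2222222222222222222222222222222222222222	fsync-files/b
EOF

# Field 3 is the object name; the test pipes each one to 'git cat-file -e'
# to verify the object actually exists in the repository.
awk '{print $3}' fsynced_files
```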

* [PATCH 7/7] core.fsyncmethod: performance tests for add and stash
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-03-15 21:30 ` [PATCH 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-15 21:30 ` Neeraj Singh via GitGitGadget
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  7 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-15 21:30 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh,
	Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add basic performance tests for "git add" and "git stash" of a lot of
new objects with various fsync settings. This shows the benefit of batch
mode relative to an ordinary stash command.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p3700-add.sh   | 59 ++++++++++++++++++++++++++++++++++++++++
 t/perf/p3900-stash.sh | 62 +++++++++++++++++++++++++++++++++++++++++++
 t/perf/perf-lib.sh    |  4 +--
 3 files changed, 123 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
new file mode 100755
index 00000000000..2ea78c9449d
--- /dev/null
+++ b/t/perf/p3700-add.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of add"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	test_perf "add $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG add files
+	"
+done
+
+test_done
diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh
new file mode 100755
index 00000000000..3526f06cef4
--- /dev/null
+++ b/t/perf/p3900-stash.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+#
+# This test measures the performance of stashing new files into the object
+# database and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of stash"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	# We only stash files in the 'files' subdirectory since
+	# the perf test infrastructure creates files in the
+	# current working directory that need to be preserved
+	test_perf "stash $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG stash push -u -- files
+	"
+done
+
+test_done
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 932105cd12c..d270d1d962a 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -98,8 +98,8 @@ test_perf_create_repo_from () {
 	mkdir -p "$repo/.git"
 	(
 		cd "$source" &&
-		{ cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null ||
-			cp -R "$objects_dir" "$repo/.git/"; } &&
+		{ cp -Rl "$objects_dir" "$repo/.git/" ||
+			cp -R "$objects_dir" "$repo/.git/" 2>/dev/null;} &&
 
 		# common_dir must come first here, since we want source_git to
 		# take precedence and overwrite any overlapping files
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 175+ messages in thread
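The mode-to-configuration mapping shared by both perf scripts can be
exercised on its own. This sketch just prints the configuration each mode
selects (the option names and values are taken from the patch; no Git
invocation is needed):

```shell
# Print the core.fsync configuration each perf mode maps to.
for m in false true batch
do
	case $m in
	false)
		cfg='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
		;;
	true)
		cfg='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
		;;
	batch)
		cfg='-c core.fsync=loose-object -c core.fsyncmethod=batch'
		;;
	esac
	printf '%s: %s\n' "$m" "$cfg"
done
```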

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
@ 2022-03-16  5:33   ` Junio C Hamano
  2022-03-16  7:33     ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-16  5:33 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.
>
> * Rename 'state' variable to 'bulk_checkin_state', since we will later
>   be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
>   find in the debugger, since the name is more unique.
>
> * Move the 'plugged' data member of 'bulk_checkin_state' into a separate
>   static variable. Doing this avoids resetting the variable in
>   finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
>   seem to unintentionally disable the plugging functionality the first
>   time a new packfile must be created due to packfile size limits. While
>   disabling the plugging state only results in suboptimal behavior for
>   the current code, it would be fatal for the bulk-fsync functionality
>   later in this patch series.

Sorry, but I am confused.  The bulk-checkin infrastructure is there
so that we can send many little objects into a single packfile
instead of creating many little loose object files.  Everything we
throw at object-file.c::index_stream() will be concatenated into the
single packfile while we are "plugged" until we get "unplugged".

My understanding of what you are doing in this series is to still
create many little loose object files, but avoid the overhead of
having to fsync them individually.  And I am not sure how well the
original idea behind the bulk-checkin infrastructure, namely to avoid
the overhead of creating many loose objects by creating a single
packfile (and presumably having to fsync at the end, but that is just
a single .pack file), meshes with your goal of still creating many
loose object files but syncing them more efficiently.

Is it just the new feature is piggybacking on the existing bulk
checkin infrastructure, even though these two have nothing in
common?


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-16  7:31   ` Patrick Steinhardt
  2022-03-16 18:21     ` Neeraj Singh
  2022-03-16 11:50   ` Bagas Sanjaya
  1 sibling, 1 reply; 175+ messages in thread
From: Patrick Steinhardt @ 2022-03-16  7:31 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, Neeraj K. Singh

On Tue, Mar 15, 2022 at 09:30:54PM +0000, Neeraj Singh via GitGitGadget wrote:
> From: Neeraj Singh <neerajsi@microsoft.com>
> 
> When adding many objects to a repo with `core.fsync=loose-object`,
> the cost of fsync'ing each object file can become prohibitive.
> 
> One major source of the cost of fsync is the implied flush of the
> hardware writeback cache within the disk drive. This commit introduces
> a new `core.fsyncMethod=batch` option that batches up hardware flushes.
> It hooks into the bulk-checkin plugging and unplugging functionality,
> takes advantage of tmp-objdir, and uses the writeout-only support code.
> 
> When the new mode is enabled, we do the following for each new object:
> 1. Create the object in a tmp-objdir.
> 2. Issue a pagecache writeback request and wait for it to complete.
> 
> At the end of the entire transaction when unplugging bulk checkin:
> 1. Issue an fsync against a dummy file to flush the hardware writeback
>    cache, which should by now have seen the tmp-objdir writes.
> 2. Rename all of the tmp-objdir files to their final names.
> 3. When updating the index and/or refs, we assume that Git will issue
>    another fsync internal to that operation. This is not the default
>    today, but the user now has the option of syncing the index and there
>    is a separate patch series to implement syncing of refs.
> 
> On a filesystem with a singular journal that is updated during name
> operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
> we would expect the fsync to trigger a journal writeout so that this
> sequence is enough to ensure that the user's data is durable by the time
> the git command returns.
> 
> Batch mode is only enabled if core.fsyncObjectFiles is false or unset.
> 
> _Performance numbers_:
> 
> Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
> Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
> Windows - Same host as Linux, a preview version of Windows 11.
> 
> Adding 500 files to the repo with 'git add' Times reported in seconds.
> 
> object file syncing | Linux | Mac   | Windows
> --------------------|-------|-------|--------
>            disabled | 0.06  |  0.35 | 0.61
>               fsync | 1.88  | 11.18 | 2.47
>               batch | 0.15  |  0.41 | 1.53
> 
> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  Documentation/config/core.txt |  5 +++
>  bulk-checkin.c                | 67 +++++++++++++++++++++++++++++++++++
>  bulk-checkin.h                |  2 ++
>  cache.h                       |  8 ++++-
>  config.c                      |  2 ++
>  object-file.c                 |  2 ++
>  6 files changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index 062e5259905..c041ed33801 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -628,6 +628,11 @@ core.fsyncMethod::
>  * `writeout-only` issues pagecache writeback requests, but depending on the
>    filesystem and storage hardware, data added to the repository may not be
>    durable in the event of a system crash. This is the default mode on macOS.
> +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> +  updates in the disk writeback cache and then a single full fsync to trigger
> +  the disk cache flush at the end of the operation. This mode is expected to
> +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
> +  and on Windows for repos stored on NTFS or ReFS filesystems.

This mode will not be supported by all parts of our stack that use our
new fsync infra. So I think we should both document that some parts of
the stack don't support batching, and say what the fallback behaviour is
for those that don't.

>  core.fsyncObjectFiles::
>  	This boolean will enable 'fsync()' when writing object files.
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 93b1dc5138a..5c13fe17802 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -3,14 +3,20 @@
>   */
>  #include "cache.h"
>  #include "bulk-checkin.h"
> +#include "lockfile.h"
>  #include "repository.h"
>  #include "csum-file.h"
>  #include "pack.h"
>  #include "strbuf.h"
> +#include "string-list.h"
> +#include "tmp-objdir.h"
>  #include "packfile.h"
>  #include "object-store.h"
>  
>  static int bulk_checkin_plugged;
> +static int needs_batch_fsync;
> +
> +static struct tmp_objdir *bulk_fsync_objdir;
>  
>  static struct bulk_checkin_state {
>  	char *pack_tmp_name;
> @@ -80,6 +86,34 @@ clear_exit:
>  	reprepare_packed_git(the_repository);
>  }
>  
> +/*
> + * Cleanup after batch-mode fsync_object_files.
> + */
> +static void do_batch_fsync(void)
> +{
> +	/*
> +	 * Issue a full hardware flush against a temporary file to ensure
> +	 * that all objects are durable before any renames occur.  The code in
> +	 * fsync_loose_object_bulk_checkin has already issued a writeout
> +	 * request, but it has not flushed any writeback cache in the storage
> +	 * hardware.
> +	 */
> +
> +	if (needs_batch_fsync) {
> +		struct strbuf temp_path = STRBUF_INIT;
> +		struct tempfile *temp;
> +
> +		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
> +		temp = xmks_tempfile(temp_path.buf);
> +		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
> +		delete_tempfile(&temp);
> +		strbuf_release(&temp_path);
> +	}
> +
> +	if (bulk_fsync_objdir)
> +		tmp_objdir_migrate(bulk_fsync_objdir);
> +}
> +

We never unset `bulk_fsync_objdir` anywhere. Shouldn't we be doing that
when we unplug this infrastructure?

Patrick

>  static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
>  {
>  	int i;
> @@ -274,6 +308,24 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
>  	return 0;
>  }
>  
> +void fsync_loose_object_bulk_checkin(int fd)
> +{
> +	/*
> +	 * If we have a plugged bulk checkin, we issue a call that
> +	 * cleans the filesystem page cache but avoids a hardware flush
> +	 * command. Later on we will issue a single hardware flush
> +	 * as part of do_batch_fsync.
> +	 */
> +	if (bulk_checkin_plugged &&
> +	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
> +		assert(bulk_fsync_objdir);
> +		if (!needs_batch_fsync)
> +			needs_batch_fsync = 1;
> +	} else {
> +		fsync_or_die(fd, "loose object file");
> +	}
> +}
> +
>  int index_bulk_checkin(struct object_id *oid,
>  		       int fd, size_t size, enum object_type type,
>  		       const char *path, unsigned flags)
> @@ -288,6 +340,19 @@ int index_bulk_checkin(struct object_id *oid,
>  void plug_bulk_checkin(void)
>  {
>  	assert(!bulk_checkin_plugged);
> +
> +	/*
> +	 * A temporary object directory is used to hold the files
> +	 * while they are not fsynced.
> +	 */
> +	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> +		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> +		if (!bulk_fsync_objdir)
> +			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
> +
> +		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
> +	}
> +
>  	bulk_checkin_plugged = 1;
>  }
>  
> @@ -297,4 +362,6 @@ void unplug_bulk_checkin(void)
>  	bulk_checkin_plugged = 0;
>  	if (bulk_checkin_state.f)
>  		finish_bulk_checkin(&bulk_checkin_state);
> +
> +	do_batch_fsync();
>  }
> diff --git a/bulk-checkin.h b/bulk-checkin.h
> index b26f3dc3b74..08f292379b6 100644
> --- a/bulk-checkin.h
> +++ b/bulk-checkin.h
> @@ -6,6 +6,8 @@
>  
>  #include "cache.h"
>  
> +void fsync_loose_object_bulk_checkin(int fd);
> +
>  int index_bulk_checkin(struct object_id *oid,
>  		       int fd, size_t size, enum object_type type,
>  		       const char *path, unsigned flags);
> diff --git a/cache.h b/cache.h
> index d347d0757f7..4d07691e791 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1040,7 +1040,8 @@ extern int use_fsync;
>  
>  enum fsync_method {
>  	FSYNC_METHOD_FSYNC,
> -	FSYNC_METHOD_WRITEOUT_ONLY
> +	FSYNC_METHOD_WRITEOUT_ONLY,
> +	FSYNC_METHOD_BATCH
>  };
>  
>  extern enum fsync_method fsync_method;
> @@ -1766,6 +1767,11 @@ void fsync_or_die(int fd, const char *);
>  int fsync_component(enum fsync_component component, int fd);
>  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
>  
> +static inline int batch_fsync_enabled(enum fsync_component component)
> +{
> +	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
> +}
> +
>  ssize_t read_in_full(int fd, void *buf, size_t count);
>  ssize_t write_in_full(int fd, const void *buf, size_t count);
>  ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
> diff --git a/config.c b/config.c
> index 261ee7436e0..0b28f90de8b 100644
> --- a/config.c
> +++ b/config.c
> @@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>  			fsync_method = FSYNC_METHOD_FSYNC;
>  		else if (!strcmp(value, "writeout-only"))
>  			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
> +		else if (!strcmp(value, "batch"))
> +			fsync_method = FSYNC_METHOD_BATCH;
>  		else
>  			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
>  
> diff --git a/object-file.c b/object-file.c
> index 295cb899e22..ef6621ffe56 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1894,6 +1894,8 @@ static void close_loose_object(int fd)
>  
>  	if (fsync_object_files > 0)
>  		fsync_or_die(fd, "loose object file");
> +	else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> +		fsync_loose_object_bulk_checkin(fd);
>  	else
>  		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
>  				       "loose object file");
> -- 
> gitgitgadget
> 


^ permalink raw reply	[flat|nested] 175+ messages in thread
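The write-then-single-flush ordering that `do_batch_fsync` implements can be
approximated in plain shell. This is only an illustration of the sequencing
(several staged writes, then one flush at the end), not Git's implementation;
`sync` here stands in for the fsync of the dummy file, and the temporary
directory stands in for the tmp-objdir:

```shell
# Illustration only: stage several writes, then issue one flush at the end,
# instead of one fsync per file.
tmpdir=$(mktemp -d)
for i in 1 2 3
do
	printf 'object %s\n' "$i" >"$tmpdir/obj$i"	# stand-in for tmp-objdir writes
done
: >"$tmpdir/bulk_fsync_dummy"	# dummy file; fsyncing it flushes the disk cache
sync	# single flush covering all of the writes above
ls "$tmpdir" | wc -l	# three objects plus the dummy file
```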

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-16  5:33   ` Junio C Hamano
@ 2022-03-16  7:33     ` Neeraj Singh
  2022-03-16 16:14       ` Junio C Hamano
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-16  7:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

On Tue, Mar 15, 2022 at 10:33 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.
> >
> > * Rename 'state' variable to 'bulk_checkin_state', since we will later
> >   be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
> >   find in the debugger, since the name is more unique.
> >
> > * Move the 'plugged' data member of 'bulk_checkin_state' into a separate
> >   static variable. Doing this avoids resetting the variable in
> >   finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
> >   seem to unintentionally disable the plugging functionality the first
> >   time a new packfile must be created due to packfile size limits. While
> >   disabling the plugging state only results in suboptimal behavior for
> >   the current code, it would be fatal for the bulk-fsync functionality
> >   later in this patch series.
>
> Sorry, but I am confused.  The bulk-checkin infrastructure is there
> so that we can send many little objects into a single packfile
> instead of creating many little loose object files.  Everything we
> throw at object-file.c::index_stream() will be concatenated into the
> single packfile while we are "plugged" until we get "unplugged".
>

I noticed that you invented bulk-checkin back in 2011, but I don't think your
description matches what the code actually does.  index_bulk_checkin
is only called from index_stream, which is only called from index_fd.
index_fd goes down the index_bulk_checkin path for large files (512MB
by default). It looks like the effect of the 'plug/unplug' code is to
allow multiple large blobs to go into a single packfile rather than
each getting its own separate packfile.

> My understanding of what you are doing in this series is to still
> create many little loose object files, but avoid the overhead of
> having to fsync them individually.  And I am not sure how well the
> original idea behind the bulk-checkin infrastructure to avoid
> overhead of having to create many loose objects by creating a single
> packfile (and presumably having to fsync at the end, but that is
> just a single .pack file) with your goal of still creating many
> loose object files but synching them more efficiently.
>
> Is it just the new feature is piggybacking on the existing bulk
> checkin infrastructure, even though these two have nothing in
> common?
>

I think my new usage is congruent with the existing API, which seems
to be about combining multiple add operations into a large transaction,
where we can do some cleanup operations once we're finished. In the
preexisting code, the transaction is about adding a bunch of large objects
to a single pack file (while leaving small objects loose), and then completing
the packfile when the adds are finished.

---
On a side note, I've also been thinking about how we could use a
packfile approach as an alternative means to achieve faster addition
of many small objects. It's essentially what you stated above, where
we'd send our little objects into a pack file. But to avoid frequent
repacking overhead, we might want to reuse the 'latest' packfile
across multiple Git invocations by appending objects to it, with an
fsync on the file at the end.

We'd need sufficient padding between objects created by different Git
invocations to ensure that previously synced data doesn't get
disturbed by later operations.  We'd need to rewrite the pack indexes
each time, but that's at least derived metadata, so it doesn't need to
be fsynced. To make the pack indexes more incrementally-updatable, we
might want to have the fanout table be checksummed, with checksummed
pointers to leaf blocks. If we detect corruption during an index
lookup, we could recreate the index from the packfile.

Essentially the above proposal is to move away from storing loose
objects in the filesystem and instead to index the data within Git
itself.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
  2022-03-16  7:31   ` Patrick Steinhardt
@ 2022-03-16 11:50   ` Bagas Sanjaya
  2022-03-16 19:59     ` Neeraj Singh
  1 sibling, 1 reply; 175+ messages in thread
From: Bagas Sanjaya @ 2022-03-16 11:50 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget, git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Neeraj K. Singh

On 16/03/22 04.30, Neeraj Singh via GitGitGadget wrote:
> On a filesystem with a singular journal that is updated during name
> operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
> we would expect the fsync to trigger a journal writeout so that this
> sequence is enough to ensure that the user's data is durable by the time
> the git command returns.
> 

But what about ext4? Will fsync-ing trigger a journal writeout?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-16  7:33     ` Neeraj Singh
@ 2022-03-16 16:14       ` Junio C Hamano
  2022-03-16 17:59         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-16 16:14 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

Neeraj Singh <nksingh85@gmail.com> writes:

> I think my new usage is congruent with the existing API, which seems
> to be about combining multiple add operations into a large transaction,
> where we can do some cleanup operations once we're finished. In the
> preexisting code, the transaction is about adding a bunch of large objects
> to a single pack file (while leaving small objects loose), and then completing
> the packfile when the adds are finished.

OK, so it was part me, and part a suboptimal presentation, I guess
;-)

Let me rephrase the idea to see if I got it right this time.

The bulk-checkin API has two interesting entry points, "plug" that
signals that we are about to repeat possibly many operations to add
new objects to the object store, and "unplug" that signals that we
are done such adding.  They are meant to serve as a hint for the
object layer to optimize its operation.

So far the only way the hint was used was that the logic that sends
an overly large object into a packfile (instead of storing it loose,
which leaves it subject to expensive repacking later) can shove more
than one such objects in the same packfile.

This series invents another use of the "plug"-"unplug" hint.  By
knowing that many loose object files are created and when the series
of object creation ended, we can avoid having to fsync each and
every one of them on certain filesystems and achieve the same
robustness.  The new "batch" option to core.fsyncmethod triggers
this mechanism.

Did I get it right, more-or-less?

Thanks.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-16 16:14       ` Junio C Hamano
@ 2022-03-16 17:59         ` Neeraj Singh
  2022-03-16 18:10           ` Junio C Hamano
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-16 17:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

On Wed, Mar 16, 2022 at 9:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Neeraj Singh <nksingh85@gmail.com> writes:
>
> > I think my new usage is congruent with the existing API, which seems
> > to be about combining multiple add operations into a large transaction,
> > where we can do some cleanup operations once we're finished. In the
> > preexisting code, the transaction is about adding a bunch of large objects
> > to a single pack file (while leaving small objects loose), and then completing
> > the packfile when the adds are finished.
>
> OK, so it was part me, and part a suboptimal presentation, I guess
> ;-)
>
> Let me rephrase the idea to see if I got it right this time.
>
> The bulk-checkin API has two interesting entry points, "plug" that
> signals that we are about to repeat possibly many operations to add
> new objects to the object store, and "unplug" that signals that we
> are done such adding.  They are meant to serve as a hint for the
> object layer to optimize its operation.
>
> So far the only way the hint was used was that the logic that sends
> an overly large object into a packfile (instead of storing it loose,
> which leaves it subject to expensive repacking later) can shove more
> than one such objects in the same packfile.
>
> This series invents another use of the "plug"-"unplug" hint.  By
> knowing that many loose object files are created and when the series
> of object creation ended, we can avoid having to fsync each and
> every one of them on certain filesystems and achieve the same
> robustness.  The new "batch" option to core.fsyncmethod triggers
> this mechanism.
>
> Did I get it right, more-or-less?

Yes, that's my understanding as well.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-16 17:59         ` Neeraj Singh
@ 2022-03-16 18:10           ` Junio C Hamano
  2022-03-16 19:50             ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-16 18:10 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

Neeraj Singh <nksingh85@gmail.com> writes:

>> Did I get it right, more-or-less?
>
> Yes, that's my understanding as well.

I guess what I wrote would make a useful material for early part of
the log message to help future developers.

Thanks.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-16  7:31   ` Patrick Steinhardt
@ 2022-03-16 18:21     ` Neeraj Singh
  2022-03-17  5:48       ` Patrick Steinhardt
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-16 18:21 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Neeraj K. Singh

On Wed, Mar 16, 2022 at 12:31 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Mar 15, 2022 at 09:30:54PM +0000, Neeraj Singh via GitGitGadget wrote:
> > From: Neeraj Singh <neerajsi@microsoft.com>
> > diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> > index 062e5259905..c041ed33801 100644
> > --- a/Documentation/config/core.txt
> > +++ b/Documentation/config/core.txt
> > @@ -628,6 +628,11 @@ core.fsyncMethod::
> >  * `writeout-only` issues pagecache writeback requests, but depending on the
> >    filesystem and storage hardware, data added to the repository may not be
> >    durable in the event of a system crash. This is the default mode on macOS.
> > +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> > +  updates in the disk writeback cache and then a single full fsync to trigger
> > +  the disk cache flush at the end of the operation. This mode is expected to
> > +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
> > +  and on Windows for repos stored on NTFS or ReFS filesystems.
>
> This mode will not be supported by all parts of our stack that use our
> new fsync infra. So I think we should both document that some parts of
> the stack don't support batching, and say what the fallback behaviour is
> for those that don't.
>

Can do. I'm hoping that you'll revive your batch-mode refs change too so that
we get batching across the ODB and Refs, which are the two data stores that
may receive many updates in a single Git command.  This documentation
comment will read:
```
* `batch` enables a mode that uses writeout-only flushes to stage multiple
  updates in the disk writeback cache and then does a single full fsync of
  a dummy file to trigger the disk cache flush at the end of the operation.
  Currently `batch` mode only applies to loose-object files. Other repository
  data is made durable as if `fsync` was specified. This mode is expected to
  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
  and on Windows for repos stored on NTFS or ReFS filesystems.
```


> >  core.fsyncObjectFiles::
> >       This boolean will enable 'fsync()' when writing object files.
> > diff --git a/bulk-checkin.c b/bulk-checkin.c
> > index 93b1dc5138a..5c13fe17802 100644
> > --- a/bulk-checkin.c
> > +++ b/bulk-checkin.c
> > @@ -3,14 +3,20 @@
> >   */
> >  #include "cache.h"
> >  #include "bulk-checkin.h"
> > +#include "lockfile.h"
> >  #include "repository.h"
> >  #include "csum-file.h"
> >  #include "pack.h"
> >  #include "strbuf.h"
> > +#include "string-list.h"
> > +#include "tmp-objdir.h"
> >  #include "packfile.h"
> >  #include "object-store.h"
> >
> >  static int bulk_checkin_plugged;
> > +static int needs_batch_fsync;
> > +
> > +static struct tmp_objdir *bulk_fsync_objdir;
> >
> >  static struct bulk_checkin_state {
> >       char *pack_tmp_name;
> > @@ -80,6 +86,34 @@ clear_exit:
> >       reprepare_packed_git(the_repository);
> >  }
> >
> > +/*
> > + * Cleanup after batch-mode fsync_object_files.
> > + */
> > +static void do_batch_fsync(void)
> > +{
> > +     /*
> > +      * Issue a full hardware flush against a temporary file to ensure
> > +      * that all objects are durable before any renames occur.  The code in
> > +      * fsync_loose_object_bulk_checkin has already issued a writeout
> > +      * request, but it has not flushed any writeback cache in the storage
> > +      * hardware.
> > +      */
> > +
> > +     if (needs_batch_fsync) {
> > +             struct strbuf temp_path = STRBUF_INIT;
> > +             struct tempfile *temp;
> > +
> > +             strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
> > +             temp = xmks_tempfile(temp_path.buf);
> > +             fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
> > +             delete_tempfile(&temp);
> > +             strbuf_release(&temp_path);
> > +     }
> > +
> > +     if (bulk_fsync_objdir)
> > +             tmp_objdir_migrate(bulk_fsync_objdir);
> > +}
> > +
>
> We never unset `bulk_fsync_objdir` anywhere. Shouldn't we be doing that
> when we unplug this infrastructure?
>

Will Fix.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-16 18:10           ` Junio C Hamano
@ 2022-03-16 19:50             ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-16 19:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

On Wed, Mar 16, 2022 at 11:10 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Neeraj Singh <nksingh85@gmail.com> writes:
>
> >> Did I get it right, more-or-less?
> >
> > Yes, that's my understanding as well.
>
> I guess what I wrote would make a useful material for early part of
> the log message to help future developers.
>
> Thanks.

Will do.  I changed the commit message to explain the current
functionality of bulk-checkin and how it's similar to batched-fsync.


* Re: [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-16 11:50   ` Bagas Sanjaya
@ 2022-03-16 19:59     ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-16 19:59 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Neeraj K. Singh

On Wed, Mar 16, 2022 at 4:50 AM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On 16/03/22 04.30, Neeraj Singh via GitGitGadget wrote:
> > On a filesystem with a singular journal that is updated during name
> > operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
> > we would expect the fsync to trigger a journal writeout so that this
> > sequence is enough to ensure that the user's data is durable by the time
> > the git command returns.
> >
>
> But what about ext4? Will fsync-ing trigger writing journal?
>

That's a good question. So I did an experiment on ext4 which gives me
some confidence:

Here's my ext4 configuration: /dev/sdc on / type ext4
(rw,relatime,discard,errors=remount-ro,data=ordered)

I added a new mode called core.fsyncMethod=batch-extra-fsync. This
issues an extra open,fsync,close during migration from the tmp-objdir
(which I confirmed is really happening using strace).  The added cost
of this extra operation is relatively small compared to
core.fsyncMethod=fsync.  That leads me to believe that (barring fs
bugs), ext4 thinks that the data is already sufficiently durable that
it doesn't need to issue an extra disk cache flush.  See
https://github.com/neerajsi-msft/git/commit/131466dd95165efc5c480d971c69ea1e9182657e
for the test code.  I don't particularly want to add this as a
built-in mode at this point since it will be somewhat hard to document
which mode a user should choose.

Thanks,
Neeraj


* Re: [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-16 18:21     ` Neeraj Singh
@ 2022-03-17  5:48       ` Patrick Steinhardt
  0 siblings, 0 replies; 175+ messages in thread
From: Patrick Steinhardt @ 2022-03-17  5:48 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Neeraj K. Singh

[-- Attachment #1: Type: text/plain, Size: 2515 bytes --]

On Wed, Mar 16, 2022 at 11:21:56AM -0700, Neeraj Singh wrote:
> On Wed, Mar 16, 2022 at 12:31 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Tue, Mar 15, 2022 at 09:30:54PM +0000, Neeraj Singh via GitGitGadget wrote:
> > > From: Neeraj Singh <neerajsi@microsoft.com>
> > > diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> > > index 062e5259905..c041ed33801 100644
> > > --- a/Documentation/config/core.txt
> > > +++ b/Documentation/config/core.txt
> > > @@ -628,6 +628,11 @@ core.fsyncMethod::
> > >  * `writeout-only` issues pagecache writeback requests, but depending on the
> > >    filesystem and storage hardware, data added to the repository may not be
> > >    durable in the event of a system crash. This is the default mode on macOS.
> > > +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> > > +  updates in the disk writeback cache and then a single full fsync to trigger
> > > +  the disk cache flush at the end of the operation. This mode is expected to
> > > +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
> > > +  and on Windows for repos stored on NTFS or ReFS filesystems.
> >
> > This mode will not be supported by all parts of our stack that use our
> > new fsync infra. So I think we should both document that some parts of
> > the stack don't support batching, and say what the fallback behaviour is
> > for those that don't.
> >
> 
> Can do. I'm hoping that you'll revive your batch-mode refs change too so that
> we get batching across the ODB and Refs, which are the two data stores that
> may receive many updates in a single Git command.

Huh, I completely forgot that my previous implementation already had
such a mechanism. I may have a go at it again, but it would take me a
while given that I'll be OOO most of April.

> This documentation
> comment will read:
> ```
> * `batch` enables a mode that uses writeout-only flushes to stage multiple
>   updates in the disk writeback cache and then does a single full fsync of
>   a dummy file to trigger the disk cache flush at the end of the operation.
>   Currently `batch` mode only applies to loose-object files. Other repository
>   data is made durable as if `fsync` was specified. This mode is expected to
>   be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
>   and on Windows for repos stored on NTFS or ReFS filesystems.
> ```

Reads good to me, thanks!

Patrick



* [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-03-15 21:30 ` [PATCH 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15 ` Neeraj K. Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
                     ` (8 more replies)
  7 siblings, 9 replies; 175+ messages in thread
From: Neeraj K. Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

V2 changes:

 * Change doc to indicate that only some repo updates are batched
 * Null and zero out control variables in do_batch_fsync under
   unplug_bulk_checkin
 * Make batch mode default on Windows.
 * Update the description for the initial patch that cleans up the
   bulk-checkin infrastructure.
 * Rebase onto 'seen' at 0cac37f38f9.

--Original definition-- When core.fsync includes loose-object, we issue an
fsync after every written object. For a 'git-add' or similar command that
adds a lot of files to the repo, the costs of these fsyncs add up. One
major factor in this cost is the time it takes for the physical storage
controller to flush its caches to durable media.

This series takes advantage of the writeout-only mode of git_fsync to issue
OS cache writebacks for all of the objects being added to the repository
followed by a single fsync to a dummy file, which should trigger a
filesystem log flush and storage controller cache flush. This mechanism is
known to be safe on common Windows filesystems and expected to be safe on
macOS. Some Linux filesystems, such as XFS, will probably do the right thing
as well. See [1] for previous discussion on the predecessor of this patch
series.

This series is important on Windows, where loose-objects are included in the
fsync set by default in Git-For-Windows. In this series, I'm also setting
the default mode for Windows to turn on loose object fsyncing with batch
mode, so that we can get CI coverage of the actual git-for-windows
configuration upstream. We still don't actually issue fsyncs for the test
suite since GIT_TEST_FSYNC is set to 0, but we exercise all of the
surrounding batch mode code.

This work is based on 'seen' at 0cac37f38f9. It's dependent on ns/core-fsyncmethod.

[1]
https://lore.kernel.org/git/2c1ddef6057157d85da74a7274e03eacf0374e45.1629856293.git.gitgitgadget@gmail.com/

Neeraj Singh (7):
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  core.fsyncmethod: batched disk flushes for loose-objects
  update-index: use the bulk-checkin infrastructure
  unpack-objects: use the bulk-checkin infrastructure
  core.fsync: use batch mode and sync loose objects by default on
    Windows
  core.fsyncmethod: tests for batch mode
  core.fsyncmethod: performance tests for add and stash

 Documentation/config/core.txt |  7 +++
 builtin/unpack-objects.c      |  3 ++
 builtin/update-index.c        |  6 +++
 bulk-checkin.c                | 92 +++++++++++++++++++++++++++++++----
 bulk-checkin.h                |  2 +
 cache.h                       | 12 ++++-
 compat/mingw.h                |  3 ++
 config.c                      |  4 +-
 git-compat-util.h             |  2 +
 object-file.c                 |  2 +
 t/lib-unique-files.sh         | 36 ++++++++++++++
 t/perf/p3700-add.sh           | 59 ++++++++++++++++++++++
 t/perf/p3900-stash.sh         | 62 +++++++++++++++++++++++
 t/perf/perf-lib.sh            |  4 +-
 t/t3700-add.sh                | 22 +++++++++
 t/t3903-stash.sh              | 17 +++++++
 t/t5300-pack-object.sh        | 32 +++++++-----
 17 files changed, 340 insertions(+), 25 deletions(-)
 create mode 100644 t/lib-unique-files.sh
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh


base-commit: 0cac37f38f94bb93550eb164b5d574cd96e23785
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1134%2Fneerajsi-msft%2Fns%2Fbatched-fsync-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1134/neerajsi-msft/ns/batched-fsync-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1134

Range-diff vs v1:

 1:  a77d02df626 ! 1:  9c2abd12bbb bulk-checkin: rename 'state' variable and separate 'plugged' boolean
     @@ Metadata
       ## Commit message ##
          bulk-checkin: rename 'state' variable and separate 'plugged' boolean
      
     -    Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.
     +    This commit prepares for adding batch-fsync to the bulk-checkin
     +    infrastructure.
     +
     +    The bulk-checkin infrastructure is currently used to batch up addition
     +    of large blobs to a packfile. When a blob is larger than
     +    big_file_threshold, we unconditionally add it to a pack. If bulk
     +    checkins are 'plugged', we allow multiple large blobs to be added to a
     +    single pack until we reach the packfile size limit; otherwise, we simply
     +    make a new packfile for each large blob. The 'unplug' call tells us when
     +    the series of blob additions is done so that we can finish the packfiles
     +    and make their objects available to subsequent operations.
     +
     +    Stated another way, bulk-checkin allows callers to define a transaction
     +    that adds multiple objects to the object database, where the object
     +    database can optimize its internal operations within the transaction
     +    boundary.
     +
     +    Batched fsync will fit into bulk-checkin by taking advantage of the
     +    plug/unplug functionality to determine the appropriate time to fsync
     +    and make newly-added objects available in the primary object database.
      
          * Rename 'state' variable to 'bulk_checkin_state', since we will later
            be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
 2:  d38f20b4430 ! 2:  3ed1dcd9b9b core.fsyncmethod: batched disk flushes for loose-objects
     @@ Documentation/config/core.txt: core.fsyncMethod::
         filesystem and storage hardware, data added to the repository may not be
         durable in the event of a system crash. This is the default mode on macOS.
      +* `batch` enables a mode that uses writeout-only flushes to stage multiple
     -+  updates in the disk writeback cache and then a single full fsync to trigger
     -+  the disk cache flush at the end of the operation. This mode is expected to
     ++  updates in the disk writeback cache and then does a single full fsync of
     ++  a dummy file to trigger the disk cache flush at the end of the operation.
     ++  Currently `batch` mode only applies to loose-object files. Other repository
     ++  data is made durable as if `fsync` was specified. This mode is expected to
      +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
      +  and on Windows for repos stored on NTFS or ReFS filesystems.
       
     @@ bulk-checkin.c: clear_exit:
      +		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
      +		delete_tempfile(&temp);
      +		strbuf_release(&temp_path);
     ++		needs_batch_fsync = 0;
      +	}
      +
     -+	if (bulk_fsync_objdir)
     ++	if (bulk_fsync_objdir) {
      +		tmp_objdir_migrate(bulk_fsync_objdir);
     ++		bulk_fsync_objdir = NULL;
     ++	}
      +}
      +
       static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 3:  b0480f0c814 = 3:  54797dbc520 update-index: use the bulk-checkin infrastructure
 4:  99e3a61b919 = 4:  6662e2dae0f unpack-objects: use the bulk-checkin infrastructure
 5:  4e56c58c8cb ! 5:  03bf591742a core.fsync: use batch mode and sync loose objects by default on Windows
     @@ Commit message
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     -    change fsyncmethod to batch as well
     -
       ## cache.h ##
      @@ cache.h: enum fsync_component {
       			      FSYNC_COMPONENT_INDEX | \
 6:  88e47047d79 = 6:  1937746df47 core.fsyncmethod: tests for batch mode
 7:  876741f1ef9 = 7:  624244078c7 core.fsyncmethod: performance tests for add and stash

-- 
gitgitgadget


* [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

This commit prepares for adding batch-fsync to the bulk-checkin
infrastructure.

The bulk-checkin infrastructure is currently used to batch up addition
of large blobs to a packfile. When a blob is larger than
big_file_threshold, we unconditionally add it to a pack. If bulk
checkins are 'plugged', we allow multiple large blobs to be added to a
single pack until we reach the packfile size limit; otherwise, we simply
make a new packfile for each large blob. The 'unplug' call tells us when
the series of blob additions is done so that we can finish the packfiles
and make their objects available to subsequent operations.

Stated another way, bulk-checkin allows callers to define a transaction
that adds multiple objects to the object database, where the object
database can optimize its internal operations within the transaction
boundary.

Batched fsync will fit into bulk-checkin by taking advantage of the
plug/unplug functionality to determine the appropriate time to fsync
and make newly-added objects available in the primary object database.

* Rename 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
  find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 bulk-checkin.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index e988a388b65..93b1dc5138a 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,9 +10,9 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static struct bulk_checkin_state {
-	unsigned plugged:1;
+static int bulk_checkin_plugged;
 
+static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
 	off_t offset;
@@ -21,7 +21,7 @@ static struct bulk_checkin_state {
 	struct pack_idx_entry **written;
 	uint32_t alloc_written;
 	uint32_t nr_written;
-} state;
+} bulk_checkin_state;
 
 static void finish_tmp_packfile(struct strbuf *basename,
 				const char *pack_tmp_name,
@@ -278,21 +278,23 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&state, oid, fd, size, type,
+	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!state.plugged)
-		finish_bulk_checkin(&state);
+	if (!bulk_checkin_plugged)
+		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
 void plug_bulk_checkin(void)
 {
-	state.plugged = 1;
+	assert(!bulk_checkin_plugged);
+	bulk_checkin_plugged = 1;
 }
 
 void unplug_bulk_checkin(void)
 {
-	state.plugged = 0;
-	if (state.f)
-		finish_bulk_checkin(&state);
+	assert(bulk_checkin_plugged);
+	bulk_checkin_plugged = 0;
+	if (bulk_checkin_state.f)
+		finish_bulk_checkin(&bulk_checkin_state);
 }
-- 
gitgitgadget



* [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
                       ` (3 more replies)
  2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
                     ` (6 subsequent siblings)
  8 siblings, 4 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin plugging and unplugging functionality,
takes advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback
   cache, which should by now have seen the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS,
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns.

Batch mode is only enabled if core.fsyncObjectFiles is false or unset.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add' Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 Documentation/config/core.txt |  7 ++++
 bulk-checkin.c                | 70 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  2 +
 cache.h                       |  8 +++-
 config.c                      |  2 +
 object-file.c                 |  2 +
 6 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 889522956e4..a3798dfc334 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -628,6 +628,13 @@ core.fsyncMethod::
 * `writeout-only` issues pagecache writeback requests, but depending on the
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
+* `batch` enables a mode that uses writeout-only flushes to stage multiple
+  updates in the disk writeback cache and then does a single full fsync of
+  a dummy file to trigger the disk cache flush at the end of the operation.
+  Currently `batch` mode only applies to loose-object files. Other repository
+  data is made durable as if `fsync` was specified. This mode is expected to
+  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
+  and on Windows for repos stored on NTFS or ReFS filesystems.
 
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 93b1dc5138a..a702e0ff203 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,14 +3,20 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
+#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "string-list.h"
+#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
+static int needs_batch_fsync;
+
+static struct tmp_objdir *bulk_fsync_objdir;
 
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
@@ -80,6 +86,37 @@ clear_exit:
 	reprepare_packed_git(the_repository);
 }
 
+/*
+ * Cleanup after batch-mode fsync_object_files.
+ */
+static void do_batch_fsync(void)
+{
+	/*
+	 * Issue a full hardware flush against a temporary file to ensure
+	 * that all objects are durable before any renames occur.  The code in
+	 * fsync_loose_object_bulk_checkin has already issued a writeout
+	 * request, but it has not flushed any writeback cache in the storage
+	 * hardware.
+	 */
+
+	if (needs_batch_fsync) {
+		struct strbuf temp_path = STRBUF_INIT;
+		struct tempfile *temp;
+
+		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
+		temp = xmks_tempfile(temp_path.buf);
+		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
+		delete_tempfile(&temp);
+		strbuf_release(&temp_path);
+		needs_batch_fsync = 0;
+	}
+
+	if (bulk_fsync_objdir) {
+		tmp_objdir_migrate(bulk_fsync_objdir);
+		bulk_fsync_objdir = NULL;
+	}
+}
+
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -274,6 +311,24 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
+void fsync_loose_object_bulk_checkin(int fd)
+{
+	/*
+	 * If we have a plugged bulk checkin, we issue a call that
+	 * cleans the filesystem page cache but avoids a hardware flush
+	 * command. Later on we will issue a single hardware flush
+	 * as part of do_batch_fsync.
+	 */
+	if (bulk_checkin_plugged &&
+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
+		assert(bulk_fsync_objdir);
+		if (!needs_batch_fsync)
+			needs_batch_fsync = 1;
+	} else {
+		fsync_or_die(fd, "loose object file");
+	}
+}
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -288,6 +343,19 @@ int index_bulk_checkin(struct object_id *oid,
 void plug_bulk_checkin(void)
 {
 	assert(!bulk_checkin_plugged);
+
+	/*
+	 * A temporary object directory is used to hold the files
+	 * while they are not fsynced.
+	 */
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
+		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
+		if (!bulk_fsync_objdir)
+			die(_("Could not create temporary object directory for core.fsyncMethod=batch"));
+
+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
+	}
+
 	bulk_checkin_plugged = 1;
 }
 
@@ -297,4 +365,6 @@ void unplug_bulk_checkin(void)
 	bulk_checkin_plugged = 0;
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
+
+	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index b26f3dc3b74..08f292379b6 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,6 +6,8 @@
 
 #include "cache.h"
 
+void fsync_loose_object_bulk_checkin(int fd);
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index 3160bc1e489..d1ae51388c9 100644
--- a/cache.h
+++ b/cache.h
@@ -1040,7 +1040,8 @@ extern int use_fsync;
 
 enum fsync_method {
 	FSYNC_METHOD_FSYNC,
-	FSYNC_METHOD_WRITEOUT_ONLY
+	FSYNC_METHOD_WRITEOUT_ONLY,
+	FSYNC_METHOD_BATCH
 };
 
 extern enum fsync_method fsync_method;
@@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *);
 int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
+static inline int batch_fsync_enabled(enum fsync_component component)
+{
+	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/config.c b/config.c
index 261ee7436e0..0b28f90de8b 100644
--- a/config.c
+++ b/config.c
@@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 			fsync_method = FSYNC_METHOD_FSYNC;
 		else if (!strcmp(value, "writeout-only"))
 			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
+		else if (!strcmp(value, "batch"))
+			fsync_method = FSYNC_METHOD_BATCH;
 		else
 			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
 
diff --git a/object-file.c b/object-file.c
index 5258d9ed827..bdb0a38328f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1895,6 +1895,8 @@ static void close_loose_object(int fd)
 
 	if (fsync_object_files > 0)
 		fsync_or_die(fd, "loose object file");
+	else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		fsync_loose_object_bulk_checkin(fd);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
 				       "loose object file");
-- 
gitgitgadget



* [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
  2022-03-21 17:50     ` Junio C Hamano
  2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
                     ` (5 subsequent siblings)
  8 siblings, 2 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.

This change enables bulk-checkin for the update-index infrastructure to
speed up adding new objects to the object database by leveraging the
batch fsync functionality.

There is some risk with this change, since under batch fsync the object
files sit in a tmp-objdir until update-index completes.  Breaking a
caller this way is unlikely, though, since any tool that invokes
update-index and expects to see the resulting objects would have to
synchronize with the update-index process after passing it each file
path.
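The visibility window described above can be sketched with plain files
standing in for object files (the directory names here are illustrative
only, not git's actual tmp-objdir paths):

```shell
# While the checkin is "plugged", objects land in a temporary
# directory; only the "unplug" step renames them into the real
# object directory, so a concurrent reader cannot see them earlier.
rm -rf objects tmp-objdir
mkdir -p objects tmp-objdir
echo payload >tmp-objdir/obj1          # written mid-transaction
test ! -e objects/obj1 && echo "not visible yet"
mv tmp-objdir/obj1 objects/obj1        # "unplug": migrate into place
test -e objects/obj1 && echo "visible after unplug"
```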

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/update-index.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 75d646377cc..38e9d7e88cb 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -5,6 +5,7 @@
  */
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "lockfile.h"
 #include "quote.h"
@@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 	the_index.updated_skipworktree = 1;
 
+	/* we might be adding many objects to the object database */
+	plug_bulk_checkin();
+
 	/*
 	 * Custom copy of parse_options() because we want to handle
 	 * filename arguments as they come.
@@ -1190,6 +1194,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	/* by now we must have added all of the new objects */
+	unplug_bulk_checkin();
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
-- 
gitgitgadget



* [PATCH v2 4/7] unpack-objects: use the bulk-checkin infrastructure
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-21 17:55     ` Junio C Hamano
  2022-03-20  7:15   ` [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling bulk-checkin when unpacking objects, we can take advantage
of batched fsyncs.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dbeb0680a58..c55b6616aed 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	plug_bulk_checkin();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	unplug_bulk_checkin();
 	stop_progress(&progress);
 
 	if (delta_list)
-- 
gitgitgadget



* [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch
mode in upstream Git so that the new code gets broad coverage upstream.

We don't actually do fsyncs in the test suite, since GIT_TEST_FSYNC is
set to 0. However, we do exercise all of the surrounding batch mode code
since GIT_TEST_FSYNC merely makes the maybe_fsync wrapper always appear
to succeed.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 cache.h           | 4 ++++
 compat/mingw.h    | 3 +++
 config.c          | 2 +-
 git-compat-util.h | 2 ++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index d1ae51388c9..4d2131e8f4f 100644
--- a/cache.h
+++ b/cache.h
@@ -1031,6 +1031,10 @@ enum fsync_component {
 			      FSYNC_COMPONENT_INDEX | \
 			      FSYNC_COMPONENT_REFERENCE)
 
+#ifndef FSYNC_COMPONENTS_PLATFORM_DEFAULT
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT FSYNC_COMPONENTS_DEFAULT
+#endif
+
 /*
  * A bitmask indicating which components of the repo should be fsynced.
  */
diff --git a/compat/mingw.h b/compat/mingw.h
index 6074a3d3ced..afe30868c04 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -332,6 +332,9 @@ int mingw_getpagesize(void);
 int win32_fsync_no_flush(int fd);
 #define fsync_no_flush win32_fsync_no_flush
 
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT (FSYNC_COMPONENTS_DEFAULT | FSYNC_COMPONENT_LOOSE_OBJECT)
+#define FSYNC_METHOD_DEFAULT (FSYNC_METHOD_BATCH)
+
 struct rlimit {
 	unsigned int rlim_cur;
 };
diff --git a/config.c b/config.c
index 0b28f90de8b..c76443dc556 100644
--- a/config.c
+++ b/config.c
@@ -1342,7 +1342,7 @@ static const struct fsync_component_name {
 
 static enum fsync_component parse_fsync_components(const char *var, const char *string)
 {
-	enum fsync_component current = FSYNC_COMPONENTS_DEFAULT;
+	enum fsync_component current = FSYNC_COMPONENTS_PLATFORM_DEFAULT;
 	enum fsync_component positive = 0, negative = 0;
 
 	while (string) {
diff --git a/git-compat-util.h b/git-compat-util.h
index 0892e209a2f..fffe42ce7c1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1257,11 +1257,13 @@ __attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...);
 #define BUG(...) BUG_fl(__FILE__, __LINE__, __VA_ARGS__)
 
+#ifndef FSYNC_METHOD_DEFAULT
 #ifdef __APPLE__
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_WRITEOUT_ONLY
 #else
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_FSYNC
 #endif
+#endif
 
 enum fsync_action {
 	FSYNC_WRITEOUT_ONLY,
-- 
gitgitgadget



* [PATCH v2 6/7] core.fsyncmethod: tests for batch mode
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-03-20  7:15   ` [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
@ 2022-03-20  7:15   ` Neeraj Singh via GitGitGadget
  2022-03-21 18:34     ` Junio C Hamano
  2022-03-20  7:16   ` [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:15 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper, lib-unique-files.sh. Its
goal is to create a tree of files whose oids differ from those of any
other files that may have been created in the current test repo. This
helps us avoid skipping validation of a newly added object because an
identical object was already present in the repo.
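To illustrate why unique contents matter, here is a sketch that
reproduces git's loose-object id computation with sha1sum (the
blob_oid helper name is made up for this example): two files with
identical contents map to a single object id, so a test that merely
checks "does this oid exist?" could pass without the new code path
ever storing a second object.

```shell
# Git's loose-object id is sha1 over "blob <size>\0<contents>".
blob_oid () {
	len=$(wc -c <"$1" | tr -d ' ')
	printf 'blob %s\0' "$len" | cat - "$1" | sha1sum | cut -d' ' -f1
}
printf 'same contents' >a.txt
printf 'same contents' >b.txt
test "$(blob_oid a.txt)" = "$(blob_oid b.txt)" && echo "duplicate object id"
```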

We aren't actually issuing any fsyncs in these tests, since
GIT_TEST_FSYNC is 0, but we still exercise all of the tmp_objdir logic
in bulk-checkin.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/lib-unique-files.sh  | 36 ++++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 22 ++++++++++++++++++++++
 t/t3903-stash.sh       | 17 +++++++++++++++++
 t/t5300-pack-object.sh | 32 +++++++++++++++++++++-----------
 4 files changed, 96 insertions(+), 11 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..a7de4ca8512
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,36 @@
+# Helper to create files with unique contents
+
+
+# Create multiple files with unique contents. Takes the number of
+# directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with unique
+#					 contents.
+
+test_create_unique_files() {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs=$1
+	local files=$2
+	local basedir=$3
+	local counter=0
+	test_tick
+	local basedata=$test_tick
+
+
+	rm -rf $basedir
+
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i
+
+		mkdir -p "$dir"
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1))
+			echo "$basedata.$counter"  >"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..1f349f52ad3 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,26 @@ test_expect_success \
     'Test that "git add -- -q" works' \
     'touch -- -q && git add -- -q'
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files &&
+	git $BATCH_CONFIGURATION add -- ./fsync-files/ &&
+	rm -f fsynced_files &&
+	git ls-files --stage fsync-files/ > fsynced_files &&
+	test_line_count = 8 fsynced_files &&
+	awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files2 &&
+	find fsync-files2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	rm -f fsynced_files2 &&
+	git ls-files --stage fsync-files2/ > fsynced_files2 &&
+	test_line_count = 8 fsynced_files2 &&
+	awk -- '{print \$2}' fsynced_files2 | xargs -n1 git cat-file -e
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 55cd77901a8..5a3996b838f 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh
 
 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1462,6 +1463,22 @@ test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '
 
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 fsync-files &&
+	git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
+	rm -f fsynced_files &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./fsync-files/ > fsynced_files &&
+	test_line_count = 8 fsynced_files &&
+	awk -- '{print \$3}' fsynced_files | xargs -n1 git cat-file -e
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..8e2f73cc68f 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -162,23 +162,25 @@ test_expect_success 'pack-objects with bogus arguments' '
 
 check_unpack () {
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $2 init --bare git2 &&
+	(
+		git $2 -C git2 unpack-objects -n <"$1".pack &&
+		git $2 -C git2 unpack-objects <"$1".pack &&
+		git $2 -C git2 cat-file --batch-check="%(objectname)"
+	) <obj-list >current &&
+	cmp obj-list current
 }
 
 test_expect_success 'unpack without delta' '
 	check_unpack test-1-${packname_1}
 '
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'pack with REF_DELTA' '
 	packname_2=$(git pack-objects --progress test-2 <obj-list 2>stderr) &&
 	check_deltas stderr -gt 0
@@ -188,6 +190,10 @@ test_expect_success 'unpack with REF_DELTA' '
 	check_unpack test-2-${packname_2}
 '
 
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-2-${packname_2} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'pack with OFS_DELTA' '
 	packname_3=$(git pack-objects --progress --delta-base-offset test-3 \
 			<obj-list 2>stderr) &&
@@ -198,6 +204,10 @@ test_expect_success 'unpack with OFS_DELTA' '
 	check_unpack test-3-${packname_3}
 '
 
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-3-${packname_3} "$BATCH_CONFIGURATION"
+'
+
 test_expect_success 'compare delta flavors' '
 	perl -e '\''
 		defined($_ = -s $_) or die for @ARGV;
-- 
gitgitgadget



* [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-20  7:16   ` Neeraj Singh via GitGitGadget
  2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
  8 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-20  7:16 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add basic performance tests for "git add" and "git stash" of a large
number of new objects with various fsync settings. These show the
benefit of batch mode relative to fsyncing each loose object
individually.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p3700-add.sh   | 59 ++++++++++++++++++++++++++++++++++++++++
 t/perf/p3900-stash.sh | 62 +++++++++++++++++++++++++++++++++++++++++++
 t/perf/perf-lib.sh    |  4 +--
 3 files changed, 123 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
new file mode 100755
index 00000000000..2ea78c9449d
--- /dev/null
+++ b/t/perf/p3700-add.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of add"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	test_perf "add $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG add files
+	"
+done
+
+test_done
diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh
new file mode 100755
index 00000000000..3526f06cef4
--- /dev/null
+++ b/t/perf/p3900-stash.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of stash"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	# We only stash files in the 'files' subdirectory since
+	# the perf test infrastructure creates files in the
+	# current working directory that need to be preserved
+	test_perf "stash $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG stash push -u -- files
+	"
+done
+
+test_done
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 932105cd12c..d270d1d962a 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -98,8 +98,8 @@ test_perf_create_repo_from () {
 	mkdir -p "$repo/.git"
 	(
 		cd "$source" &&
-		{ cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null ||
-			cp -R "$objects_dir" "$repo/.git/"; } &&
+		{ cp -Rl "$objects_dir" "$repo/.git/" ||
+			cp -R "$objects_dir" "$repo/.git/" 2>/dev/null;} &&
 
 		# common_dir must come first here, since we want source_git to
 		# take precedence and overwrite any overlapping files
-- 
gitgitgadget


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
  2022-03-21 18:28       ` Neeraj Singh
  2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 14:41 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
> [...]
> +	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> +		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> +		if (!bulk_fsync_objdir)
> +			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));

Should camel-case the config var, and we should have a die_errno() here
which tells us why we couldn't create it (possibly needing to ferry it up
from the tempfile API...)


* Re: [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
  2022-03-21 22:09       ` Neeraj Singh
  2022-03-21 17:50     ` Junio C Hamano
  1 sibling, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 15:01 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> The update-index functionality is used internally by 'git stash push' to
> setup the internal stashed commit.
>
> This change enables bulk-checkin for update-index infrastructure to
> speed up adding new objects to the object database by leveraging the
> batch fsync functionality.
>
> There is some risk with this change, since under batch fsync, the object
> files will be in a tmp-objdir until update-index is complete.  This
> usage is unlikely, since any tool invoking update-index and expecting to
> see objects would have to synchronize with the update-index process
> after passing it a file path.
>
> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  builtin/update-index.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 75d646377cc..38e9d7e88cb 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -5,6 +5,7 @@
>   */
>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>  #include "cache.h"
> +#include "bulk-checkin.h"
>  #include "config.h"
>  #include "lockfile.h"
>  #include "quote.h"
> @@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  
>  	the_index.updated_skipworktree = 1;
>  
> +	/* we might be adding many objects to the object database */
> +	plug_bulk_checkin();
> +

Shouldn't this be after parse_options_start()?


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
  2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
  2022-03-21 20:14       ` Neeraj Singh
  2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
  2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
  3 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 15:47 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> One major source of the cost of fsync is the implied flush of the
> hardware writeback cache within the disk drive. This commit introduces
> a new `core.fsyncMethod=batch` option that batches up hardware flushes.
> It hooks into the bulk-checkin plugging and unplugging functionality,
> takes advantage of tmp-objdir, and uses the writeout-only support code.
>
> When the new mode is enabled, we do the following for each new object:
> 1. Create the object in a tmp-objdir.
> 2. Issue a pagecache writeback request and wait for it to complete.
>
> At the end of the entire transaction when unplugging bulk checkin:
> 1. Issue an fsync against a dummy file to flush the hardware writeback
>    cache, which should by now have seen the tmp-objdir writes.
> 2. Rename all of the tmp-objdir files to their final names.
> 3. When updating the index and/or refs, we assume that Git will issue
>    another fsync internal to that operation. This is not the default
>    today, but the user now has the option of syncing the index and there
>    is a separate patch series to implement syncing of refs.

Re my question in
https://lore.kernel.org/git/220310.86r179ki38.gmgdl@evledraar.gmail.com/
(which you *partially* replied to per my reading, i.e. not the
fsync_nth() question) I still don't get why the tmp-objdir part of this
is needed.

For "git stash" which is one thing sped up by this let's go over what
commands/FS ops we do. I changed the test like this:
	
	diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
	index 3fc16944e9e..479a495c68c 100755
	--- a/t/t3903-stash.sh
	+++ b/t/t3903-stash.sh
	@@ -1383,7 +1383,7 @@ BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
	 
	 test_expect_success 'stash with core.fsyncmethod=batch' "
	 	test_create_unique_files 2 4 fsync-files &&
	-	git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
	+	strace -f git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
	 	rm -f fsynced_files &&
	 
	 	# The files were untracked, so use the third parent,
	
Then we get this output, with my comments, and I snipped some output:
	 
	$ ./t3903-stash.sh --run=1-4,114 -vixd 2>&1|grep --color -e 89772c935031c228ed67890f9 -e .git/stash -e bulk_fsync -e .git/index
	[pid 14703] access(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
	[pid 14703] access(".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
	[pid 14703] link(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/tmp_obj_bdUlzu", ".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0

Here we're creating the tmp_objdir() files. We then sync_file_range()
and close() this.

	[pid 14703] openat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/objects/tmp_objdir-bulk-fsync-rR3AQI/bulk_fsync_HsDRl7", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
	[pid 14703] unlink("/home/avar/g/git/t/trash directory.t3903-stash/.git/objects/tmp_objdir-bulk-fsync-rR3AQI/bulk_fsync_HsDRl7") = 0

This is the flushing of the "cookie" in do_batch_fsync().

	[pid 14703] newfstatat(AT_FDCWD, ".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, 0) = 0
	[pid 14703] link(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0

Here we're going through the object dir migration with
unplug_bulk_checkin().

	[pid 14703] unlink(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0
	newfstatat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, AT_SYMLINK_NOFOLLOW) = 0
	[pid 14705] access(".git/objects/tmp_objdir-bulk-fsync-0F7DGy/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
	[pid 14705] access(".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = 0
	[pid 14705] utimensat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", NULL, 0) = 0
	[pid 14707] openat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", O_RDONLY|O_CLOEXEC) = 9

We then update the index itself, first a temporary index.stash :

    openat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 8
    openat(AT_FDCWD, ".git/index.stash.19141", O_RDONLY) = 9
    newfstatat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, AT_SYMLINK_NOFOLLOW) = 0
    newfstatat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", {st_mode=S_IFREG|0644, st_size=927, ...}, 0) = 0
    rename("/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141") = 0
    unlink(".git/index.stash.19141")        = 0

Followed by the same and a later rename of the actual index:

    [pid 19146] rename("/home/avar/g/git/t/trash directory.t3903-stash/.git/index.lock", "/home/avar/g/git/t/trash directory.t3903-stash/.git/index") = 0

So, my question is still why the temporary object dir migration part of
this is needed.

We are writing N loose object files, and we write those to temporary
names already.

AFAICT we could do all of this by doing the same
tmp/rename/sync_file_range dance on the main object store.

Then instead of the "bulk_fsync" cookie file don't close() the last file
object file we write until we issue the fsync on it.

But maybe this is all needed, I just can't understand from the commit
message why the "bulk checkin" part is being done.

I think since we've been over this a few times without any success it
would really help to have some example of the smallest set of syscalls
to write a file like this safely. I.e. this is doing (pseudocode):

    /* first the bulk path */
    open("bulk/x.tmp");
    write("bulk/x.tmp");
    sync_file_range("bulk/x.tmp");
    close("bulk/x.tmp");
    rename("bulk/x.tmp", "bulk/x");
    open("bulk/y.tmp");
    write("bulk/y.tmp");
    sync_file_range("bulk/y.tmp");
    close("bulk/y.tmp");
    rename("bulk/y.tmp", "bulk/y");
    /* Rename to "real" */
    rename("bulk/x", x");
    rename("bulk/y", y");
    /* sync a cookie */
    fsync("cookie");

And I'm asking why it's not:

    /* Rename to "real" as we go */
    open("x.tmp");
    write("x.tmp");
    sync_file_range("x.tmp");
    close("x.tmp");
    rename("x.tmp", "x");
    last_fd = open("y.tmp"); /* don't close() the last one yet */
    write("y.tmp");
    sync_file_range("y.tmp");
    rename("y.tmp", "y");
    /* sync a cookie */
    fsync(last_fd);

Which I guess is two questions:

 A. do we need the cookie, or can we re-use the fd of the last thing we
    write?
 B. Is the bulk indirection needed?
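For concreteness, the "bulk path" pseudocode can be written out as a
compilable sketch (Linux-only because of sync_file_range(); the helper
names write_object()/flush_batch() are mine, not Git's, and whether the
cookie fsync is sufficient is exactly what is being debated here):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write data under a temporary name, request OS writeback without a
 * disk cache flush, and rename into place.  This mirrors the
 * per-object steps of the "bulk path" pseudocode; real code would use
 * a unique temporary name rather than name + ".tmp".
 */
static int write_object(const char *dir, const char *name, const char *data)
{
	char tmp[4096], dest[4096];
	int fd;

	snprintf(tmp, sizeof(tmp), "%s/%s.tmp", dir, name);
	snprintf(dest, sizeof(dest), "%s/%s", dir, name);
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	if (write(fd, data, strlen(data)) != (ssize_t)strlen(data)) {
		close(fd);
		return -1;
	}
	/* writeout-only: start OS writeback, no hardware flush yet */
	sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
	close(fd);
	return rename(tmp, dest);
}

/*
 * One full fsync on a throwaway cookie file is intended to flush the
 * storage controller cache for everything written out above.
 */
static int flush_batch(const char *dir)
{
	char cookie[4096];
	int fd, ret;

	snprintf(cookie, sizeof(cookie), "%s/bulk_fsync_cookie", dir);
	fd = open(cookie, O_WRONLY | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	ret = fsync(fd);
	close(fd);
	unlink(cookie);
	return ret;
}
```

The cookie-free variant in the second pseudocode block would differ only
in keeping the last write_object() fd open and fsync()ing it instead of
the cookie.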

> +		fsync_or_die(fd, "loose object file");

Unrelated nit: this API is producing sentence lego unfriendly to
translators.

Should be made to take an enum or something, so we can emit the relevant
translated message in fsync_or_die(). Imagine getting:

	fsync error on '日本語は話せません'

Which this will do, just the other way around for non-English speakers
using the translation.

(The solution is also not to add _() here, since translators will want
to control the word order.)

> diff --git a/cache.h b/cache.h
> index 3160bc1e489..d1ae51388c9 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1040,7 +1040,8 @@ extern int use_fsync;
>  
>  enum fsync_method {
>  	FSYNC_METHOD_FSYNC,
> -	FSYNC_METHOD_WRITEOUT_ONLY
> +	FSYNC_METHOD_WRITEOUT_ONLY,
> +	FSYNC_METHOD_BATCH
>  };
>  
>  extern enum fsync_method fsync_method;
> @@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *);
>  int fsync_component(enum fsync_component component, int fd);
>  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
>  
> +static inline int batch_fsync_enabled(enum fsync_component component)
> +{
> +	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
> +}
> +
>  ssize_t read_in_full(int fd, void *buf, size_t count);
>  ssize_t write_in_full(int fd, const void *buf, size_t count);
>  ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
> diff --git a/config.c b/config.c
> index 261ee7436e0..0b28f90de8b 100644
> --- a/config.c
> +++ b/config.c
> @@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>  			fsync_method = FSYNC_METHOD_FSYNC;
>  		else if (!strcmp(value, "writeout-only"))
>  			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
> +		else if (!strcmp(value, "batch"))
> +			fsync_method = FSYNC_METHOD_BATCH;
>  		else
>  			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
>  
> diff --git a/object-file.c b/object-file.c
> index 5258d9ed827..bdb0a38328f 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1895,6 +1895,8 @@ static void close_loose_object(int fd)
>  
>  	if (fsync_object_files > 0)
>  		fsync_or_die(fd, "loose object file");
> +	else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> +		fsync_loose_object_bulk_checkin(fd);
>  	else
>  		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
>  				       "loose object file");

This is related to the above comments about what minimum set of syscalls
is needed to trigger this "bulk" behavior, but it seems to me that this
whole API is avoiding just passing some new flags down to object-file.c
and friends.

E.g. for update-index this results in the "plug bulk" code not being
aware of HASH_WRITE_OBJECT, so with dry-run writes and the like we'll do
the whole setup/teardown for nothing.

Which is another reason I wondered why this couldn't be a flag passed
down to the object writing...

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-03-20  7:16   ` [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
@ 2022-03-21 17:03   ` Junio C Hamano
  2022-03-21 18:14     ` Neeraj Singh
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
  8 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 17:03 UTC (permalink / raw)
  To: Neeraj K. Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> V2 changes:
>
>  * Change doc to indicate that only some repo updates are batched

OK.

>  * Null and zero out control variables in do_batch_fsync under
>    unplug_bulk_checkin

OK.

>  * Make batch mode default on Windows.

I do not care either way ;-)

>  * Update the description for the initial patch that cleans up the
>    bulk-checkin infrastructure.

OK.

>  * Rebase onto 'seen' at 0cac37f38f9.

That's unfortunate.  Having to depend on almost everything in 'seen'
is a guaranteed way to ensure that the topic would never graduate to
'next'.

For this topic, ns/core-fsyncmethod is the only thing outside of
'master' that the previous round needed, so I did an equivalent of

    $ git checkout -b ns/batch-fsync b896f729e2
    $ git merge ns/core-fsyncmethod 

to prepare fd008b1442 and then queued the patches on top, i.e.

    $ git am -s mbox

> This work is based on 'seen' at . It's dependent on ns/core-fsyncmethod.

"at ."?

In any case, I've applied them on 0cac37f38f9 and then re-applied
the result on top of fd008b1442 (i.e. the same base as the previous
round was queued), which, with the magic of "am -3", applied
cleanly.  Double checking the result was also simple (i.e. the tip of
such an application on top of fd008b1442 can be merged with
0cac37f38f9 and the result should be identical to the result of
applying them directly on top of 0cac37f38f9) and seems to have
produced the right result.

Thanks.




* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
  2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
  2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 17:30     ` Junio C Hamano
  2022-03-21 20:23       ` Neeraj Singh
  2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
  3 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 17:30 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> +  updates in the disk writeback cache and then does a single full fsync of
> +  a dummy file to trigger the disk cache flush at the end of the operation.

It is unfortunate that we have a rather independent "unplug" that is
not tied to "this is the last operation in the batch"---if it were, we
wouldn't have to invent a dummy file; a single full sync on the real
file that happened to be the last one in the batch would be sufficient.
Hopefully it would not matter much, as long as the batch is of any
meaningful size.
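That alternative (question A from an earlier reply) would look roughly
like this sketch; whether fsync()ing the last real file is as safe as
the dummy-file flush is an assumption here, not something the thread has
established, and the helper name write_batch() is mine:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch: instead of a dummy cookie, keep the fd of the last object
 * file open across its rename and issue the one full fsync on it.
 */
static int write_batch(const char *dir, const char **names, int n)
{
	int last_fd = -1;
	char path[4096];
	int i;

	for (i = 0; i < n; i++) {
		int fd;

		snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0)
			return -1;
		if (write(fd, names[i], strlen(names[i])) < 0) {
			close(fd);
			return -1;
		}
		/* writeout-only flush, as in the batch mode proposal */
		sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
		if (i == n - 1)
			last_fd = fd;	/* keep the last one open */
		else
			close(fd);
	}
	if (last_fd < 0)
		return -1;
	/* the single full flush, on a real file instead of a dummy */
	if (fsync(last_fd) < 0) {
		close(last_fd);
		return -1;
	}
	return close(last_fd);
}
```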

> +/*
> + * Cleanup after batch-mode fsync_object_files.
> + */
> +static void do_batch_fsync(void)
> +{
> +	/*
> +	 * Issue a full hardware flush against a temporary file to ensure
> +	 * that all objects are durable before any renames occur.  The code in
> +	 * fsync_loose_object_bulk_checkin has already issued a writeout
> +	 * request, but it has not flushed any writeback cache in the storage
> +	 * hardware.
> +	 */
> +
> +	if (needs_batch_fsync) {
> +		struct strbuf temp_path = STRBUF_INIT;
> +		struct tempfile *temp;
> +
> +		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
> +		temp = xmks_tempfile(temp_path.buf);
> +		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
> +		delete_tempfile(&temp);
> +		strbuf_release(&temp_path);
> +		needs_batch_fsync = 0;
> +	}
> +
> +	if (bulk_fsync_objdir) {
> +		tmp_objdir_migrate(bulk_fsync_objdir);
> +		bulk_fsync_objdir = NULL;

The struct obtained from tmp_objdir_create() is consumed by
tmp_objdir_migrate() so the only clean-up left for the caller to do
is to clear it to NULL.  OK.

> +	}

This initially made me wonder why we need two independent flags.
After applying this patch but not any later steps, upon plugging, we
create the tentative object directory, and any loose object will be
created there, but because nobody calls the writeout-only variant
via fsync_loose_object_bulk_checkin() yet, needs_batch_fsync may not
be turned on.  But even in that case, any new loose objects are in
the tentative object directory and need to be migrated to the real
place.

And we may not cover all the existing code paths at the end of the
series, or any new code paths right away after they get introduced,
to be aware of the fsync_loose_object_bulk_checkin() when they
create a loose object file, so it is most likely that these two if
statements will be with us forever.

OK.

> @@ -274,6 +311,24 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
>  	return 0;
>  }
>  
> +void fsync_loose_object_bulk_checkin(int fd)
> +{
> +	/*
> +	 * If we have a plugged bulk checkin, we issue a call that
> +	 * cleans the filesystem page cache but avoids a hardware flush
> +	 * command. Later on we will issue a single hardware flush
> +	 * before as part of do_batch_fsync.
> +	 */
> +	if (bulk_checkin_plugged &&
> +	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
> +		assert(bulk_fsync_objdir);
> +		if (!needs_batch_fsync)
> +			needs_batch_fsync = 1;

Except for when we unplug, do we ever flip needs_batch_fsync bit
off, once it is set?  If the answer is no, wouldn't it be clearer to
unconditionally set it, instead of "set it only for the first time"?

> +	} else {
> +		fsync_or_die(fd, "loose object file");
> +	}
> +}
> +
>  int index_bulk_checkin(struct object_id *oid,
>  		       int fd, size_t size, enum object_type type,
>  		       const char *path, unsigned flags)
> @@ -288,6 +343,19 @@ int index_bulk_checkin(struct object_id *oid,
>  void plug_bulk_checkin(void)
>  {
>  	assert(!bulk_checkin_plugged);
> +
> +	/*
> +	 * A temporary object directory is used to hold the files
> +	 * while they are not fsynced.
> +	 */
> +	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> +		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> +		if (!bulk_fsync_objdir)
> +			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
> +
> +		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
> +	}
> +
>  	bulk_checkin_plugged = 1;
>  }
>  
> @@ -297,4 +365,6 @@ void unplug_bulk_checkin(void)
>  	bulk_checkin_plugged = 0;
>  	if (bulk_checkin_state.f)
>  		finish_bulk_checkin(&bulk_checkin_state);
> +
> +	do_batch_fsync();
>  }
> diff --git a/bulk-checkin.h b/bulk-checkin.h
> index b26f3dc3b74..08f292379b6 100644
> --- a/bulk-checkin.h
> +++ b/bulk-checkin.h
> @@ -6,6 +6,8 @@
>  
>  #include "cache.h"
>  
> +void fsync_loose_object_bulk_checkin(int fd);
> +
>  int index_bulk_checkin(struct object_id *oid,
>  		       int fd, size_t size, enum object_type type,
>  		       const char *path, unsigned flags);
> diff --git a/cache.h b/cache.h
> index 3160bc1e489..d1ae51388c9 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1040,7 +1040,8 @@ extern int use_fsync;
>  
>  enum fsync_method {
>  	FSYNC_METHOD_FSYNC,
> -	FSYNC_METHOD_WRITEOUT_ONLY
> +	FSYNC_METHOD_WRITEOUT_ONLY,
> +	FSYNC_METHOD_BATCH
>  };

Style.

These days we allow trailing comma to enum definitions.  Perhaps
give a trailing comma after _BATCH so that the next update patch
will become less noisy?

Thanks.


* Re: [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
  2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 17:50     ` Junio C Hamano
  2022-03-21 22:18       ` Neeraj Singh
  1 sibling, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 17:50 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 75d646377cc..38e9d7e88cb 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -5,6 +5,7 @@
>   */
>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>  #include "cache.h"
> +#include "bulk-checkin.h"
>  #include "config.h"
>  #include "lockfile.h"
>  #include "quote.h"
> @@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  
>  	the_index.updated_skipworktree = 1;
>  
> +	/* we might be adding many objects to the object database */
> +	plug_bulk_checkin();
> +
>  	/*
>  	 * Custom copy of parse_options() because we want to handle
>  	 * filename arguments as they come.
> @@ -1190,6 +1194,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  		strbuf_release(&buf);
>  	}
>  
> +	/* by now we must have added all of the new objects */
> +	unplug_bulk_checkin();

I understand the read-from-stdin code path would be worth plugging, but
the list of paths on the command line?  How many of them could one
possibly fit there?

Of course, the feeder may be expecting the objects to appear in
the object store as it feeds the paths and will be utterly broken by
this change, as you mentioned in the proposed log message.  The
existing plug/unplug will change the behaviour by making the objects
sent to the packfile available only after getting unplugged.  This
series makes it even worse by making loose objects also unavailable
until unplug is called.

So, it probably is safer and more sensible approach to introduce a
new command line option to allow the bulk checkin, and those who do
not care about the intermediate state to opt into the new feature.



* Re: [PATCH v2 4/7] unpack-objects: use the bulk-checkin infrastructure
  2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
@ 2022-03-21 17:55     ` Junio C Hamano
  2022-03-21 23:02       ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 17:55 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> The unpack-objects functionality is used by fetch, push, and fast-import
> to turn the transfered data into object database entries when there are
> fewer objects than the 'unpacklimit' setting.
>
> By enabling bulk-checkin when unpacking objects, we can take advantage
> of batched fsyncs.

This feels confused in that we dispatch to unpack-objects (instead
of index-pack) only when the number of loose objects should not
matter from a performance point of view, while bulk-checkin should
shine, performance-wise, only when there are enough objects to
batch.

Also if we ever add "too many small loose objects is wasteful, let's
send them into a single 'batch pack'" optimization, it would create
a funny situation where the caller sends the contents of a small
incoming packfile to unpack-objects, but the command chooses to
bunch them all together in a packfile anyway ;-)

So, I dunno.


> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  builtin/unpack-objects.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index dbeb0680a58..c55b6616aed 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "cache.h"
> +#include "bulk-checkin.h"
>  #include "config.h"
>  #include "object-store.h"
>  #include "object.h"
> @@ -503,10 +504,12 @@ static void unpack_all(void)
>  	if (!quiet)
>  		progress = start_progress(_("Unpacking objects"), nr_objects);
>  	CALLOC_ARRAY(obj_list, nr_objects);
> +	plug_bulk_checkin();
>  	for (i = 0; i < nr_objects; i++) {
>  		unpack_one(i);
>  		display_progress(progress, i + 1);
>  	}
> +	unplug_bulk_checkin();
>  	stop_progress(&progress);
>  
>  	if (delta_list)


* Re: [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
@ 2022-03-21 18:14     ` Neeraj Singh
  2022-03-21 20:49       ` Junio C Hamano
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 18:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj K. Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 10:03 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> >  * Rebase onto 'seen' at 0cac37f38f9.
>
> That's unfortunate.  Having to depend on almost everything in 'seen'
> is a guaranteed way to ensure that the topic would never graduate to
> 'next'.
>
> For this topic, ns/core-fsyncmethod is the only thing outside of
> 'master' that the previous round needed, so I did an equivalent of
>
>     $ git checkout -b ns/batch-fsync b896f729e2
>     $ git merge ns/core-fsyncmethod
>
> to prepare fd008b1442 and then queued the patches on top, i.e.
>
>     $ git am -s mbox
>
> > This work is based on 'seen' at . It's dependent on ns/core-fsyncmethod.
>
> "at ."?
>
> In any case, I've applied them on 0cac37f38f9 and then re-applied
> the result on top of fd008b1442 (i.e. the same base as the previous
> round was queued), which, with the magic of "am -3", applied
> cleanly.  Double checking the result was also simple (i.e. the tip of
> such an application on top of fd008b1442 can be merged with
> 0cac37f38f9 and the result should be identical to the result of
> applying them directly on top of 0cac37f38f9) and seems to have
> produced the right result.
>
> \Thanks.

Thanks Junio.  I was worried about how to properly represent the
dependency between these two in-flight branches without waiting for
ns/core-fsyncmethod to get into 'next'.  Now ns/core-fsyncmethod
appears to be there, so I'm assuming that branch should have a stable
OID until the end of the cycle.

Should I base future versions of this series on the tip of
ns/core-fsyncmethod, or on the merge point between that branch and
'next'?  I guess it doesn't really matter if the merge is clean.

Thanks,
Neeraj


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 18:28       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 18:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Mon, Mar 21, 2022 at 7:43 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> > [...]
> > +     if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> > +             bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> > +             if (!bulk_fsync_objdir)
> > +                     die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
>
> Should camel-case the config var, and we should have a die_errno() here
> which tell us why we couldn't create it (possibly needing to ferry it up
> from the tempfile API...)

Thanks for noticing the camelCasing.  The config var name was also
wrong.  Now it will read:
> > +                     die(_("Could not create temporary object directory for core.fsyncMethod=batch"));

Do you have any recommendations on how to easily ferry the correct
errno out of tmp_objdir_create?  It looks like the remerge-diff usage
has the same die behavior without errno, and the builtin/receive-pack.c
usage doesn't die, but also loses the errno.  I'm concerned about
preserving the errno across the tmp_objdir_destroy calls.  I could
introduce a temporary errno variable to preserve it across those.  Is
that what you had in mind?
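One common shape for that temporary-errno idea, as a sketch (the
function names below are stand-ins for tmp_objdir_create() and friends,
not Git's actual API):

```c
#include <errno.h>

/*
 * Stand-ins for the real operations: creation fails with a meaningful
 * errno, and cleanup clobbers errno as a side effect.
 */
static int create_tmp_objdir(void)
{
	errno = EACCES;
	return -1;
}

static void tmp_objdir_destroy(void)
{
	errno = EINVAL;	/* simulates cleanup clobbering errno */
}

/*
 * Preserve the errno from the failed creation across cleanup, so a
 * later die_errno()-style report sees the right value.
 */
static int create_or_fail(void)
{
	if (create_tmp_objdir() < 0) {
		int saved_errno = errno;

		tmp_objdir_destroy();	/* may clobber errno */
		errno = saved_errno;
		return -1;
	}
	return 0;
}
```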

Thanks,
Neeraj


* Re: [PATCH v2 6/7] core.fsyncmethod: tests for batch mode
  2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-21 18:34     ` Junio C Hamano
  2022-03-22  5:54       ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 18:34 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> Add test cases to exercise batch mode for:
>  * 'git add'

I was wondering why the obviously safe and good candidate 'git add' is
not gaining a plug/unplug pair in this series.  It is obviously safe,
unlike 'update-index', in that nobody can interact with it, observe its
intermediate output, and expect anything from it.

I think the stupid reason for the lack of new plug/unplug is that we
already had them, which is good ;-).

>  * 'git stash'
>  * 'git update-index'

As I said, I suspect that we'd want to do this safely by adding a
new option to "update-index" and passing it from "stash" which knows
that it does not care about the intermediate state.

> These tests ensure that the added data winds up in the object database.

In other words, "git add $path; git rev-parse :$path" (and its
cousins) would be happy?  Like new object files not left hanging in
a tentative object store etc. _after_ the commands finish.

Good.

> In this change we introduce a new test helper lib-unique-files.sh. The
> goal of this library is to create a tree of files that have different
> oids from any other files that may have been created in the current test
> repo. This helps us avoid missing validation of an object being added due
> to it already being in the repo.

More on this below.

> We aren't actually issuing any fsyncs in these tests, since
> GIT_TEST_FSYNC is 0, but we still exercise all of the tmp_objdir logic
> in bulk-checkin.

Shouldn't we manually override that, if it matters?
Not a suggestion but a question.

> +# Create multiple files with unique contents. Takes the number of
> +# directories, the number of files in each directory, and the base
> +# directory.

This is more honest, compared to the claim made in the proposed log
message, in that the uniqueness guarantee is only among the files
created by this helper.  If we created other test contents without
using this helper, they may clash with the ones created here.

> +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> +#					 each in my_dir, all with unique
> +#					 contents.
> +
> +test_create_unique_files() {

Style.  SP on both sides of ().  I.e.

	test_create_unique_files () {

> +	test "$#" -ne 3 && BUG "3 param"
> +
> +	local dirs=$1
> +	local files=$2
> +	local basedir=$3
> +	local counter=0
> +	test_tick
> +	local basedata=$test_tick

I am not sure if consumption and reliance on tick is a wise thing.
$basedir must be unique across all the other directories in this
test repository (there is no other $basedir)---can't we key
uniqueness off of it?

> +	rm -rf $basedir

Can $basedir have any $IFS character in it?  We should "$quote" it.

> +	for i in $(test_seq $dirs)
> +	do
> +		local dir=$basedir/dir$i
> +
> +		mkdir -p "$dir"
> +		for j in $(test_seq $files)
> +		do
> +			counter=$((counter + 1))
> +			echo "$basedata.$counter"  >"$dir/file$j.txt"

An extra SP before ">"?

> +		done
> +	done
> +}

There is no &&- cascade here, and we expect nothing in this to
fail.  Is that sensible?

> +test_expect_success 'git add: core.fsyncmethod=batch' "
> +	test_create_unique_files 2 4 fsync-files &&
> +	git $BATCH_CONFIGURATION add -- ./fsync-files/ &&
> +	rm -f fsynced_files &&
> +	git ls-files --stage fsync-files/ > fsynced_files &&

Style.  No SP between redirection operator and its target.  I.e.

	git ls-files --stage fsync-files/ >fsynced_files &&

The mixture of names-with-dash and names_with_underscore looks somewhat
irritating.

> +	test_line_count = 8 fsynced_files &&

The magic "8" matches "2 4" we saw earlier for create_unique_files?

> +	awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e

A test helper that takes the name of a file that has "ls-files -s" output
may prove to be useful.  I dunno.

> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> index a11d61206ad..8e2f73cc68f 100755
> --- a/t/t5300-pack-object.sh
> +++ b/t/t5300-pack-object.sh
> @@ -162,23 +162,25 @@ test_expect_success 'pack-objects with bogus arguments' '
>  
>  check_unpack () {
>  	test_when_finished "rm -rf git2" &&
> -	git init --bare git2 &&
> -	git -C git2 unpack-objects -n <"$1".pack &&
> -	git -C git2 unpack-objects <"$1".pack &&
> -	(cd .git && find objects -type f -print) |
> -	while read path
> -	do
> -		cmp git2/$path .git/$path || {
> -			echo $path differs.
> -			return 1
> -		}
> -	done
> +	git $2 init --bare git2 &&
> +	(
> +		git $2 -C git2 unpack-objects -n <"$1".pack &&
> +		git $2 -C git2 unpack-objects <"$1".pack &&
> +		git $2 -C git2 cat-file --batch-check="%(objectname)"
> +	) <obj-list >current &&
> +	cmp obj-list current
>  }

I think the change from the old "the existence and the contents of
the object files must all match" to the new "cat-file should say
that the objects we expect to exist indeed do" is not a bad thing.

We used to only depend on the contents of the provided packfile but
now we assume that obj-list file gives us the list of objects.  Is
that sensible?  I somehow do not think so.  Don't we have the
corresponding "$1.idx" that we can feed to "git show-index", e.g.

	git show-index <"$1.pack" >expect.full &&
	cut -d" " -f2 >expect <expect.full &&
	... your test in "$2", but feeding expect instead of obj-list ...
	test_cmp expect actual

Also make sure you quote whatever is coming from outside, even if
you happen to call the helper with tokens that do not need quoting
in the current code.  It is a good discipline to help readers.

Thanks.


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 20:14       ` Neeraj Singh
  2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 20:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Mon, Mar 21, 2022 at 9:52 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > One major source of the cost of fsync is the implied flush of the
> > hardware writeback cache within the disk drive. This commit introduces
> > a new `core.fsyncMethod=batch` option that batches up hardware flushes.
> > It hooks into the bulk-checkin plugging and unplugging functionality,
> > takes advantage of tmp-objdir, and uses the writeout-only support code.
> >
> > When the new mode is enabled, we do the following for each new object:
> > 1. Create the object in a tmp-objdir.
> > 2. Issue a pagecache writeback request and wait for it to complete.
> >
> > At the end of the entire transaction when unplugging bulk checkin:
> > 1. Issue an fsync against a dummy file to flush the hardware writeback
> >    cache, which should by now have seen the tmp-objdir writes.
> > 2. Rename all of the tmp-objdir files to their final names.
> > 3. When updating the index and/or refs, we assume that Git will issue
> >    another fsync internal to that operation. This is not the default
> >    today, but the user now has the option of syncing the index and there
> >    is a separate patch series to implement syncing of refs.
>
> Re my question in
> https://lore.kernel.org/git/220310.86r179ki38.gmgdl@evledraar.gmail.com/
> (which you *partially* replied to per my reading, i.e. not the
> fsync_nth() question) I still don't get why the tmp-objdir part of this
> is needed.
>

Sorry for not fully answering your question.  I think part of the
issue might be background: it's not clear to me what's different
between your understanding and mine, so I may not have included
something that's questionable to you but not to me.

Your syscall description below makes the issues very concrete, so I
think we'll get it this round :).

> For "git stash" which is one thing sped up by this let's go over what
> commands/FS ops we do. I changed the test like this:
>
>         diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
>         index 3fc16944e9e..479a495c68c 100755
>         --- a/t/t3903-stash.sh
>         +++ b/t/t3903-stash.sh
>         @@ -1383,7 +1383,7 @@ BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
>
>          test_expect_success 'stash with core.fsyncmethod=batch' "
>                 test_create_unique_files 2 4 fsync-files &&
>         -       git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
>         +       strace -f git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
>                 rm -f fsynced_files &&
>
>                 # The files were untracked, so use the third parent,
>
> Then we get this output, with my comments, and I snipped some output:
>
>         $ ./t3903-stash.sh --run=1-4,114 -vixd 2>&1|grep --color -e 89772c935031c228ed67890f9 -e .git/stash -e bulk_fsync -e .git/index
>         [pid 14703] access(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
>         [pid 14703] access(".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
>         [pid 14703] link(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/tmp_obj_bdUlzu", ".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0
>
> Here we're creating the tmp_objdir() files. We then sync_file_range()
> and close() this.
>
>         [pid 14703] openat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/objects/tmp_objdir-bulk-fsync-rR3AQI/bulk_fsync_HsDRl7", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
>         [pid 14703] unlink("/home/avar/g/git/t/trash directory.t3903-stash/.git/objects/tmp_objdir-bulk-fsync-rR3AQI/bulk_fsync_HsDRl7") = 0
>
> This is the flushing of the "cookie" in do_batch_fsync().
>
>         [pid 14703] newfstatat(AT_FDCWD, ".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, 0) = 0
>         [pid 14703] link(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316", ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0
>
> Here we're going through the object dir migration with
> unplug_bulk_checkin().
>
>         [pid 14703] unlink(".git/objects/tmp_objdir-bulk-fsync-rR3AQI/fb/89772c935031c228ed67890f953c0a2b5c8316") = 0
>         newfstatat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, AT_SYMLINK_NOFOLLOW) = 0
>         [pid 14705] access(".git/objects/tmp_objdir-bulk-fsync-0F7DGy/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = -1 ENOENT (No such file or directory)
>         [pid 14705] access(".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", F_OK) = 0
>         [pid 14705] utimensat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", NULL, 0) = 0
>         [pid 14707] openat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", O_RDONLY|O_CLOEXEC) = 9
>
> We then update the index itself, first a temporary index.stash :
>
>     openat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 8
>     openat(AT_FDCWD, ".git/index.stash.19141", O_RDONLY) = 9
>     newfstatat(AT_FDCWD, ".git/objects/fb/89772c935031c228ed67890f953c0a2b5c8316", {st_mode=S_IFREG|0444, st_size=29, ...}, AT_SYMLINK_NOFOLLOW) = 0
>     newfstatat(AT_FDCWD, "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", {st_mode=S_IFREG|0644, st_size=927, ...}, 0) = 0
>     rename("/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141.lock", "/home/avar/g/git/t/trash directory.t3903-stash/.git/index.stash.19141") = 0
>     unlink(".git/index.stash.19141")        = 0
>
> Followed by the same and a later rename of the actual index:
>
>     [pid 19146] rename("/home/avar/g/git/t/trash directory.t3903-stash/.git/index.lock", "/home/avar/g/git/t/trash directory.t3903-stash/.git/index") = 0
>
> So, my question is still why the temporary object dir migration part of
> this is needed.
>
> We are writing N loose object files, and we write those to temporary
> names already.
>
> AFAIKT we could do all of this by doing the same
> tmp/rename/sync_file_range dance on the main object store.
>

Why not the main object store? We want to maintain the invariant that
any name in the main object store refers to a file that durably has
the correct contents.  If we do sync_file_range and then rename, and
then crash, we could be left with a file in the main object store
with some SHA name, whose contents may or may not match the SHA.
However, if we ensure an fsync happens before the rename, a crash at
any point will leave us either with no file in the main object store
or with a file that is durable on disk.
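A sketch, in the same syscall pseudocode used elsewhere in this
thread (names illustrative, not from the actual patch), of the crash
window that fsync-before-rename closes:

    /* unsafe: writeout-only flush, then expose the final name */
    write("x.tmp");
    sync_file_range("x.tmp"); /* data may still sit in the disk cache */
    rename("x.tmp", "x");
    /* crash here: "x" may survive with contents not matching its SHA */

    /* safe: a real flush before any final name appears */
    write("x.tmp");
    sync_file_range("x.tmp");
    fsync("cookie");          /* flushes filesystem log + disk cache */
    rename("x.tmp", "x");
    /* crash anywhere: either no "x", or a durable, correct "x" */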

> Then instead of the "bulk_fsync" cookie file don't close() the last file
> object file we write until we issue the fsync on it.
>
> But maybe this is all needed, I just can't understand from the commit
> message why the "bulk checkin" part is being done.
>
> I think since we've been over this a few times without any success it
> would really help to have some example of the smallest set of syscalls
> to write a file like this safely. I.e. this is doing (pseudocode):
>
>     /* first the bulk path */
>     open("bulk/x.tmp");
>     write("bulk/x.tmp");
>     sync_file_range("bulk/x.tmp");
>     close("bulk/x.tmp");
>     rename("bulk/x.tmp", "bulk/x");
>     open("bulk/y.tmp");
>     write("bulk/y.tmp");
>     sync_file_range("bulk/y.tmp");
>     close("bulk/y.tmp");
>     rename("bulk/y.tmp", "bulk/y");
>     /* Rename to "real" */
>     rename("bulk/x", x");
>     rename("bulk/y", y");
>     /* sync a cookie */
>     fsync("cookie");
>

The '/* Rename to "real" */' and '/* sync a cookie */' steps are
reversed in your sequence above. It should be:

1: (for each file)
    a) open
    b) write
    c) sync_file_range
    d) close
    e) rename in tmp_objdir -- this rename step is not strictly
       required for bulk fsync; an earlier version of this series
       didn't do it, but Jeff King pointed out that it is required
       for concurrency:
       https://lore.kernel.org/all/YVOrikAl%2Fu5%2FVi61@coredump.intra.peff.net/

2: fsync something on the same volume to flush the filesystem log and
   the disk cache. This functions as a "barrier".

3: Rename to the final names.  At this point we know that the
   contents are durable, so if the final name exists, we can read
   through it to get the data.
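Putting steps 1-3 together in pseudocode (paths abbreviated and
illustrative, not the literal strace from earlier):

    /* 1: per object: write, writeout-only flush, rename in tmp objdir */
    open("tmp_objdir/fb/obj.tmp");
    write("tmp_objdir/fb/obj.tmp");
    sync_file_range("tmp_objdir/fb/obj.tmp");
    close("tmp_objdir/fb/obj.tmp");
    rename("tmp_objdir/fb/obj.tmp", "tmp_objdir/fb/<oid>");
    /* ... repeat for the other objects in the batch ... */

    /* 2: one real flush on the same volume acts as the barrier */
    fsync("tmp_objdir/bulk_fsync_cookie");

    /* 3: contents are durable; now expose the final names */
    rename("tmp_objdir/fb/<oid>", ".git/objects/fb/<oid>");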

> And I'm asking why it's not:
>
>     /* Rename to "real" as we go */
>     open("x.tmp");
>     write("x.tmp");
>     sync_file_range("x.tmp");
>     close("x.tmp");
>     rename("x.tmp", "x");
>     last_fd = open("y.tmp"); /* don't close() the last one yet */
>     write("y.tmp");
>     sync_file_range("y.tmp");
>     rename("y.tmp", "y");
>     /* sync a cookie */
>     fsync(last_fd);
>
> Which I guess is two questions:
>
>  A. do we need the cookie, or can we re-use the fd of the last thing we
>     write?

We can re-use the FD of the last thing we write, but that results in
a trickier API which is more intrusive on callers. I was originally
using a lockfile, but found a usage in unpack-objects where there was
no lockfile.

>  B. Is the bulk indirection needed?
>

Hopefully the explanation above makes it clear why we need the
indirection.  To state it again: we need a real fsync before creating
the final name in the objdir; otherwise, after a crash, a name could
exist that points at contents that were lost because they weren't yet
durable.  I updated the comment in do_batch_fsync to make this a
little clearer.

> > +             fsync_or_die(fd, "loose object file");
>
> Unrelated nit: this API is producing sentence lego unfriendly to
> translators.
>
> Should be made to take an enum or something, so we can emit the relevant
> translated message in fsync_or_die(). Imagine getting:
>
>         fsync error on '日本語は話せません'
>
> Which this will do, just the other way around for non-English speakers
> using the translation.
>
> (The solution is also not to add _() here, since translators will want
> to control the word order.)

This line is copied from the preexisting version of the same code in
close_loose_object.  If I'm understanding it correctly, the entire
chain of messages is untranslated and would remain in English;
fsync_or_die doesn't have a _().  Can we just leave it that way,
since this is not a situation that should actually happen to many
users?  Alternatively, I think it would be pretty trivial to just
pass through the file name, so I'll do that.

> > diff --git a/cache.h b/cache.h
> > index 3160bc1e489..d1ae51388c9 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1040,7 +1040,8 @@ extern int use_fsync;
> >
> >  enum fsync_method {
> >       FSYNC_METHOD_FSYNC,
> > -     FSYNC_METHOD_WRITEOUT_ONLY
> > +     FSYNC_METHOD_WRITEOUT_ONLY,
> > +     FSYNC_METHOD_BATCH
> >  };
> >
> >  extern enum fsync_method fsync_method;
> > @@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *);
> >  int fsync_component(enum fsync_component component, int fd);
> >  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
> >
> > +static inline int batch_fsync_enabled(enum fsync_component component)
> > +{
> > +     return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
> > +}
> > +
> >  ssize_t read_in_full(int fd, void *buf, size_t count);
> >  ssize_t write_in_full(int fd, const void *buf, size_t count);
> >  ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
> > diff --git a/config.c b/config.c
> > index 261ee7436e0..0b28f90de8b 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
> >                       fsync_method = FSYNC_METHOD_FSYNC;
> >               else if (!strcmp(value, "writeout-only"))
> >                       fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
> > +             else if (!strcmp(value, "batch"))
> > +                     fsync_method = FSYNC_METHOD_BATCH;
> >               else
> >                       warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
> >
> > diff --git a/object-file.c b/object-file.c
> > index 5258d9ed827..bdb0a38328f 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -1895,6 +1895,8 @@ static void close_loose_object(int fd)
> >
> >       if (fsync_object_files > 0)
> >               fsync_or_die(fd, "loose object file");
> > +     else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> > +             fsync_loose_object_bulk_checkin(fd);
> >       else
> >               fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
> >                                      "loose object file");
>
> This is related to the above comments about what minimum set of syscalls
> are needed to trigger this "bulk" behavior, but it seems to me that this
> whole API is avoiding just passing some new flags down to object-file.c
> and friends.
>
> For e.g. update-index that results in e.g. the "plug bulk" not being
> aware of HASH_WRITE_OBJECT, so with dry-run writes and the like we'll do
> the whole setup/teardown for nothing.
>
> Which is another reason I wondered why this couldn't be a flagged passed
> down to the object writing...

In the original implementation [1], I did some custom thing for
renaming the files rather than using tmp_objdir.  But you suggested
at the time that I use tmp_objdir, which was a good decision, since
it made access to the objects possible in-process and for descendants
in the middle of the transaction.

It sounds to me like I just shouldn't plug the bulk checkin for cases
where we're not going to add to the ODB; plugging the bulk checkin is
always optional.  But when I wrote that code, I didn't love the
result, since it makes things harder for arbitrary callers.  So I
changed the code to lazily create the tmp objdir the first time an
object shows up, which has the same effect of avoiding the cost when
we aren't adding any objects.  This also avoids the need to write an
error message, since failing to create the tmp objdir will just
result in a plain fsync.  The main downside here is that the lazy
creation is another thing that will have to change if we want to make
adding to the ODB multithreaded.
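As a C sketch (against the tmp_objdir API this series already uses;
the function name and exact placement are illustrative, not the final
patch), the lazy creation amounts to:

    /* Called before the first loose-object write while plugged. */
    static void lazy_create_bulk_fsync_objdir(void)
    {
            if (!bulk_checkin_plugged || bulk_fsync_objdir)
                    return;
            bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
            if (bulk_fsync_objdir)
                    tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
            /* on failure, fall back to a plain per-object fsync */
    }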

Thanks,
Neeraj

[1] https://lore.kernel.org/all/12cad737635663ed596e52f89f0f4f22f58bfe38.1632176111.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-21 20:14       ` Neeraj Singh
@ 2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
  2022-03-22  0:13           ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 20:18 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh


On Mon, Mar 21 2022, Neeraj Singh wrote:

[Don't have time for a full reply, sorry, just something quick]

> On Mon, Mar 21, 2022 at 9:52 AM Ævar Arnfjörð Bjarmason
> [...]
>> So, my question is still why the temporary object dir migration part of
>> this is needed.
>>
>> We are writing N loose object files, and we write those to temporary
>> names already.
>>
>> AFAIKT we could do all of this by doing the same
>> tmp/rename/sync_file_range dance on the main object store.
>>
>
> Why not the main object store? We want to maintain the invariant
> that any name in the main object store refers to a file that durably
> has the correct contents.  If we do sync_file_range and then rename,
> and then crash, we could be left with a file in the main object
> store with some SHA name, whose contents may or may not match the
> SHA.  However, if we ensure an fsync happens before the rename, a
> crash at any point will leave us either with no file in the main
> object store or with a file that is durable on disk.

Ah, I see.

Why does that matter? If the "bulk" mode works as advertised we might
have such a corrupt loose or pack file, but we won't have anything
referring to it as far as reachability goes.

I'm aware that the various code paths that handle OID writing don't
deal too well with this in practice, to say the least, which one can
try with, say:

    $ echo foo | git hash-object -w --stdin
    45b983be36b73c0788dc9cbcb76cbb80fc7bb057
    $ echo | sudo tee .git/objects/45/b983be36b73c0788dc9cbcb76cbb80fc7bb057

I.e. "fsck", "show" etc. will all scream bloody murder, and
re-running that hash-object command even returns success (we see the
object is there already and think it's OK).

But in any case, I think it would be much easier to both review and
reason about this code if these concerns were split up.

I.e. things that want no fsync at all (I'd think especially so) might
want to have such updates serialized in this manner, and as Junio
pointed out, making these things inseparable as you've done creates
API concerns & fallout that have nothing to do with what we need for
the performance gains of the bulk-checkin fsyncing technique, e.g.
concurrent "update-index" consumers not being able to assume that
reported objects exist as soon as they're reported.

>> Then instead of the "bulk_fsync" cookie file don't close() the last file
>> object file we write until we issue the fsync on it.
>>
>> But maybe this is all needed, I just can't understand from the commit
>> message why the "bulk checkin" part is being done.
>>
>> I think since we've been over this a few times without any success it
>> would really help to have some example of the smallest set of syscalls
>> to write a file like this safely. I.e. this is doing (pseudocode):
>>
>>     /* first the bulk path */
>>     open("bulk/x.tmp");
>>     write("bulk/x.tmp");
>>     sync_file_range("bulk/x.tmp");
>>     close("bulk/x.tmp");
>>     rename("bulk/x.tmp", "bulk/x");
>>     open("bulk/y.tmp");
>>     write("bulk/y.tmp");
>>     sync_file_range("bulk/y.tmp");
>>     close("bulk/y.tmp");
>>     rename("bulk/y.tmp", "bulk/y");
>>     /* Rename to "real" */
>>     rename("bulk/x", x");
>>     rename("bulk/y", y");
>>     /* sync a cookie */
>>     fsync("cookie");
>>
>
> The '/* Rename to "real" */' and '/* sync a cookie */' steps are
> reversed in your above sequence. It should be

Sorry.

> 1: (for each file)
>     a) open
>     b) write
>     c) sync_file_range
>     d) close
>     e) rename in tmp_objdir -- this rename step is not strictly
>        required for bulk fsync; an earlier version of this series
>        didn't do it, but Jeff King pointed out that it is required
>        for concurrency:
>        https://lore.kernel.org/all/YVOrikAl%2Fu5%2FVi61@coredump.intra.peff.net/

Yes, we definitely need the rename; I was wondering why we needed it
twice for each file, but that was answered above.

>> And I'm asking why it's not:
>>
>>     /* Rename to "real" as we go */
>>     open("x.tmp");
>>     write("x.tmp");
>>     sync_file_range("x.tmp");
>>     close("x.tmp");
>>     rename("x.tmp", "x");
>>     last_fd = open("y.tmp"); /* don't close() the last one yet */
>>     write("y.tmp");
>>     sync_file_range("y.tmp");
>>     rename("y.tmp", "y");
>>     /* sync a cookie */
>>     fsync(last_fd);
>>
>> Which I guess is two questions:
>>
>>  A. do we need the cookie, or can we re-use the fd of the last thing we
>>     write?
>
> We can re-use the FD of the last thing we write, but that results
> in a trickier API which is more intrusive on callers. I was
> originally using a lockfile, but found a usage in unpack-objects
> where there was no lockfile.

Ok, so it's something we could do, but passing down 2-3 functions to
object-file.c was a hassle.

I tried to hack that up earlier and found that it wasn't *too
bad*. I.e. we'd pass some "flags" about our intent, and amend various
functions to take "don't close this one" and pass up the fd (or even do
that as a global).

In any case, having the commit message clearly document what's needed
for what & what's essential & just shortcut taken for the convenience of
the current implementation would be really useful.

Then we can always e.g. change this later to just do the fsync() on
the last of the N files we write.

[Ran out of time, sorry]


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
@ 2022-03-21 20:23       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 20:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 10:30 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> > +  updates in the disk writeback cache and then does a single full fsync of
> > +  a dummy file to trigger the disk cache flush at the end of the operation.
>
> It is unfortunate that we have a rather independent "unplug" that is
> not tied to the "this is the last operation in the batch"---if there
> were we didn't have to invent a dummy but a single full sync on the
> real file who happened to be the last one in the batch would be
> sufficient.  It would not matter, if the batch is any meaningful
> size, hopefully.
>

I'm banking on a large batch size, or on the additional cost of
creating and syncing an empty file being so small that it won't be
noticeable even for small batches.  The current unfortunate scheme at
least has a very simple API that's easy to apply to any other
operation going forward.  For instance, builtin/hash-object.c might
be another good candidate, but it wasn't clear to me whether it's
used for any mainline scenario.

> > +/*
> > + * Cleanup after batch-mode fsync_object_files.
> > + */
> > +static void do_batch_fsync(void)
> > +{
> > +     /*
> > +      * Issue a full hardware flush against a temporary file to ensure
> > +      * that all objects are durable before any renames occur.  The code in
> > +      * fsync_loose_object_bulk_checkin has already issued a writeout
> > +      * request, but it has not flushed any writeback cache in the storage
> > +      * hardware.
> > +      */
> > +
> > +     if (needs_batch_fsync) {
> > +             struct strbuf temp_path = STRBUF_INIT;
> > +             struct tempfile *temp;
> > +
> > +             strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
> > +             temp = xmks_tempfile(temp_path.buf);
> > +             fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
> > +             delete_tempfile(&temp);
> > +             strbuf_release(&temp_path);
> > +             needs_batch_fsync = 0;
> > +     }
> > +
> > +     if (bulk_fsync_objdir) {
> > +             tmp_objdir_migrate(bulk_fsync_objdir);
> > +             bulk_fsync_objdir = NULL;
>
> The struct obtained from tmp_objdir_create() is consumed by
> tmp_objdir_migrate() so the only clean-up left for the caller to do
> is to clear it to NULL.  OK.
>
> > +     }
>
> This initially made me wonder why we need two independent flags.
> After applying this patch but not any later steps, upon plugging, we
> create the tentative object directory, and any loose object will be
> created there, but because nobody calls the writeout-only variant
> via fsync_loose_object_bulk_checkin() yet, needs_batch_fsync may not
> be turned on.  But even in that case, any new loose objects are in
> the tentative object directory and need to be migrated to the real
> place.
>
> And we may not cover all the existing code paths at the end of the
> series, or any new code paths right away after they get introduced,
> to be aware of the fsync_loose_object_bulk_checkin() when they
> create a loose object file, so it is most likely that these two if
> statements will be with us forever.
>
> OK.

After Ævar's last feedback, I've changed this to lazily create the
objdir, so the existence of an objdir is a suitable proxy for there
being something worth syncing.  The potential downside is that the
lazy creation would need to be synchronized if the ODB becomes
multithreaded.

>
> > @@ -274,6 +311,24 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
> >       return 0;
> >  }
> >
> > +void fsync_loose_object_bulk_checkin(int fd)
> > +{
> > +     /*
> > +      * If we have a plugged bulk checkin, we issue a call that
> > +      * cleans the filesystem page cache but avoids a hardware flush
> > +      * command. Later on we will issue a single hardware flush
> > +      * before as part of do_batch_fsync.
> > +      */
> > +     if (bulk_checkin_plugged &&
> > +         git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
> > +             assert(bulk_fsync_objdir);
> > +             if (!needs_batch_fsync)
> > +                     needs_batch_fsync = 1;
>
> Except for when we unplug, do we ever flip needs_batch_fsync bit
> off, once it is set?  If the answer is no, wouldn't it be clearer to
> unconditionally set it, instead of "set it only for the first time"?
>

This code is now gone.  I was stupidly optimizing for a future
multithreaded world which might never come.

> > +     } else {
> > +             fsync_or_die(fd, "loose object file");
> > +     }
> > +}
> > +
> >  int index_bulk_checkin(struct object_id *oid,
> >                      int fd, size_t size, enum object_type type,
> >                      const char *path, unsigned flags)
> > @@ -288,6 +343,19 @@ int index_bulk_checkin(struct object_id *oid,
> >  void plug_bulk_checkin(void)
> >  {
> >       assert(!bulk_checkin_plugged);
> > +
> > +     /*
> > +      * A temporary object directory is used to hold the files
> > +      * while they are not fsynced.
> > +      */
> > +     if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> > +             bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> > +             if (!bulk_fsync_objdir)
> > +                     die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
> > +
> > +             tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
> > +     }
> > +
> >       bulk_checkin_plugged = 1;
> >  }
> >
> > @@ -297,4 +365,6 @@ void unplug_bulk_checkin(void)
> >       bulk_checkin_plugged = 0;
> >       if (bulk_checkin_state.f)
> >               finish_bulk_checkin(&bulk_checkin_state);
> > +
> > +     do_batch_fsync();
> >  }
> > diff --git a/bulk-checkin.h b/bulk-checkin.h
> > index b26f3dc3b74..08f292379b6 100644
> > --- a/bulk-checkin.h
> > +++ b/bulk-checkin.h
> > @@ -6,6 +6,8 @@
> >
> >  #include "cache.h"
> >
> > +void fsync_loose_object_bulk_checkin(int fd);
> > +
> >  int index_bulk_checkin(struct object_id *oid,
> >                      int fd, size_t size, enum object_type type,
> >                      const char *path, unsigned flags);
> > diff --git a/cache.h b/cache.h
> > index 3160bc1e489..d1ae51388c9 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1040,7 +1040,8 @@ extern int use_fsync;
> >
> >  enum fsync_method {
> >       FSYNC_METHOD_FSYNC,
> > -     FSYNC_METHOD_WRITEOUT_ONLY
> > +     FSYNC_METHOD_WRITEOUT_ONLY,
> > +     FSYNC_METHOD_BATCH
> >  };
>
> Style.
>
> These days we allow trailing comma to enum definitions.  Perhaps
> give a trailing comma after _BATCH so that the next update patch
> will become less noisy?
>

Fixed.

> Thanks.

Thanks!
-Neeraj


* Re: [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-21 18:14     ` Neeraj Singh
@ 2022-03-21 20:49       ` Junio C Hamano
  0 siblings, 0 replies; 175+ messages in thread
From: Junio C Hamano @ 2022-03-21 20:49 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj K. Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

Neeraj Singh <nksingh85@gmail.com> writes:

>> In any case, I've applied them on 0cac37f38f9 and then re-applied
>> the result on top of fd008b1442 (i.e. the same base as the previous
>> round was queued), which, with the magic of "am -3", applied
>> cleanly.  Double checking the result was also simple (i.e. the tip of
>> such an application on top of fd008b1442 can be merged with
>> 0cac37f38f9 and the result should be identical to the result of
>> applying them directly on top of 0cac37f38f9) and seems to have
>> produced the right result.
>>
>> Thanks.
>
> Thanks, Junio.  I was worried about how to properly represent the
> dependency between these two in-flight branches without waiting for
> ns/core-fsyncmethod to get into 'next'.  Now ns/core-fsyncmethod
> appears to be there, so I'm assuming that branch should have a
> stable OID until the end of the cycle.
>
> Should I base future versions of this series on the tip of
> ns/core-fsyncmethod, or on the merge point between that branch and
> 'next'?

Please base it on fd008b1442 (i.e. the same base as this and the
previous round was queued on), unless there is a strong reason to
rebase elsewhere.

Thanks.


* Re: [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
@ 2022-03-21 22:09       ` Neeraj Singh
  2022-03-21 23:16         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 22:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Mon, Mar 21, 2022 at 8:04 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > The update-index functionality is used internally by 'git stash push' to
> > setup the internal stashed commit.
> >
> > This change enables bulk-checkin for update-index infrastructure to
> > speed up adding new objects to the object database by leveraging the
> > batch fsync functionality.
> >
> > There is some risk with this change, since under batch fsync, the object
> > files will be in a tmp-objdir until update-index is complete.  This
> > usage is unlikely, since any tool invoking update-index and expecting to
> > see objects would have to synchronize with the update-index process
> > after passing it a file path.
> >
> > Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> > ---
> >  builtin/update-index.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/builtin/update-index.c b/builtin/update-index.c
> > index 75d646377cc..38e9d7e88cb 100644
> > --- a/builtin/update-index.c
> > +++ b/builtin/update-index.c
> > @@ -5,6 +5,7 @@
> >   */
> >  #define USE_THE_INDEX_COMPATIBILITY_MACROS
> >  #include "cache.h"
> > +#include "bulk-checkin.h"
> >  #include "config.h"
> >  #include "lockfile.h"
> >  #include "quote.h"
> > @@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >
> >       the_index.updated_skipworktree = 1;
> >
> > +     /* we might be adding many objects to the object database */
> > +     plug_bulk_checkin();
> > +
>
> Shouldn't this be after parse_options_start()?

Does it make a difference?  Especially if we do the object dir creation lazily?


* Re: [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-21 17:50     ` Junio C Hamano
@ 2022-03-21 22:18       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 22:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 10:50 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/builtin/update-index.c b/builtin/update-index.c
> > index 75d646377cc..38e9d7e88cb 100644
> > --- a/builtin/update-index.c
> > +++ b/builtin/update-index.c
> > @@ -5,6 +5,7 @@
> >   */
> >  #define USE_THE_INDEX_COMPATIBILITY_MACROS
> >  #include "cache.h"
> > +#include "bulk-checkin.h"
> >  #include "config.h"
> >  #include "lockfile.h"
> >  #include "quote.h"
> > @@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >
> >       the_index.updated_skipworktree = 1;
> >
> > +     /* we might be adding many objects to the object database */
> > +     plug_bulk_checkin();
> > +
> >       /*
> >        * Custom copy of parse_options() because we want to handle
> >        * filename arguments as they come.
> > @@ -1190,6 +1194,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >               strbuf_release(&buf);
> >       }
> >
> > +     /* by now we must have added all of the new objects */
> > +     unplug_bulk_checkin();
>
> I understand read-from-stdin code path would be worth plugging, but
> the list of paths on the command line?  How many of them would one
> fit?
>

do_reupdate could touch all the files in the index.  Also, one can
pass a directory and re-add all the files under it.

> Of course, the feeder may be expecting for the objects to appear in
> the object store as it feeds the paths and will be utterly broken by
> this change, as you mentioned in the proposed log message.  The
> existing plug/unplug will change the behaviour by making the objects
> sent to the packfile available only after getting unplugged.  This
> series makes it even worse by making loose objects also unavailable
> until unplug is called.
>
> So, it probably is safer and more sensible approach to introduce a
> new command line option to allow the bulk checkin, and those who do
> not care about the intermediate state to opt into the new feature.
>

I don't believe this usage is likely today.  How would the feeder
know when it can expect to find an object in the object directory
after passing something on stdin?  When fed via stdin,
git-update-index adds the object to the object database
asynchronously, leaving the feeder no indication of when that
actually happens, other than that it happens before the
git-update-index process terminates.  I used to have a comment here
about the feeder being able to parse the --verbose output to get
feedback from git-update-index, which would be quite tricky; I
thought it was unnecessarily detailed.
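For illustration, the smallest such feeder looks like this
(hypothetical shell sketch; the only synchronization point it has is
update-index exiting):

```shell
# Hypothetical feeder: pipe a path to `git update-index --stdin`.
# There is no per-path acknowledgement; the object is only guaranteed
# to be visible in the object database once update-index has exited.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
echo content > file1
printf '%s\n' file1 | git update-index --add --stdin
# Only now may the feeder rely on the object's existence:
git cat-file -e "$(git hash-object file1)"
```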

Thanks,
Neeraj


* Re: [PATCH v2 4/7] unpack-objects: use the bulk-checkin infrastructure
  2022-03-21 17:55     ` Junio C Hamano
@ 2022-03-21 23:02       ` Neeraj Singh
  2022-03-22 20:54         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-21 23:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 10:55 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > The unpack-objects functionality is used by fetch, push, and fast-import
> > to turn the transfered data into object database entries when there are
> > fewer objects than the 'unpacklimit' setting.
> >
> > By enabling bulk-checkin when unpacking objects, we can take advantage
> > of batched fsyncs.
>
> This feels confused in that we dispatch to unpack-objects (instead
> of index-objects) only when the number of loose objects should not
> matter from performance point of view, and bulk-checkin should shine
> from performance point of view only when there are enough objects to
> batch.
>
> Also if we ever add "too many small loose objects is wasteful, let's
> send them into a single 'batch pack'" optimization, it would create
> a funny situation where the caller sends the contents of a small
> incoming packfile to unpack-objects, but the command chooses to
> bunch them all together in a packfile anyway ;-)
>
> So, I dunno.
>

I'd be happy to just drop this patch.  I originally added it to answer
Ævar's question of how batch mode compares to packfiles [1] [2].

[1] https://lore.kernel.org/git/87mtp5cwpn.fsf@evledraar.gmail.com/
[2] https://lore.kernel.org/git/pull.1076.v5.git.git.1632514331.gitgitgadget@gmail.com/


* Re: [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure
  2022-03-21 22:09       ` Neeraj Singh
@ 2022-03-21 23:16         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 23:16 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh


On Mon, Mar 21 2022, Neeraj Singh wrote:

> On Mon, Mar 21, 2022 at 8:04 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:
>>
>> > From: Neeraj Singh <neerajsi@microsoft.com>
>> >
>> > The update-index functionality is used internally by 'git stash push' to
> > set up the internal stashed commit.
>> >
>> > This change enables bulk-checkin for update-index infrastructure to
>> > speed up adding new objects to the object database by leveraging the
>> > batch fsync functionality.
>> >
>> > There is some risk with this change, since under batch fsync, the object
>> > files will be in a tmp-objdir until update-index is complete.  This
>> > usage is unlikely, since any tool invoking update-index and expecting to
>> > see objects would have to synchronize with the update-index process
>> > after passing it a file path.
>> >
>> > Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
>> > ---
>> >  builtin/update-index.c | 6 ++++++
>> >  1 file changed, 6 insertions(+)
>> >
>> > diff --git a/builtin/update-index.c b/builtin/update-index.c
>> > index 75d646377cc..38e9d7e88cb 100644
>> > --- a/builtin/update-index.c
>> > +++ b/builtin/update-index.c
>> > @@ -5,6 +5,7 @@
>> >   */
>> >  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>> >  #include "cache.h"
>> > +#include "bulk-checkin.h"
>> >  #include "config.h"
>> >  #include "lockfile.h"
>> >  #include "quote.h"
>> > @@ -1110,6 +1111,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>> >
>> >       the_index.updated_skipworktree = 1;
>> >
>> > +     /* we might be adding many objects to the object database */
>> > +     plug_bulk_checkin();
>> > +
>>
>> Shouldn't this be after parse_options_start()?
>
> Does it make a difference?  Especially if we do the object dir creation lazily?

I think it won't matter for the machine, but it helps with readability
to keep code like this as close to where it's used as possible.

If it were close enough, we'd also spot the other bug I mentioned here,
i.e. that we're setting this up where we're not writing objects at all
:)


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
@ 2022-03-22  0:13           ` Neeraj Singh
  2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-22  0:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Mon, Mar 21, 2022 at 1:37 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Mar 21 2022, Neeraj Singh wrote:
>
> [Don't have time for a full reply, sorry, just something quick]
>
> > On Mon, Mar 21, 2022 at 9:52 AM Ævar Arnfjörð Bjarmason
> > [...]
> >> So, my question is still why the temporary object dir migration part of
> >> this is needed.
> >>
> >> We are writing N loose object files, and we write those to temporary
> >> names already.
> >>
> >> AFAIKT we could do all of this by doing the same
> >> tmp/rename/sync_file_range dance on the main object store.
> >>
> >
> > Why not the main object store? We want to maintain the invariant that
> > any name in the main object store refers to a file that durably has
> > the correct contents.  If we do sync_file_range and then rename, and
> > then crash, we now have a file in the main object store with some SHA
> > name, whose contents may or may not match the SHA.  However, if we
> > ensure an fsync happens before the rename, a crash at any point will
> > leave us either with no file in the main object store or with a file
> > that is durable on the disk.
>
> Ah, I see.
>
> Why does that matter? If the "bulk" mode works as advertised we might
> have such a corrupt loose or pack file, but we won't have anything
> referring to it as far as reachability goes.
>
> I'm aware that the various code paths that handle OID writing don't deal
> too well with it in practice to say the least, which one can try with
> say:
>
>     $ echo foo | git hash-object -w --stdin
>     45b983be36b73c0788dc9cbcb76cbb80fc7bb057
>     $ echo | sudo tee .git/objects/45/b983be36b73c0788dc9cbcb76cbb80fc7bb057
>
> I.e. "fsck", "show" etc. will all scream bloody murder, and re-running
> that hash-object again even returns success (we see it's there
> already, and think it's OK).
>

I was under the impression that, in practice, a corrupt loose object can
create persistent problems in the repo for future commands, since we
might not aggressively verify that an existing file with a certain OID
really is valid when adding a new instance of the data with the same OID.

If you don't have an fsync barrier before producing the final
content-addressable name, you can't reason about "this operation
happened before that operation," so it wouldn't really be valid to say
that "we won't have anything referring to it as far as reachability
goes."

It's entirely possible that you'd have trees pointing to other trees or
blobs that aren't valid, since data writes can become durable in any
order. At that point, future attempts to add the same blobs or trees
might silently drop the updates.  I'm betting that's why
core.fsyncObjectFiles was added in the first place, since someone
observed severe persistent consequences for this form of corruption.
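The ordering argument above can be sketched in a few lines of C. This is
purely illustrative code, not git's implementation; the function name
and paths are invented for the example:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Illustrative sketch (not git's code): make the contents durable
 * *before* the final name becomes visible.  After a crash we then see
 * either no final file at all, or a final file whose contents are
 * complete, so a name in the store always implies valid data.
 */
static int write_object_durably(const char *tmpname, const char *finalname,
				const char *data)
{
	int fd = open(tmpname, O_WRONLY | O_CREAT | O_TRUNC, 0666);

	if (fd < 0)
		return -1;
	if (write(fd, data, strlen(data)) != (ssize_t)strlen(data) ||
	    fsync(fd) < 0 ||	/* durability barrier for the contents ... */
	    close(fd) < 0)
		return -1;
	return rename(tmpname, finalname); /* ... then publish the name */
}
```

Replacing the fsync() with a writeout-only flush such as
sync_file_range() would allow the rename to become durable before the
data, which is exactly the window being described.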

> But in any case, I think it would be much easier to both review and
> reason about this code if these concerns were split up.
>
> I.e. things that want no fsync at all (I'd think especially so) might
> want to have such updates serialized in this manner, and as Junio
> pointed out making these things inseparable as you've done creates API
> concerns & fallout that's got nothing to do with what we need for the
> performance gains of the bulk checkin fsyncing technique,
> e.g. concurrent "update-index" consumers not being able to assume
> reported objects exist as soon as they're reported.
>

I want to explicitly not respond to this concern. I don't believe this
100-line patch can be usefully split.

> >> Then instead of the "bulk_fsync" cookie file don't close() the last file
> >> object file we write until we issue the fsync on it.
> >>
> >> But maybe this is all needed, I just can't understand from the commit
> >> message why the "bulk checkin" part is being done.
> >>
> >> I think since we've been over this a few times without any success it
> >> would really help to have some example of the smallest set of syscalls
> >> to write a file like this safely. I.e. this is doing (pseudocode):
> >>
> >>     /* first the bulk path */
> >>     open("bulk/x.tmp");
> >>     write("bulk/x.tmp");
> >>     sync_file_range("bulk/x.tmp");
> >>     close("bulk/x.tmp");
> >>     rename("bulk/x.tmp", "bulk/x");
> >>     open("bulk/y.tmp");
> >>     write("bulk/y.tmp");
> >>     sync_file_range("bulk/y.tmp");
> >>     close("bulk/y.tmp");
> >>     rename("bulk/y.tmp", "bulk/y");
> >>     /* Rename to "real" */
> >>     rename("bulk/x", x");
> >>     rename("bulk/y", y");
> >>     /* sync a cookie */
> >>     fsync("cookie");
> >>
> >
> > The '/* Rename to "real" */' and '/* sync a cookie */' steps are
> > reversed in your above sequence. It should be
>
> Sorry.
>
> > 1: (for each file)
> >     a) open
> >     b) write
> >     c) sync_file_range
> >     d) close
> >     e) rename in tmp_objdir  -- The rename step is not required for
> > bulk-fsync. An earlier version of this series didn't do it, but
> > Jeff King pointed out that it was required for concurrency:
> > https://lore.kernel.org/all/YVOrikAl%2Fu5%2FVi61@coredump.intra.peff.net/
>
> Yes we definitely need the rename, I was wondering about why we needed
> it 2x for each file, but that was answered above.
>
> >> And I'm asking why it's not:
> >>
> >>     /* Rename to "real" as we go */
> >>     open("x.tmp");
> >>     write("x.tmp");
> >>     sync_file_range("x.tmp");
> >>     close("x.tmp");
> >>     rename("x.tmp", "x");
> >>     last_fd = open("y.tmp"); /* don't close() the last one yet */
> >>     write("y.tmp");
> >>     sync_file_range("y.tmp");
> >>     rename("y.tmp", "y");
> >>     /* sync a cookie */
> >>     fsync(last_fd);
> >>
> >> Which I guess is two questions:
> >>
> >>  A. do we need the cookie, or can we re-use the fd of the last thing we
> >>     write?
> >
> > We can re-use the FD of the last thing we write, but that results in a
> > trickier API which
> > is more intrusive on callers. I was originally using a lockfile, but
> > found a usage where
> > there was no lockfile in unpack-objects.
>
> Ok, so it's something we could do, but passing down 2-3 functions to
> object-file.c was a hassle.
>
> I tried to hack that up earlier and found that it wasn't *too
> bad*. I.e. we'd pass some "flags" about our intent, and amend various
> functions to take "don't close this one" and pass up the fd (or even do
> that as a global).
>
> In any case, having the commit message clearly document what's needed
> for what & what's essential & just shortcut taken for the convenience of
> the current implementation would be really useful.
>
> Then we can always e.g. change this later to just do the fsync() on
> the last of N we write.
>

I left a comment in the (now very long) commit message that indicates the
dummy file is there to make the API simpler.
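For illustration, the batched scheme with a dummy cookie file could be
sketched like the following. This is a hypothetical stand-in for the
bulk-checkin code, not the actual patch; names like write_batch and
bulk_fsync_cookie are invented:

```c
#define _GNU_SOURCE		/* for sync_file_range() on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writeout-only flush: queue dirty pages without a disk-cache flush. */
static void writeout_only(int fd)
{
#ifdef __linux__
	sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
#else
	(void)fd;		/* other platforms have their own spelling */
#endif
}

/*
 * Hypothetical batched variant: N objects get cheap writeback, then a
 * single fsync() of a dummy "cookie" file triggers one filesystem log
 * flush and one storage-cache flush covering the whole batch.
 */
static int write_batch(const char **names, const char **data, int n)
{
	int i, cookie;

	for (i = 0; i < n; i++) {
		int fd = open(names[i], O_WRONLY | O_CREAT | O_TRUNC, 0666);

		if (fd < 0 || write(fd, data[i], strlen(data[i])) < 0)
			return -1;
		writeout_only(fd);	/* cheap: no disk-cache flush yet */
		if (close(fd) < 0)
			return -1;
	}
	cookie = open("bulk_fsync_cookie", O_WRONLY | O_CREAT, 0666);
	if (cookie < 0 || fsync(cookie) < 0)	/* one flush for the batch */
		return -1;
	return close(cookie);
}
```

The cookie keeps the API simple because no caller has to remember which
fd was written last; any open fd on the same filesystem can carry the
final barrier.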

Thanks,
Neeraj


* Re: [PATCH v2 6/7] core.fsyncmethod: tests for batch mode
  2022-03-21 18:34     ` Junio C Hamano
@ 2022-03-22  5:54       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-22  5:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 11:34 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > Add test cases to exercise batch mode for:
> >  * 'git add'
>
> I was wondering why the obviously safe and good candidate 'git add' is
> not gaining a plug/unplug pair in this series.  It is obviously safe,
> unlike 'update-index', in that nobody can interact with it, observe its
> intermediate output, and expect anything from it.
>
> I think the stupid reason for the lack of new plug/unplug is that
> we already had them, which is good ;-).
>
> >  * 'git stash'
> >  * 'git update-index'
>
> As I said, I suspect that we'd want to do this safely by adding a
> new option to "update-index" and passing it from "stash" which knows
> that it does not care about the intermediate state.
>
> > These tests ensure that the added data winds up in the object database.
>
> In other words, "git add $path; git rev-parse :$path" (and its
> cousins) would be happy?  Like new object files not left hanging in
> a tentative object store etc. _after_ the commands finish.
>
> Good.
>
> > In this change we introduce a new test helper lib-unique-files.sh. The
> > goal of this library is to create a tree of files that have different
> > oids from any other files that may have been created in the current test
> > repo. This helps us avoid skipping validation of an added object
> > because it is already in the repo.
>
> More on this below.

To me the idea of putting the 'why' into the commit message and the
'what' into code comments makes sense, since I'd assume people looking
into the history care about the why, but people making future changes
would read the documentation in the comments for e.g. lib-unique-files.

>
> > We aren't actually issuing any fsyncs in these tests, since
> > GIT_TEST_FSYNC is 0, but we still exercise all of the tmp_objdir logic
> > in bulk-checkin.
>
> Shouldn't we manually override that, if it matters?
> Not a suggestion but a question.

I manually override it for the performance tests.  I think it's
sensible to manually override this variable for the small number of
tests added in this commit so that we can exercise the underlying
system calls, so I'll do that.

> > +# Create multiple files with unique contents. Takes the number of
> > +# directories, the number of files in each directory, and the base
> > +# directory.
>
> This is more honest, compared to the claim made in the proposed log
> message, in that the uniqueness guarantee is only among the files
> created by this helper.  If we created other test contents without
> using this helper, that may clash with the ones created here.
>

I've revised this comment to indicate that the files are only unique
within this test run.

> > +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> > +#                                     each in my_dir, all with unique
> > +#                                     contents.
> > +
> > +test_create_unique_files() {
>
> Style.  SP on both sides of ().  I.e.
>
>         test_create_unique_files () {
>

Fixed

> > +     test "$#" -ne 3 && BUG "3 param"
> > +
> > +     local dirs=$1
> > +     local files=$2
> > +     local basedir=$3
> > +     local counter=0
> > +     test_tick
> > +     local basedata=$test_tick
>
> I am not sure if consumption and reliance on tick is a wise thing.
> $basedir must be unique across all the other directories in this
> test repository (there is no other $basedir)---can't we key
> uniqueness off of it?

In the performance tests, we create sets of files with the same basedir
but we want the files to have different contents since we don't blow away
the repo between tests.  The current approach still generates uniqueness
if someone simply copy/pastes a test_create_unique_files invocation, rather
than subtly failing to make new objects.

> > +     rm -rf $basedir
>
> Can $basedir have any $IFS character in it?  We should "$quote" it.

Fixed.

>
> > +     for i in $(test_seq $dirs)
> > +     do
> > +             local dir=$basedir/dir$i
> > +
> > +             mkdir -p "$dir"
> > +             for j in $(test_seq $files)
> > +             do
> > +                     counter=$((counter + 1))
> > +                     echo "$basedata.$counter"  >"$dir/file$j.txt"
>
> An extra SP before ">"?
>

Fixed.

> > +             done
> > +     done
> > +}
>
> There is no &&- cascade here, and we expect nothing in this to
> fail.  Is that sensible?
>

I apologize, there's a lot of subtlety about UNIX shell scripting that
I simply do not know.   I put an '&&' chain in, but I might still have it wrong.

> > +test_expect_success 'git add: core.fsyncmethod=batch' "
> > +     test_create_unique_files 2 4 fsync-files &&
> > +     git $BATCH_CONFIGURATION add -- ./fsync-files/ &&
> > +     rm -f fsynced_files &&
> > +     git ls-files --stage fsync-files/ > fsynced_files &&
>
> Style.  No SP between redirection operator and its target.  I.e.
>
>         git ls-files --stage fsync-files/ >fsynced_files &&
>
> Mixture of names-with-dash and name_with_underscore looks somewhat
> irritating.
>

Will switch to underscores for both. Also will rename the base dir to be more
distinct from the list of files that we expect to see.

> > +     test_line_count = 8 fsynced_files &&
>
> The magic "8" matches "2 4" we saw earlier for create_unique_files?
>

Will comment to explain the 8.

> > +     awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e
>
> A test helper that takes the name of a file that has "ls-files -s" output
> may prove to be useful.  I dunno.
>
> > diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> > index a11d61206ad..8e2f73cc68f 100755
> > --- a/t/t5300-pack-object.sh
> > +++ b/t/t5300-pack-object.sh
> > @@ -162,23 +162,25 @@ test_expect_success 'pack-objects with bogus arguments' '
> >
> >  check_unpack () {
> >       test_when_finished "rm -rf git2" &&
> > -     git init --bare git2 &&
> > -     git -C git2 unpack-objects -n <"$1".pack &&
> > -     git -C git2 unpack-objects <"$1".pack &&
> > -     (cd .git && find objects -type f -print) |
> > -     while read path
> > -     do
> > -             cmp git2/$path .git/$path || {
> > -                     echo $path differs.
> > -                     return 1
> > -             }
> > -     done
> > +     git $2 init --bare git2 &&
> > +     (
> > +             git $2 -C git2 unpack-objects -n <"$1".pack &&
> > +             git $2 -C git2 unpack-objects <"$1".pack &&
> > +             git $2 -C git2 cat-file --batch-check="%(objectname)"
> > +     ) <obj-list >current &&
> > +     cmp obj-list current
> >  }
>
> I think the change from the old "the existence and the contents of
> the object files must all match" to the new "cat-file should say
> that the objects we expect to exist indeed do" is not a bad thing.
>
> We used to only depend on the contents of the provided packfile but
> now we assume that obj-list file gives us the list of objects.  Is
> that sensible?  I somehow do not think so.  Don't we have the
> corresponding "$1.idx" that we can feed to "git show-index", e.g.
>

I believe that "obj-list" is in some sense more authoritative, assuming
arbitrary future bugs in the pack implementation.  We make sure that the
pack we were handed isn't missing any objects.  It makes sense to accept
obj-list from the outside so that check_unpack itself doesn't depend on
that file name.  But the previous code wouldn't catch a bug where the
pack code mistakenly drops an object.

>         git show-index <"$1.pack" >expect.full &&
>         cut -d" " -f2 >expect <expect.full &&
>         ... your test in "$2", but feeding expect instead of obj-list ...
>         test_cmp expect actual
>
> Also make sure you quote whatever is coming from outside, even if
> you happen to call the helper with tokens that do not need quoting
> in the current code.  It is a good discipline to help readers.
>
> Thanks.

I'll take another pass over the shell code in the morning to make sure
I'm following all the conventions and recommendations.  Apologies
for the set of mistakes.  In terms of shell, I am a rookie.

Thanks,
-Neeraj


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-22  0:13           ` Neeraj Singh
@ 2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
  2022-03-22 20:05               ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-22  8:52 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh


On Mon, Mar 21 2022, Neeraj Singh wrote:

> On Mon, Mar 21, 2022 at 1:37 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Mon, Mar 21 2022, Neeraj Singh wrote:
>>
>> [Don't have time for a full reply, sorry, just something quick]
>>
>> > On Mon, Mar 21, 2022 at 9:52 AM Ævar Arnfjörð Bjarmason
>> > [...]
>> >> So, my question is still why the temporary object dir migration part of
>> >> this is needed.
>> >>
>> >> We are writing N loose object files, and we write those to temporary
>> >> names already.
>> >>
>> >> AFAIKT we could do all of this by doing the same
>> >> tmp/rename/sync_file_range dance on the main object store.
>> >>
>> >
>> > Why not the main object store? We want to maintain the invariant that any
>> > name in the main object store refers to a file that durably has the
>> > correct contents.
>> > If we do sync_file_range and then rename, and then crash, we now have a file
>> > in the main object store with some SHA name, whose contents may or may not
>> > match the SHA.  However, if we ensure an fsync happens before the rename,
>> > a crash at any point will leave us either with no file in the main
>> > object store or
>> > with a file that is durable on the disk.
>>
>> Ah, I see.
>>
>> Why does that matter? If the "bulk" mode works as advertised we might
>> have such a corrupt loose or pack file, but we won't have anything
>> referring to it as far as reachability goes.
>>
>> I'm aware that the various code paths that handle OID writing don't deal
>> too well with it in practice to say the least, which one can try with
>> say:
>>
>>     $ echo foo | git hash-object -w --stdin
>>     45b983be36b73c0788dc9cbcb76cbb80fc7bb057
>>     $ echo | sudo tee .git/objects/45/b983be36b73c0788dc9cbcb76cbb80fc7bb057
>>
>> I.e. "fsck", "show" etc. will all scream bloody murder, and re-running
>> that hash-object again even returns success (we see it's there
>> already, and think it's OK).
>>
>
> I was under the impression that in-practice a corrupt loose-object can create
> persistent problems in the repo for future commands, since we might not
> aggressively verify that an existing file with a certain OID really is
> valid when
> adding a new instance of the data with the same OID.

Yes, it can. As the hash-object case shows we don't even check at all.

For "incoming push" we *will* notice, but will just uselessly error
out.

I actually had some patches a while ago to turn off our own home-grown
SHA-1 collision checking.

It had the nice side effect of making it easier to recover from loose
object corruption, since you could (re-)push the corrupted OID as a
PACK, we wouldn't check (and die) on the bad loose object, and since we
take a PACK over LOOSE we'd recover:
https://lore.kernel.org/git/20181028225023.26427-5-avarab@gmail.com/

> If you don't have an fsync barrier before producing the final
> content-addressable
> name, you can't reason about "this operation happened before that operation,"
> so it wouldn't really be valid to say that "we won't have anything
> referring to it as far
> as reachability goes."

That's correct, but we're discussing a feature that *does have* that
fsync barrier. So if we get an error while writing the loose objects
before the "cookie" fsync we'll presumably error out. That'll then be
followed by an fsync() of whatever makes the objects reachable.

> It's entirely possible that you'd have trees pointing to other trees
> or blobs that aren't
> valid, since data writes can be durable in any order. At this point,
> future attempts add
> the same blobs or trees might silently drop the updates.  I'm betting that's why
> core.fsyncObjectFiles was added in the first place, since someone
> observed severe
> persistent consequences for this form of corruption.

Well, you can see Linus's original rant-as-documentation for why we
added it :) I.e. the original git implementation made some heavy
linux-FS assumptions about the order of writes and an fsync() flushing
any previous writes, which wasn't portable.

>> But in any case, I think it would be much easier to both review and
>> reason about this code if these concerns were split up.
>>
>> I.e. things that want no fsync at all (I'd think especially so) might
>> want to have such updates serialized in this manner, and as Junio
>> pointed out making these things inseparable as you've done creates API
>> concerns & fallout that's got nothing to do with what we need for the
>> performance gains of the bulk checkin fsyncing technique,
>> e.g. concurrent "update-index" consumers not being able to assume
>> reported objects exist as soon as they're reported.
>>
>
> I want to explicitly not respond to this concern. I don't believe this
> 100 line patch
> can be usefully split.

Leaving "usefully" aside for a second (since that's subjective), it
clearly "can". I just tried this on top of "seen":

	diff --git a/bulk-checkin.c b/bulk-checkin.c
	index a702e0ff203..9e994c4d6ae 100644
	--- a/bulk-checkin.c
	+++ b/bulk-checkin.c
	@@ -9,15 +9,12 @@
	 #include "pack.h"
	 #include "strbuf.h"
	 #include "string-list.h"
	-#include "tmp-objdir.h"
	 #include "packfile.h"
	 #include "object-store.h"
	 
	 static int bulk_checkin_plugged;
	 static int needs_batch_fsync;
	 
	-static struct tmp_objdir *bulk_fsync_objdir;
	-
	 static struct bulk_checkin_state {
	 	char *pack_tmp_name;
	 	struct hashfile *f;
	@@ -110,11 +107,6 @@ static void do_batch_fsync(void)
	 		strbuf_release(&temp_path);
	 		needs_batch_fsync = 0;
	 	}
	-
	-	if (bulk_fsync_objdir) {
	-		tmp_objdir_migrate(bulk_fsync_objdir);
	-		bulk_fsync_objdir = NULL;
	-	}
	 }
	 
	 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
	@@ -321,7 +313,6 @@ void fsync_loose_object_bulk_checkin(int fd)
	 	 */
	 	if (bulk_checkin_plugged &&
	 	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
	-		assert(bulk_fsync_objdir);
	 		if (!needs_batch_fsync)
	 			needs_batch_fsync = 1;
	 	} else {
	@@ -343,19 +334,6 @@ int index_bulk_checkin(struct object_id *oid,
	 void plug_bulk_checkin(void)
	 {
	 	assert(!bulk_checkin_plugged);
	-
	-	/*
	-	 * A temporary object directory is used to hold the files
	-	 * while they are not fsynced.
	-	 */
	-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
	-		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
	-		if (!bulk_fsync_objdir)
	-			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
	-
	-		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
	-	}
	-
	 	bulk_checkin_plugged = 1;
	 }

And then tried running:

    $ GIT_PERF_MAKE_OPTS='CFLAGS=-O3' ./run HEAD~ HEAD -- p3900-stash.sh	

And got:
    
    Test                                              HEAD~              HEAD
    --------------------------------------------------------------------------------------------
    3900.2: stash 500 files (object_fsyncing=false)   0.56(0.08+0.09)    0.60(0.08+0.08) +7.1%
    3900.4: stash 500 files (object_fsyncing=true)    14.50(0.07+0.15)   17.13(0.10+0.12) +18.1%
    3900.6: stash 500 files (object_fsyncing=batch)   1.14(0.08+0.11)    1.03(0.08+0.10) -9.6%

Now, I really don't trust that perf run to say anything except these
being in the same ballpark, but it's clearly going to be a *bit* faster
since we'll be doing fewer IOps.

As to "usefully", I really do get what you're saying: you only find
these useful when you combine the two because you'd like to have 100%
safety, and that's fair enough.

But since we are going to have a knob to turn off fsyncing entirely, and
we have this "bulk" mode which requires you to carefully reason about
your FS semantics to ascertain safety, the performance/safety trade-off
is clearly something that's useful to have tweaks for.

And with "bulk" the concern about leaving behind stray corrupt objects
is entirely orthogonal to concerns about losing a ref update, which is
the main thing we're worried about.

Even if you're arguing that nobody would want one without the other,
because everyone who cares about "bulk" also cares about this
stray-corrupt-loose-but-no-ref-update case, I don't see how it has any
business being tied up in the "bulk" mode as far as the implementation
goes.

That's because the same edge case is exposed by
core.fsyncObjectFiles=false for those who are assuming the initial
"ordered" semantics.

I.e. if we're saying that before we write the ref we'd like to not
expose the WIP objects in the primary object store because they're not
fsync'd yet, how is that mode different than "bulk" if we crash while
doing that operation (before the eventual fsync()).

So I really think it's much better to split these concerns up.

I think even if you insist on the same end-state it makes the patch
progression much *easier* to reason about. We'd then solve one problem
at a time, and start with a commit where just the semantics that are
unique to "bulk" are implemented, with nothing else conflated with
those.

> [...]
>> Ok, so it's something we could do, but passing down 2-3 functions to
>> object-file.c was a hassle.
>>
>> I tried to hack that up earlier and found that it wasn't *too
>> bad*. I.e. we'd pass some "flags" about our intent, and amend various
>> functions to take "don't close this one" and pass up the fd (or even do
>> that as a global).
>>
>> In any case, having the commit message clearly document what's needed
>> for what & what's essential & just shortcut taken for the convenience of
>> the current implementation would be really useful.
>>
>> Then we can always e.g. change this later to just do the fsync() on
>> the last of N we write.
>>
>
> I left a comment in the (now very long) commit message that indicates the
> dummy file is there to make the API simpler.

In terms of more understandable progression I also think this series
would be much easier to understand if it converted one caller without
needing the "cookie" where doing so is easy, e.g. the unpack-objects.c
caller where we're processing nr_objects, so we can just pass down a
flag to do the fsync() for i == nr_objects.

That'll then clearly show that the whole business of having the global
state on the side is just a replacement for passing down such a flag.
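The flag-based shape suggested here might look roughly like the
following sketch. This is hypothetical code, not a proposed patch; it
omits the writeout-only flush and tmp-file renames for brevity, and the
function names are made up:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical alternative to the cookie file: when the caller knows it
 * is writing object i of nr_objects, the last write's fd doubles as the
 * batch's fsync barrier and no dummy file is needed.
 */
static int write_one(const char *name, const char *data, int is_last)
{
	int fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0666);

	if (fd < 0 || write(fd, data, strlen(data)) < 0)
		return -1;
	if (is_last && fsync(fd) < 0)	/* barrier rides on the final object */
		return -1;
	return close(fd);
}

static int write_all(const char **names, const char **data, int nr_objects)
{
	int i;

	for (i = 0; i < nr_objects; i++)
		if (write_one(names[i], data[i], i == nr_objects - 1) < 0)
			return -1;
	return 0;
}
```

Whether issuing the barrier on the last object's fd is as safe as the
cookie approach still depends on where the renames to final names happen
relative to that final fsync(), per the ordering concern raised earlier
in the thread.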


* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
@ 2022-03-22 20:05               ` Neeraj Singh
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-22 20:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Tue, Mar 22, 2022 at 2:29 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Mar 21 2022, Neeraj Singh wrote:
>
> > On Mon, Mar 21, 2022 at 1:37 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >>
> >> On Mon, Mar 21 2022, Neeraj Singh wrote:
> >>
> >> [Don't have time for a full reply, sorry, just something quick]
> >>
> >> > On Mon, Mar 21, 2022 at 9:52 AM Ævar Arnfjörð Bjarmason
> >> > [...]
> >> >> So, my question is still why the temporary object dir migration part of
> >> >> this is needed.
> >> >>
> >> >> We are writing N loose object files, and we write those to temporary
> >> >> names already.
> >> >>
> >> >> AFAIKT we could do all of this by doing the same
> >> >> tmp/rename/sync_file_range dance on the main object store.
> >> >>
> >> >
> >> > Why not the main object store? We want to maintain the invariant that any
> >> > name in the main object store refers to a file that durably has the
> >> > correct contents.
> >> > If we do sync_file_range and then rename, and then crash, we now have a file
> >> > in the main object store with some SHA name, whose contents may or may not
> >> > match the SHA.  However, if we ensure an fsync happens before the rename,
> >> > a crash at any point will leave us either with no file in the main
> >> > object store or
> >> > with a file that is durable on the disk.
> >>
> >> Ah, I see.
> >>
> >> Why does that matter? If the "bulk" mode works as advertised we might
> >> have such a corrupt loose or pack file, but we won't have anything
> >> referring to it as far as reachability goes.
> >>
> >> I'm aware that the various code paths that handle OID writing don't deal
> >> too well with it in practice to say the least, which one can try with
> >> say:
> >>
> >>     $ echo foo | git hash-object -w --stdin
> >>     45b983be36b73c0788dc9cbcb76cbb80fc7bb057
> >>     $ echo | sudo tee .git/objects/45/b983be36b73c0788dc9cbcb76cbb80fc7bb057
> >>
> >> I.e. "fsck", "show" etc. will all scream bloody murder, and re-running
> >> that hash-object again even returns success (we see it's there
> >> already, and think it's OK).
> >>
> >
> > I was under the impression that in-practice a corrupt loose-object can create
> > persistent problems in the repo for future commands, since we might not
> > aggressively verify that an existing file with a certain OID really is
> > valid when adding a new instance of the data with the same OID.
>
> Yes, it can. As the hash-object case shows we don't even check at all.
>
> For "incoming push" we *will* notice, but will just uselessly error
> out.
>
> I actually had some patches a while ago to turn off our own home-grown
> SHA-1 collision checking.
>
> It had the nice side effect of making it easier to recover from loose
> object corruption, since you could (re-)push the corrupted OID as a
> PACK, we wouldn't check (and die) on the bad loose object, and since we
> take a PACK over LOOSE we'd recover:
> https://lore.kernel.org/git/20181028225023.26427-5-avarab@gmail.com/
>
> > If you don't have an fsync barrier before producing the final
> > content-addressable name, you can't reason about "this operation
> > happened before that operation," so it wouldn't really be valid to say
> > that "we won't have anything referring to it as far as reachability
> > goes."
>
> That's correct, but we're discussing a feature that *does have* that
> fsync barrier. So if we get an error while writing the loose objects
> before the "cookie" fsync we'll presumably error out. That'll then be
> followed by an fsync() of whatever makes the objects reachable.
>

Because we have a content-addressable store which generally trusts that
its contents are valid (at least when adding new instances of the same
content), the mere existence of a loose-object with a certain name is
enough to make it "reachable" to future operations, even if there are
no other immediate ways to get to that object.

> > It's entirely possible that you'd have trees pointing to other trees
> > or blobs that aren't valid, since data writes can become durable in
> > any order. At this point, future attempts to add the same blobs or
> > trees might silently drop the updates.  I'm betting that's why
> > core.fsyncObjectFiles was added in the first place, since someone
> > observed severe persistent consequences for this form of corruption.
>
> Well, you can see Linus's original rant-as-documentation for why we
> added it :) I.e. the original git implementation made some heavy
> linux-FS assumption about the order of writes and an fsync() flushing
> any previous writes, which wasn't portable.
>
> >> But in any case, I think it would me much easier to both review and
> >> reason about this code if these concerns were split up.
> >>
> >> I.e. things that want no fsync at all (I'd think especially so) might
> >> want to have such updates serialized in this manner, and as Junio
> >> pointed out making these things inseparable as you've done creates API
> >> concerns & fallout that's got nothing to do with what we need for the
> >> performance gains of the bulk checkin fsyncing technique,
> >> e.g. concurrent "update-index" consumers not being able to assume
> >> reported objects exist as soon as they're reported.
> >>
> >
> > I want to explicitly not respond to this concern. I don't believe this
> > 100 line patch
> > can be usefully split.
>
> Leaving "usefully" aside for a second (since that's subjective), it
> clearly "can". I just tried this on top of "seen":
>
>         diff --git a/bulk-checkin.c b/bulk-checkin.c
>         index a702e0ff203..9e994c4d6ae 100644
>         --- a/bulk-checkin.c
>         +++ b/bulk-checkin.c
>         @@ -9,15 +9,12 @@
>          #include "pack.h"
>          #include "strbuf.h"
>          #include "string-list.h"
>         -#include "tmp-objdir.h"
>          #include "packfile.h"
>          #include "object-store.h"
>
>          static int bulk_checkin_plugged;
>          static int needs_batch_fsync;
>
>         -static struct tmp_objdir *bulk_fsync_objdir;
>         -
>          static struct bulk_checkin_state {
>                 char *pack_tmp_name;
>                 struct hashfile *f;
>         @@ -110,11 +107,6 @@ static void do_batch_fsync(void)
>                         strbuf_release(&temp_path);
>                         needs_batch_fsync = 0;
>                 }
>         -
>         -       if (bulk_fsync_objdir) {
>         -               tmp_objdir_migrate(bulk_fsync_objdir);
>         -               bulk_fsync_objdir = NULL;
>         -       }
>          }
>
>          static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
>         @@ -321,7 +313,6 @@ void fsync_loose_object_bulk_checkin(int fd)
>                  */
>                 if (bulk_checkin_plugged &&
>                     git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
>         -               assert(bulk_fsync_objdir);
>                         if (!needs_batch_fsync)
>                                 needs_batch_fsync = 1;
>                 } else {
>         @@ -343,19 +334,6 @@ int index_bulk_checkin(struct object_id *oid,
>          void plug_bulk_checkin(void)
>          {
>                 assert(!bulk_checkin_plugged);
>         -
>         -       /*
>         -        * A temporary object directory is used to hold the files
>         -        * while they are not fsynced.
>         -        */
>         -       if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
>         -               bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
>         -               if (!bulk_fsync_objdir)
>         -                       die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
>         -
>         -               tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
>         -       }
>         -
>                 bulk_checkin_plugged = 1;
>          }
>
> And then tried running:
>
>     $ GIT_PERF_MAKE_OPTS='CFLAGS=-O3' ./run HEAD~ HEAD -- p3900-stash.sh
>
> And got:
>
>     Test                                              HEAD~              HEAD
>     --------------------------------------------------------------------------------------------
>     3900.2: stash 500 files (object_fsyncing=false)   0.56(0.08+0.09)    0.60(0.08+0.08) +7.1%
>     3900.4: stash 500 files (object_fsyncing=true)    14.50(0.07+0.15)   17.13(0.10+0.12) +18.1%
>     3900.6: stash 500 files (object_fsyncing=batch)   1.14(0.08+0.11)    1.03(0.08+0.10) -9.6%
>
> Now, I really don't trust that perf run to say anything except these
> being in the same ballpark, but it's clearly going to be a *bit* faster
> since we'll be doing fewer IOps.
>
> As to "usefully" I really do get what you're saying that you only find
> these useful when you combine the two because you'd like to have 100%
> safety, and that's fair enough.
>
> But since we are going to have a knob to turn off fsyncing entirely, and
> we have this "bulk" mode which requires you to carefully reason about
> your FS semantics to ascertain safety the performance/safety trade-off
> is clearly something that's useful to have tweaks for.
>
> And with "bulk" the concern about leaving behind stray corrupt objects
> is entirely orthogonal to concerns about losing a ref update, which is
> the main thing we're worried about.
>
> I also don't see, even if you're arguing that nobody would want one
> without the other because everyone who cares about "bulk" cares about
> this stray-corrupt-loose-but-no-ref-update case, how it has any business
> being tied up in the "bulk" mode as far as the implementation goes.
>
> That's because the same edge case is exposed by
> core.fsyncObjectFiles=false for those who are assuming the initial
> "ordered" semantics.
>
> I.e. if we're saying that before we write the ref we'd like to not
> expose the WIP objects in the primary object store because they're not
> fsync'd yet, how is that mode different from "bulk" if we crash while
> doing that operation (before the eventual fsync())?
>
> So I really think it's much better to split these concerns up.
>
> I think even if you insist on the same end-state it makes the patch
> progression much *easier* to reason about. We'd then solve one problem
> at a time, and start with a commit where just the semantics that are
> unique to "bulk" are implemented, with nothing else conflated with
> those.

On Windows, where we want to have a consistent ODB by default, I'm
adding a faster way to achieve that safety. No user is asking for a
world where we are doing half the work to make a consistent ODB but not
the other half.

This one patch works holistically to provide the full batch safety
feature, and splitting it into two patches (which in the new version
wouldn't be as clean as you've done it above) doesn't make the
correctness of the whole thing more reviewable.  In fact it's less
reviewable, since the fsync and objdir migration would be in two
separate patches and a future historian wouldn't get as clear a picture
of the whole mechanism.

> > [...]
> >> Ok, so it's something we could do, but passing down 2-3 functions to
> >> object-file.c was a hassle.
> >>
> >> I tried to hack that up earlier and found that it wasn't *too
> >> bad*. I.e. we'd pass some "flags" about our intent, and amend various
> >> functions to take "don't close this one" and pass up the fd (or even do
> >> that as a global).
> >>
> >> In any case, having the commit message clearly document what's needed
> >> for what & what's essential & just shortcut taken for the convenience of
> >> the current implementation would be really useful.
> >>
> >> Then we can always e.g. change this later to just do the the fsync() on
> >> the last of N we write.
> >>
> >
> > I left a comment in the (now very long) commit message that indicates the
> > dummy file is there to make the API simpler.
>
> In terms of more understandable progression I also think this series
> would be much easier to understand if it converted one caller without
> needing the "cookie" where doing so is easy, e.g. the unpack-objects.c
> caller where we're processing nr_objects, so we can just pass down a
> flag to do the fsync() for i == nr_objects.
>
> That'll then clearly show that the whole business of having the global
> state on the side is just a replacement for passing down such a flag.

That seems appropriate for our mailing list discussion, but I don't see
how it helps the patch series, because we'd be doing work to fsync
the final object and then reversing that work when producing the final
end state, which uses the dummy file.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v2 4/7] unpack-objects: use the bulk-checkin infrastructure
  2022-03-21 23:02       ` Neeraj Singh
@ 2022-03-22 20:54         ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-22 20:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Mon, Mar 21, 2022 at 4:02 PM Neeraj Singh <nksingh85@gmail.com> wrote:
>
> On Mon, Mar 21, 2022 at 10:55 AM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > From: Neeraj Singh <neerajsi@microsoft.com>
> > >
> > > The unpack-objects functionality is used by fetch, push, and fast-import
> > > to turn the transferred data into object database entries when there are
> > > fewer objects than the 'unpacklimit' setting.
> > >
> > > By enabling bulk-checkin when unpacking objects, we can take advantage
> > > of batched fsyncs.
> >
> > This feels confused in that we dispatch to unpack-objects (instead
> > of index-objects) only when the number of loose objects should not
> > matter from performance point of view, and bulk-checkin should shine
> > from performance point of view only when there are enough objects to
> > batch.
> >
> > Also if we ever add "too many small loose objects is wasteful, let's
> > send them into a single 'batch pack'" optimization, it would create
> > a funny situation where the caller sends the contents of a small
> > incoming packfile to unpack-objects, but the command chooses to
> > bunch them all together in a packfile anyway ;-)
> >
> > So, I dunno.
> >
>
> I'd be happy to just drop this patch.  I originally added it to answer Avarab's
> question: how does batch mode compare to packfiles? [1] [2].
>
> [1] https://lore.kernel.org/git/87mtp5cwpn.fsf@evledraar.gmail.com/
> [2] https://lore.kernel.org/git/pull.1076.v5.git.git.1632514331.gitgitgadget@gmail.com/

Well, looking back again at the spreadsheet [3], at 90 objects, which is
below the default transfer.unpackLimit, we see a 3x difference in
performance between batch mode and the default fsync mode.  That's a
different interaction class (230 ms versus 760 ms).

I'll include a small table in the commit description with these
performance numbers to help justify it.

[3] https://docs.google.com/spreadsheets/d/1uxMBkEXFFnQ1Y3lXKqcKpw6Mq44BzhpCAcPex14T-QQ

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c
  2022-03-22 20:05               ` Neeraj Singh
@ 2022-03-23  3:47                 ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
                                     ` (7 more replies)
  0 siblings, 8 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

This RFC series is a continuation of the thread at
https://lore.kernel.org/git/CANQDOde2OG8fVSM1hQE3FBmzWy5FkgQCWAUYhFztB8UGFyJELg@mail.gmail.com/;
More details in individual commit messages.

I'd suggested (upthread of there) passing new object flags down to the
object machinery instead of the {un,}plug_bulk_checkin() API route.
This has advantages described in more detail in the individual patches.

This also shows that the approach of not using a tmpdir can be
significantly faster than using it, and per my understanding is just as
safe fsync-wise for those willing to deal with the caveat of possibly
having truncated *unreachable* objects.

I thought that showing some working code with what I was suggesting
was more productive than continuing the current back & forth :)

Ævar Arnfjörð Bjarmason (7):
  write-or-die.c: remove unused fsync_component() function
  unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  object-file: pass down unpack-objects.c flags for "bulk" checkin
  update-index: use a utility function for stdin consumption
  update-index: pass down an "oflags" argument
  update-index: rename "buf" to "line"
  update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags

 builtin/add.c            |  3 --
 builtin/unpack-objects.c | 62 ++++++++++++++------------
 builtin/update-index.c   | 96 ++++++++++++++++++++++++++--------------
 bulk-checkin.c           | 86 -----------------------------------
 bulk-checkin.h           |  6 ---
 cache.h                  |  9 ++--
 object-file.c            | 39 +++++++++++-----
 t/t1050-large.sh         |  3 ++
 write-or-die.c           |  7 ---
 9 files changed, 131 insertions(+), 180 deletions(-)

-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  5:27                     ` Neeraj Singh
  2022-03-23  3:47                   ` [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
                                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

This function added in 020406eaa52 (core.fsync: introduce granular
fsync control infrastructure, 2022-03-10) hasn't been used, and
appears not to be used by the follow-up series either?

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h        | 1 -
 write-or-die.c | 7 -------
 2 files changed, 8 deletions(-)

diff --git a/cache.h b/cache.h
index 84fafe2ed71..5d863f8c5e8 100644
--- a/cache.h
+++ b/cache.h
@@ -1766,7 +1766,6 @@ int copy_file_with_time(const char *dst, const char *src, int mode);
 
 void write_or_die(int fd, const void *buf, size_t count);
 void fsync_or_die(int fd, const char *);
-int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
 static inline int batch_fsync_enabled(enum fsync_component component)
diff --git a/write-or-die.c b/write-or-die.c
index c4fd91b5b43..103698450c3 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -76,13 +76,6 @@ void fsync_or_die(int fd, const char *msg)
 		die_errno("fsync error on '%s'", msg);
 }
 
-int fsync_component(enum fsync_component component, int fd)
-{
-	if (fsync_components & component)
-		return maybe_fsync(fd);
-	return 0;
-}
-
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg)
 {
 	if (fsync_components & component)
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
                                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

In preparation for making the bulk-checkin.c logic operate from
object-file.c itself in some common cases let's add
HASH_N_OBJECTS{,_{FIRST,LAST}} flags.

This will allow us to adjust for-loops that add N objects to just pass
down whether they have >1 objects (HASH_N_OBJECTS), as well as passing
down flags for whether we have the first or last object.

We'll thus be able to drive any sort of batch-object mechanism from
write_object_file_flags() directly, which until now didn't know if it
was doing one object, or some arbitrary N.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/unpack-objects.c | 60 +++++++++++++++++++++++-----------------
 cache.h                  |  3 ++
 2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index c55b6616aed..ec40c6fd966 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -233,7 +233,8 @@ static void write_rest(void)
 }
 
 static void added_object(unsigned nr, enum object_type type,
-			 void *data, unsigned long size);
+			 void *data, unsigned long size,
+			 unsigned oflags);
 
 /*
  * Write out nr-th object from the list, now we know the contents
@@ -241,21 +242,21 @@ static void added_object(unsigned nr, enum object_type type,
  * to be checked at the end.
  */
 static void write_object(unsigned nr, enum object_type type,
-			 void *buf, unsigned long size)
+			 void *buf, unsigned long size, unsigned oflags)
 {
 	if (!strict) {
-		if (write_object_file(buf, size, type,
-				      &obj_list[nr].oid) < 0)
+		if (write_object_file_flags(buf, size, type,
+				      &obj_list[nr].oid, oflags) < 0)
 			die("failed to write object");
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		free(buf);
 		obj_list[nr].obj = NULL;
 	} else if (type == OBJ_BLOB) {
 		struct blob *blob;
-		if (write_object_file(buf, size, type,
-				      &obj_list[nr].oid) < 0)
+		if (write_object_file_flags(buf, size, type,
+					    &obj_list[nr].oid, oflags) < 0)
 			die("failed to write object");
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		free(buf);
 
 		blob = lookup_blob(the_repository, &obj_list[nr].oid);
@@ -269,7 +270,7 @@ static void write_object(unsigned nr, enum object_type type,
 		int eaten;
 		hash_object_file(the_hash_algo, buf, size, type,
 				 &obj_list[nr].oid);
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		obj = parse_object_buffer(the_repository, &obj_list[nr].oid,
 					  type, size, buf,
 					  &eaten);
@@ -283,7 +284,7 @@ static void write_object(unsigned nr, enum object_type type,
 
 static void resolve_delta(unsigned nr, enum object_type type,
 			  void *base, unsigned long base_size,
-			  void *delta, unsigned long delta_size)
+			  void *delta, unsigned long delta_size, unsigned oflags)
 {
 	void *result;
 	unsigned long result_size;
@@ -294,7 +295,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
 	if (!result)
 		die("failed to apply delta");
 	free(delta);
-	write_object(nr, type, result, result_size);
+	write_object(nr, type, result, result_size, oflags);
 }
 
 /*
@@ -302,7 +303,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
  * resolve all the deltified objects that are based on it.
  */
 static void added_object(unsigned nr, enum object_type type,
-			 void *data, unsigned long size)
+			 void *data, unsigned long size, unsigned oflags)
 {
 	struct delta_info **p = &delta_list;
 	struct delta_info *info;
@@ -313,7 +314,7 @@ static void added_object(unsigned nr, enum object_type type,
 			*p = info->next;
 			p = &delta_list;
 			resolve_delta(info->nr, type, data, size,
-				      info->delta, info->size);
+				      info->delta, info->size, oflags);
 			free(info);
 			continue;
 		}
@@ -322,18 +323,19 @@ static void added_object(unsigned nr, enum object_type type,
 }
 
 static void unpack_non_delta_entry(enum object_type type, unsigned long size,
-				   unsigned nr)
+				   unsigned nr, unsigned oflags)
 {
 	void *buf = get_data(size);
 
 	if (!dry_run && buf)
-		write_object(nr, type, buf, size);
+		write_object(nr, type, buf, size, oflags);
 	else
 		free(buf);
 }
 
 static int resolve_against_held(unsigned nr, const struct object_id *base,
-				void *delta_data, unsigned long delta_size)
+				void *delta_data, unsigned long delta_size,
+				unsigned oflags)
 {
 	struct object *obj;
 	struct obj_buffer *obj_buffer;
@@ -344,12 +346,12 @@ static int resolve_against_held(unsigned nr, const struct object_id *base,
 	if (!obj_buffer)
 		return 0;
 	resolve_delta(nr, obj->type, obj_buffer->buffer,
-		      obj_buffer->size, delta_data, delta_size);
+		      obj_buffer->size, delta_data, delta_size, oflags);
 	return 1;
 }
 
 static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
-			       unsigned nr)
+			       unsigned nr, unsigned oflags)
 {
 	void *delta_data, *base;
 	unsigned long base_size;
@@ -366,7 +368,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		if (has_object_file(&base_oid))
 			; /* Ok we have this one */
 		else if (resolve_against_held(nr, &base_oid,
-					      delta_data, delta_size))
+					      delta_data, delta_size, oflags))
 			return; /* we are done */
 		else {
 			/* cannot resolve yet --- queue it */
@@ -428,7 +430,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		}
 	}
 
-	if (resolve_against_held(nr, &base_oid, delta_data, delta_size))
+	if (resolve_against_held(nr, &base_oid, delta_data, delta_size, oflags))
 		return;
 
 	base = read_object_file(&base_oid, &type, &base_size);
@@ -440,11 +442,11 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		has_errors = 1;
 		return;
 	}
-	resolve_delta(nr, type, base, base_size, delta_data, delta_size);
+	resolve_delta(nr, type, base, base_size, delta_data, delta_size, oflags);
 	free(base);
 }
 
-static void unpack_one(unsigned nr)
+static void unpack_one(unsigned nr, unsigned oflags)
 {
 	unsigned shift;
 	unsigned char *pack;
@@ -472,11 +474,11 @@ static void unpack_one(unsigned nr)
 	case OBJ_TREE:
 	case OBJ_BLOB:
 	case OBJ_TAG:
-		unpack_non_delta_entry(type, size, nr);
+		unpack_non_delta_entry(type, size, nr, oflags);
 		return;
 	case OBJ_REF_DELTA:
 	case OBJ_OFS_DELTA:
-		unpack_delta_entry(type, size, nr);
+		unpack_delta_entry(type, size, nr, oflags);
 		return;
 	default:
 		error("bad object type %d", type);
@@ -491,6 +493,7 @@ static void unpack_all(void)
 {
 	int i;
 	struct pack_header *hdr = fill(sizeof(struct pack_header));
+	unsigned oflags;
 
 	nr_objects = ntohl(hdr->hdr_entries);
 
@@ -505,9 +508,14 @@ static void unpack_all(void)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
 	plug_bulk_checkin();
+	oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
 	for (i = 0; i < nr_objects; i++) {
-		unpack_one(i);
-		display_progress(progress, i + 1);
+		int nth = i + 1;
+		unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
+			nr_objects == nth ? HASH_N_OBJECTS_LAST : 0;
+
+		unpack_one(i, oflags | f);
+		display_progress(progress, nth);
 	}
 	unplug_bulk_checkin();
 	stop_progress(&progress);
diff --git a/cache.h b/cache.h
index 5d863f8c5e8..320248a54e0 100644
--- a/cache.h
+++ b/cache.h
@@ -896,6 +896,9 @@ int ie_modified(struct index_state *, const struct cache_entry *, struct stat *,
 #define HASH_FORMAT_CHECK 2
 #define HASH_RENORMALIZE  4
 #define HASH_SILENT 8
+#define HASH_N_OBJECTS 1<<4
+#define HASH_N_OBJECTS_FIRST 1<<5
+#define HASH_N_OBJECTS_LAST 1<<6
 int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
 int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
 
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 4/7] update-index: use a utility function for stdin consumption Ævar Arnfjörð Bjarmason
                                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Remove much of this as a POC for exploring some of what I mentioned in
https://lore.kernel.org/git/220322.86mthinxnn.gmgdl@evledraar.gmail.com/

This commit is obviously not what we *should* do as end-state, but
demonstrates what's needed (I think) for a bare-minimum implementation
of just the "bulk" syncing method for loose objects without the part
where we do the tmp-objdir.c dance.

Performance with this is already quite promising. Benchmarking with:

	git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' \
	    	-p 'rm -rf r.git && git init --bare r.git' \
		'./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack'

I.e. unpacking a small packfile (my dotfiles) yields, on a Linux
ramdisk:

	Benchmark 1: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync
	  Time (mean ± σ):     815.9 ms ±   8.2 ms    [User: 522.9 ms, System: 287.9 ms]
	  Range (min … max):   805.6 ms … 835.9 ms    10 runs

	Benchmark 2: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD
	  Time (mean ± σ):     779.4 ms ±  15.4 ms    [User: 505.7 ms, System: 270.2 ms]
	  Range (min … max):   763.1 ms … 813.9 ms    10 runs

	Summary
	  './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
	    1.05 ± 0.02 times faster than './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'

Doing the same with "strace --summary-only", which probably helps to
emulate cases with slower syscalls, is ~15% faster than using the
tmp-objdir indirection:

	Summary
	  'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
	    1.16 ± 0.01 times faster than 'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'

Which makes sense in terms of syscalls. In my case HEAD has ~101k
calls, and the parent topic is making ~129k calls, with around 2x the
number of unlink(), link() as expected.

Of course some users will want to use the tmp-objdir.c method. So a
version of this commit could be rewritten to come earlier in the
series, with the "bulk" on top being optional.

It seems to me that it's a much better strategy to do this whole thing
in close_loose_object() after passing down the new HASH_N_OBJECTS /
HASH_N_OBJECTS_FIRST / HASH_N_OBJECTS_LAST flags.

Doing that for the "builtin/add.c" and "builtin/unpack-objects.c" code
having its {un,}plug_bulk_checkin() removed here is then just a matter
of passing down a similar set of flags indicating whether we're
dealing with N objects, and if so if we're dealing with the last one
or not.

As we'll see in subsequent commits, doing it this way also effortlessly
integrates with other HASH_* flags. E.g. for "update-index" the code
being rm'd here doesn't handle the interaction with
"HASH_WRITE_OBJECT" properly, but once we've moved all this sync
bootstrapping logic to close_loose_object() we'll never get to it if
we're not actually writing something.

This code currently doesn't use the HASH_N_OBJECTS_FIRST flag, but
that's what we'd use later to optionally call tmp_objdir_create().

Aside: This also changes logic that was a bit confusing and repetitive
in close_loose_object(). Previously we'd first call
batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT), which is just a
shorthand for:

	fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT &&
	fsync_method == FSYNC_METHOD_BATCH

We'd then proceed to call
fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT) later in the same
function, which is just a way of calling fsync_or_die() if:

	fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT

Now we instead just define a local "fsync_loose" variable by checking
"fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT", which shows us that
the previous case of fsync_component_or_die(...)" could just be added
to the existing "fsync_object_files > 0" branch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/add.c            |  3 --
 builtin/unpack-objects.c |  2 -
 builtin/update-index.c   |  4 --
 bulk-checkin.c           | 86 ----------------------------------------
 bulk-checkin.h           |  6 ---
 cache.h                  |  5 ---
 object-file.c            | 37 ++++++++++++-----
 t/t1050-large.sh         |  3 ++
 8 files changed, 29 insertions(+), 117 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 3ffb86a4338..a6d7f4dc1e1 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -670,8 +670,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		string_list_clear(&only_match_skip_worktree, 0);
 	}
 
-	plug_bulk_checkin();
-
 	if (add_renormalize)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
 	else
@@ -682,7 +680,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (chmod_arg && pathspec.nr)
 		exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
-	unplug_bulk_checkin();
 
 finish:
 	if (write_locked_index(&the_index, &lock_file,
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ec40c6fd966..93da436581b 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -507,7 +507,6 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
-	plug_bulk_checkin();
 	oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
 	for (i = 0; i < nr_objects; i++) {
 		int nth = i + 1;
@@ -517,7 +516,6 @@ static void unpack_all(void)
 		unpack_one(i, oflags | f);
 		display_progress(progress, nth);
 	}
-	unplug_bulk_checkin();
 	stop_progress(&progress);
 
 	if (delta_list)
diff --git a/builtin/update-index.c b/builtin/update-index.c
index cbd2b0d633b..95ed3c47b2e 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1118,8 +1118,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	parse_options_start(&ctx, argc, argv, prefix,
 			    options, PARSE_OPT_STOP_AT_NON_OPTION);
 
-	/* optimize adding many objects to the object database */
-	plug_bulk_checkin();
 	while (ctx.argc) {
 		if (parseopt_state != PARSE_OPT_DONE)
 			parseopt_state = parse_options_step(&ctx, options,
@@ -1194,8 +1192,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
-	/* by now we must have added all of the new objects */
-	unplug_bulk_checkin();
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
diff --git a/bulk-checkin.c b/bulk-checkin.c
index a0dca79ba6a..4ffea87f44d 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -9,14 +9,11 @@
 #include "pack.h"
 #include "strbuf.h"
 #include "string-list.h"
-#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
 
-static struct tmp_objdir *bulk_fsync_objdir;
-
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
@@ -85,40 +82,6 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
 	reprepare_packed_git(the_repository);
 }
 
-/*
- * Cleanup after batch-mode fsync_object_files.
- */
-static void do_batch_fsync(void)
-{
-	struct strbuf temp_path = STRBUF_INIT;
-	struct tempfile *temp;
-
-	if (!bulk_fsync_objdir)
-		return;
-
-	/*
-	 * Issue a full hardware flush against a temporary file to ensure
-	 * that all objects are durable before any renames occur. The code in
-	 * fsync_loose_object_bulk_checkin has already issued a writeout
-	 * request, but it has not flushed any writeback cache in the storage
-	 * hardware or any filesystem logs. This fsync call acts as a barrier
-	 * to ensure that the data in each new object file is durable before
-	 * the final name is visible.
-	 */
-	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
-	temp = xmks_tempfile(temp_path.buf);
-	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
-	delete_tempfile(&temp);
-	strbuf_release(&temp_path);
-
-	/*
-	 * Make the object files visible in the primary ODB after their data is
-	 * fully durable.
-	 */
-	tmp_objdir_migrate(bulk_fsync_objdir);
-	bulk_fsync_objdir = NULL;
-}
-
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -313,26 +276,6 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
-void prepare_loose_object_bulk_checkin(void)
-{
-	if (bulk_checkin_plugged && !bulk_fsync_objdir)
-		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
-}
-
-void fsync_loose_object_bulk_checkin(int fd, const char *filename)
-{
-	/*
-	 * If we have a plugged bulk checkin, we issue a call that
-	 * cleans the filesystem page cache but avoids a hardware flush
-	 * command. Later on we will issue a single hardware flush
-	 * before as part of do_batch_fsync.
-	 */
-	if (!bulk_fsync_objdir ||
-	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
-		fsync_or_die(fd, filename);
-	}
-}
-
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -343,32 +286,3 @@ int index_bulk_checkin(struct object_id *oid,
 		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
-
-void plug_bulk_checkin(void)
-{
-	assert(!bulk_checkin_plugged);
-
-	/*
-	 * A temporary object directory is used to hold the files
-	 * while they are not fsynced.
-	 */
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
-		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
-		if (!bulk_fsync_objdir)
-			die(_("Could not create temporary object directory for core.fsyncMethod=batch"));
-
-		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
-	}
-
-	bulk_checkin_plugged = 1;
-}
-
-void unplug_bulk_checkin(void)
-{
-	assert(bulk_checkin_plugged);
-	bulk_checkin_plugged = 0;
-	if (bulk_checkin_state.f)
-		finish_bulk_checkin(&bulk_checkin_state);
-
-	do_batch_fsync();
-}
diff --git a/bulk-checkin.h b/bulk-checkin.h
index 181d3447ff9..76fc33e0c8f 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,14 +6,8 @@
 
 #include "cache.h"
 
-void prepare_loose_object_bulk_checkin(void);
-void fsync_loose_object_bulk_checkin(int fd, const char *filename);
-
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
 
-void plug_bulk_checkin(void);
-void unplug_bulk_checkin(void);
-
 #endif
diff --git a/cache.h b/cache.h
index 320248a54e0..997bf2f57fd 100644
--- a/cache.h
+++ b/cache.h
@@ -1771,11 +1771,6 @@ void write_or_die(int fd, const void *buf, size_t count);
 void fsync_or_die(int fd, const char *);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
-static inline int batch_fsync_enabled(enum fsync_component component)
-{
-	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
-}
-
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/object-file.c b/object-file.c
index cd0ddb49e4b..dbeb3df502d 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1886,19 +1886,37 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
 	hash_object_file_literally(algo, buf, len, type_name(type), oid);
 }
 
+static void sync_loose_object_batch(int fd, const char *filename,
+				    const unsigned oflags)
+{
+	const int last = oflags & HASH_N_OBJECTS_LAST;
+
+	/*
+	 * We're doing a sync_file_range() (or equivalent) for 1..N-1
+	 * objects, and then a "real" fsync() for N. On some OS's
+	 * enabling core.fsync=loose-object && core.fsyncMethod=batch
+	 * improves the performance by a lot.
+	 */
+	if (last || (!last && git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0))
+		fsync_or_die(fd, filename);
+}
+
 /* Finalize a file on disk, and close it. */
-static void close_loose_object(int fd, const char *filename)
+static void close_loose_object(int fd, const char *filename,
+			       const unsigned oflags)
 {
+	int fsync_loose;
+
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		fsync_loose_object_bulk_checkin(fd, filename);
-	else if (fsync_object_files > 0)
+	fsync_loose = fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT;
+
+	if (oflags & HASH_N_OBJECTS && fsync_loose &&
+	    fsync_method == FSYNC_METHOD_BATCH)
+		sync_loose_object_batch(fd, filename, oflags);
+	else if (fsync_object_files > 0 || fsync_loose)
 		fsync_or_die(fd, filename);
-	else
-		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
-				       filename);
 
 out:
 	if (close(fd) != 0)
@@ -1962,9 +1980,6 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	static struct strbuf filename = STRBUF_INIT;
 
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		prepare_loose_object_bulk_checkin();
-
 	loose_object_path(the_repository, &filename, oid);
 
 	fd = create_tmpfile(&tmp_file, filename.buf);
@@ -2015,7 +2030,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd, tmp_file.buf);
+	close_loose_object(fd, tmp_file.buf, flags);
 
 	if (mtime) {
 		struct utimbuf utb;
diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index 4f3aa17c994..1baaa8024c8 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -5,6 +5,9 @@ test_description='adding and checking out large blobs'
 
 . ./test-lib.sh
 
+skip_all='TODO: migrate the builtin/add.c code'
+test_done
+
 test_expect_success setup '
 	# clone does not allow us to pass core.bigfilethreshold to
 	# new repos, so set core.bigfilethreshold globally
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 4/7] update-index: use a utility function for stdin consumption
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                     ` (2 preceding siblings ...)
  2022-03-23  3:47                   ` [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 5/7] update-index: pass down an "oflags" argument Ævar Arnfjörð Bjarmason
                                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c | 36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 95ed3c47b2e..80b96ec5721 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -971,6 +971,25 @@ static enum parse_opt_result reupdate_callback(
 	return 0;
 }
 
+static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
+			    const char *prefix, int prefix_length,
+			    const int nul_term_line, const int set_executable_bit)
+{
+	char *p;
+
+	if (!nul_term_line && buf->buf[0] == '"') {
+		strbuf_reset(unquoted);
+		if (unquote_c_style(unquoted, buf->buf, NULL))
+			die("line is badly quoted");
+		strbuf_swap(buf, unquoted);
+	}
+	p = prefix_path(prefix, prefix_length, buf->buf);
+	update_one(p);
+	if (set_executable_bit)
+		chmod_path(set_executable_bit, p);
+	free(p);
+}
+
 int cmd_update_index(int argc, const char **argv, const char *prefix)
 {
 	int newfd, entries, has_errors = 0, nul_term_line = 0;
@@ -1174,20 +1193,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		struct strbuf unquoted = STRBUF_INIT;
 
 		setup_work_tree();
-		while (getline_fn(&buf, stdin) != EOF) {
-			char *p;
-			if (!nul_term_line && buf.buf[0] == '"') {
-				strbuf_reset(&unquoted);
-				if (unquote_c_style(&unquoted, buf.buf, NULL))
-					die("line is badly quoted");
-				strbuf_swap(&buf, &unquoted);
-			}
-			p = prefix_path(prefix, prefix_length, buf.buf);
-			update_one(p);
-			if (set_executable_bit)
-				chmod_path(set_executable_bit, p);
-			free(p);
-		}
+		while (getline_fn(&buf, stdin) != EOF)
+			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
+					nul_term_line, set_executable_bit);
 		strbuf_release(&unquoted);
 		strbuf_release(&buf);
 	}
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 5/7] update-index: pass down an "oflags" argument
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                     ` (3 preceding siblings ...)
  2022-03-23  3:47                   ` [RFC PATCH 4/7] update-index: use a utility function for stdin consumption Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 6/7] update-index: rename "buf" to "line" Ævar Arnfjörð Bjarmason
                                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

We do nothing with this yet, but will soon.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c | 37 +++++++++++++++++++++----------------
 object-file.c          |  2 +-
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 80b96ec5721..1884124224c 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -267,10 +267,12 @@ static int process_lstat_error(const char *path, int err)
 	return error("lstat(\"%s\"): %s", path, strerror(err));
 }
 
-static int add_one_path(const struct cache_entry *old, const char *path, int len, struct stat *st)
+static int add_one_path(const struct cache_entry *old, const char *path,
+			int len, struct stat *st, const unsigned oflags)
 {
 	int option;
 	struct cache_entry *ce;
+	unsigned f;
 
 	/* Was the old index entry already up-to-date? */
 	if (old && !ce_stage(old) && !ce_match_stat(old, st, 0))
@@ -283,8 +285,8 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
 	fill_stat_cache_info(&the_index, ce, st);
 	ce->ce_mode = ce_mode_from_stat(old, st->st_mode);
 
-	if (index_path(&the_index, &ce->oid, path, st,
-		       info_only ? 0 : HASH_WRITE_OBJECT)) {
+	f = oflags | (info_only ? 0 : HASH_WRITE_OBJECT);
+	if (index_path(&the_index, &ce->oid, path, st, f)) {
 		discard_cache_entry(ce);
 		return -1;
 	}
@@ -320,7 +322,8 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
  *  - it doesn't exist at all in the index, but it is a valid
  *    git directory, and it should be *added* as a gitlink.
  */
-static int process_directory(const char *path, int len, struct stat *st)
+static int process_directory(const char *path, int len, struct stat *st,
+			     const unsigned oflags)
 {
 	struct object_id oid;
 	int pos = cache_name_pos(path, len);
@@ -334,7 +337,7 @@ static int process_directory(const char *path, int len, struct stat *st)
 			if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
 				return 0;
 
-			return add_one_path(ce, path, len, st);
+			return add_one_path(ce, path, len, st, oflags);
 		}
 		/* Should this be an unconditional error? */
 		return remove_one_path(path);
@@ -358,13 +361,14 @@ static int process_directory(const char *path, int len, struct stat *st)
 
 	/* No match - should we add it as a gitlink? */
 	if (!resolve_gitlink_ref(path, "HEAD", &oid))
-		return add_one_path(NULL, path, len, st);
+		return add_one_path(NULL, path, len, st, oflags);
 
 	/* Error out. */
 	return error("%s: is a directory - add files inside instead", path);
 }
 
-static int process_path(const char *path, struct stat *st, int stat_errno)
+static int process_path(const char *path, struct stat *st, int stat_errno,
+			const unsigned oflags)
 {
 	int pos, len;
 	const struct cache_entry *ce;
@@ -395,9 +399,9 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 		return process_lstat_error(path, stat_errno);
 
 	if (S_ISDIR(st->st_mode))
-		return process_directory(path, len, st);
+		return process_directory(path, len, st, oflags);
 
-	return add_one_path(ce, path, len, st);
+	return add_one_path(ce, path, len, st, oflags);
 }
 
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
@@ -446,7 +450,7 @@ static void chmod_path(char flip, const char *path)
 	die("git update-index: cannot chmod %cx '%s'", flip, path);
 }
 
-static void update_one(const char *path)
+static void update_one(const char *path, const unsigned oflags)
 {
 	int stat_errno = 0;
 	struct stat st;
@@ -485,7 +489,7 @@ static void update_one(const char *path)
 		report("remove '%s'", path);
 		return;
 	}
-	if (process_path(path, &st, stat_errno))
+	if (process_path(path, &st, stat_errno, oflags))
 		die("Unable to process path %s", path);
 	report("add '%s'", path);
 }
@@ -776,7 +780,7 @@ static int do_reupdate(int ac, const char **av,
 		 */
 		save_nr = active_nr;
 		path = xstrdup(ce->name);
-		update_one(path);
+		update_one(path, 0);
 		free(path);
 		discard_cache_entry(old);
 		if (save_nr != active_nr)
@@ -973,7 +977,8 @@ static enum parse_opt_result reupdate_callback(
 
 static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
 			    const char *prefix, int prefix_length,
-			    const int nul_term_line, const int set_executable_bit)
+			    const int nul_term_line, const int set_executable_bit,
+			    const unsigned oflags)
 {
 	char *p;
 
@@ -984,7 +989,7 @@ static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
 		strbuf_swap(buf, unquoted);
 	}
 	p = prefix_path(prefix, prefix_length, buf->buf);
-	update_one(p);
+	update_one(p, oflags);
 	if (set_executable_bit)
 		chmod_path(set_executable_bit, p);
 	free(p);
@@ -1157,7 +1162,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 			setup_work_tree();
 			p = prefix_path(prefix, prefix_length, path);
-			update_one(p);
+			update_one(p, 0);
 			if (set_executable_bit)
 				chmod_path(set_executable_bit, p);
 			free(p);
@@ -1195,7 +1200,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		setup_work_tree();
 		while (getline_fn(&buf, stdin) != EOF)
 			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
-					nul_term_line, set_executable_bit);
+					nul_term_line, set_executable_bit, 0);
 		strbuf_release(&unquoted);
 		strbuf_release(&buf);
 	}
diff --git a/object-file.c b/object-file.c
index dbeb3df502d..8999fce2b15 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2211,7 +2211,7 @@ static int index_mem(struct index_state *istate,
 	}
 
 	if (write_object)
-		ret = write_object_file(buf, size, type, oid);
+		ret = write_object_file_flags(buf, size, type, oid, flags);
 	else
 		hash_object_file(the_hash_algo, buf, size, type, oid);
 	if (re_allocated)
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 6/7] update-index: rename "buf" to "line"
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                     ` (4 preceding siblings ...)
  2022-03-23  3:47                   ` [RFC PATCH 5/7] update-index: pass down an "oflags" argument Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

This variable renaming makes a subsequent, more meaningful change
smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 1884124224c..af02ff39756 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1194,15 +1194,15 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (read_from_stdin) {
-		struct strbuf buf = STRBUF_INIT;
+		struct strbuf line = STRBUF_INIT;
 		struct strbuf unquoted = STRBUF_INIT;
 
 		setup_work_tree();
-		while (getline_fn(&buf, stdin) != EOF)
-			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
+		while (getline_fn(&line, stdin) != EOF)
+			line_from_stdin(&line, &unquoted, prefix, prefix_length,
 					nul_term_line, set_executable_bit, 0);
 		strbuf_release(&unquoted);
-		strbuf_release(&buf);
+		strbuf_release(&line);
 	}
 
 	if (split_index > 0) {
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                     ` (5 preceding siblings ...)
  2022-03-23  3:47                   ` [RFC PATCH 6/7] update-index: rename "buf" to "line" Ævar Arnfjörð Bjarmason
@ 2022-03-23  3:47                   ` Ævar Arnfjörð Bjarmason
  2022-03-23  5:51                     ` Neeraj Singh
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  7 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  3:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

As with unpack-objects in a preceding commit, have update-index.c make
use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
mode again for "update-index".

Adding the t/* directory from git.git on a Linux ramdisk is a bit
faster than with the tmp-objdir indirection:

	git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
	Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
	  Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
	  Range (min … max):   285.6 ms … 297.0 ms    10 runs

	Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
	  Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
	  Range (min … max):   267.8 ms … 291.3 ms    10 runs

	Summary
	  'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
	    1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'

And as before, running that with "strace --summary-only" slows things
down (probably mimicking slower I/O a bit). I then get:

	Summary
	  'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
	    1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'

We also go from ~51k syscalls to ~39k, with ~2x the number of link()
and unlink() in ns/batched-fsync.

In the process of doing this conversion we lost the "bulk" mode for
files added on the command-line. I don't think it's useful to optimize
that, but we could if anyone cared.

We've also converted this to a string_list. We could instead walk with
getline_fn() and read one line "ahead" to see what we have left, but I
found that state machine a bit painful, and at least in my testing
buffering everything doesn't harm things. We could still change this
to stream again, at the cost of some getline_fn() twiddling.
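The buffer-then-replay pattern described above can be sketched
generically; process_line() here is a hypothetical stand-in for
line_from_stdin(), and a plain array stands in for git's string_list:

```c
#include <stdio.h>
#include <string.h>

enum { IS_FIRST = 1 << 0, IS_LAST = 1 << 1 };

/* Hypothetical per-line callback standing in for line_from_stdin();
 * here it just records the line and its position flags into "out". */
static void process_line(const char *line, unsigned pos, char *out)
{
	sprintf(out + strlen(out), "[%s%s%s]", line,
		pos & IS_FIRST ? ":first" : "",
		pos & IS_LAST ? ":last" : "");
}

/*
 * Replay buffered lines so each callback knows whether it is
 * handling the first or last of N; as in the patch, FIRST takes
 * precedence when there is only a single line.
 */
static void replay_lines(const char **lines, size_t nr, char *out)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned pos = i == 0 ? IS_FIRST :
			       i + 1 == nr ? IS_LAST : 0;
		process_line(lines[i], pos, out);
	}
}
```

The cost of this approach is holding every line in memory at once;
the benefit is that "am I last?" is a trivial index comparison rather
than a one-line-lookahead state machine.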

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index af02ff39756..c7cbfe1123b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (read_from_stdin) {
+		struct string_list list = STRING_LIST_INIT_NODUP;
 		struct strbuf line = STRBUF_INIT;
 		struct strbuf unquoted = STRBUF_INIT;
+		size_t i, nr;
+		unsigned oflags;
 
 		setup_work_tree();
-		while (getline_fn(&line, stdin) != EOF)
-			line_from_stdin(&line, &unquoted, prefix, prefix_length,
-					nul_term_line, set_executable_bit, 0);
+		while (getline_fn(&line, stdin) != EOF) {
+			size_t len = line.len;
+			char *str = strbuf_detach(&line, NULL);
+
+			string_list_append_nodup(&list, str)->util = (void *)len;
+		}
+
+		nr = list.nr;
+		oflags = nr > 1 ? HASH_N_OBJECTS : 0;
+		for (i = 0; i < nr; i++) {
+			size_t nth = i + 1;
+			unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
+				  nr == nth ? HASH_N_OBJECTS_LAST : 0;
+			struct strbuf buf = STRBUF_INIT;
+			struct string_list_item *item = list.items + i;
+			const size_t len = (size_t)item->util;
+
+			strbuf_attach(&buf, item->string, len, len);
+			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
+					nul_term_line, set_executable_bit,
+					oflags | f);
+			strbuf_release(&buf);
+		}
 		strbuf_release(&unquoted);
-		strbuf_release(&line);
+		string_list_clear(&list, 0);
 	}
 
 	if (split_index > 0) {
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function
  2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
@ 2022-03-23  5:27                     ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23  5:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Tue, Mar 22, 2022 at 8:48 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> This function added in 020406eaa52 (core.fsync: introduce granular
> fsync control infrastructure, 2022-03-10) hasn't been used, and
> appears not to be used by the follow-up series either?
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  cache.h        | 1 -
>  write-or-die.c | 7 -------
>  2 files changed, 8 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 84fafe2ed71..5d863f8c5e8 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1766,7 +1766,6 @@ int copy_file_with_time(const char *dst, const char *src, int mode);
>
>  void write_or_die(int fd, const void *buf, size_t count);
>  void fsync_or_die(int fd, const char *);
> -int fsync_component(enum fsync_component component, int fd);
>  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
>
>  static inline int batch_fsync_enabled(enum fsync_component component)
> diff --git a/write-or-die.c b/write-or-die.c
> index c4fd91b5b43..103698450c3 100644
> --- a/write-or-die.c
> +++ b/write-or-die.c
> @@ -76,13 +76,6 @@ void fsync_or_die(int fd, const char *msg)
>                 die_errno("fsync error on '%s'", msg);
>  }
>
> -int fsync_component(enum fsync_component component, int fd)
> -{
> -       if (fsync_components & component)
> -               return maybe_fsync(fd);
> -       return 0;
> -}
> -
>  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg)
>  {
>         if (fsync_components & component)
> --
> 2.35.1.1428.g1c1a0152d61
>

This helper was put in for Patrick's patch at
https://lore.kernel.org/git/f1e8a7bb3bf0f4c0414819cb1d5579dc08fd2a4f.1646905589.git.ps@pks.im/.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
@ 2022-03-23  5:51                     ` Neeraj Singh
  2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23  5:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Tue, Mar 22, 2022 at 8:48 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> As with unpack-objects in a preceding commit have update-index.c make
> use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
> mode again for "update-index".
>
> Adding the t/* directory from git.git on a Linux ramdisk is a bit
> faster than with the tmp-objdir indirection:
>
>         git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
>         Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
>           Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
>           Range (min … max):   285.6 ms … 297.0 ms    10 runs
>
>         Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
>           Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
>           Range (min … max):   267.8 ms … 291.3 ms    10 runs
>
>         Summary
>           'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
>             1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
>
> And as before running that with "strace --summary-only" slows things
> down a bit (probably mimicking slower I/O a bit). I then get:
>
>         Summary
>           'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
>             1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
>
> We also go from ~51k syscalls to ~39k, with ~2x the number of link()
> and unlink() in ns/batched-fsync.
>
> In the process of doing this conversion we lost the "bulk" mode for
> files added on the command-line. I don't think it's useful to optimize
> that, but we could if anyone cared.
>
> We've also converted this to a string_list, we could walk with
> getline_fn() and get one line "ahead" to see what we have left, but I
> found that state machine a bit painful, and at least in my testing
> buffering this doesn't harm things. But we could also change this to
> stream again, at the cost of some getline_fn() twiddling.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/update-index.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index af02ff39756..c7cbfe1123b 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>         }
>
>         if (read_from_stdin) {
> +               struct string_list list = STRING_LIST_INIT_NODUP;
>                 struct strbuf line = STRBUF_INIT;
>                 struct strbuf unquoted = STRBUF_INIT;
> +               size_t i, nr;
> +               unsigned oflags;
>
>                 setup_work_tree();
> -               while (getline_fn(&line, stdin) != EOF)
> -                       line_from_stdin(&line, &unquoted, prefix, prefix_length,
> -                                       nul_term_line, set_executable_bit, 0);
> +               while (getline_fn(&line, stdin) != EOF) {
> +                       size_t len = line.len;
> +                       char *str = strbuf_detach(&line, NULL);
> +
> +                       string_list_append_nodup(&list, str)->util = (void *)len;
> +               }
> +
> +               nr = list.nr;
> +               oflags = nr > 1 ? HASH_N_OBJECTS : 0;
> +               for (i = 0; i < nr; i++) {
> +                       size_t nth = i + 1;
> +                       unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
> +                                 nr == nth ? HASH_N_OBJECTS_LAST : 0;
> +                       struct strbuf buf = STRBUF_INIT;
> +                       struct string_list_item *item = list.items + i;
> +                       const size_t len = (size_t)item->util;
> +
> +                       strbuf_attach(&buf, item->string, len, len);
> +                       line_from_stdin(&buf, &unquoted, prefix, prefix_length,
> +                                       nul_term_line, set_executable_bit,
> +                                       oflags | f);
> +                       strbuf_release(&buf);
> +               }
>                 strbuf_release(&unquoted);
> -               strbuf_release(&line);
> +               string_list_clear(&list, 0);
>         }
>
>         if (split_index > 0) {
> --
> 2.35.1.1428.g1c1a0152d61
>

This buffering introduces the same potential risk of the
"stdin-feeder" process not being able to see objects right away as my
version had. I'm planning to mitigate the issue by unplugging the bulk
checkin when issuing a verbose report so that anyone who's using that
output to synchronize can still see what they're expecting.

I think the code you've presented here is a lot of diff to accomplish
the same thing that my series does, where this specific update-index
caller has been roto-tilled to provide the needed
begin/end-transaction points.  And I think there will be a lot of
complexity in supporting the same hints for command-line additions
(which is roughly equivalent to the git-add workflow). Every caller
that wants batch treatment will have to either implement a state
machine or implement a buffering mechanism in order to figure out the
begin-end points. Having a separate plug/unplug call eliminates this
complexity on each caller.

Btw, I'm planning in a future series to reduce the system calls
involved in renaming a file by taking advantage of the renameat2
system call and equivalents on other platforms.  There's a pretty
strong motivation to do that on Windows.

Thanks for the concrete code,
-Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23  5:51                     ` Neeraj Singh
@ 2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
  2022-03-23 20:19                         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23  9:48 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh


On Tue, Mar 22 2022, Neeraj Singh wrote:

> On Tue, Mar 22, 2022 at 8:48 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> As with unpack-objects in a preceding commit have update-index.c make
>> use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
>> mode again for "update-index".
>>
>> Adding the t/* directory from git.git on a Linux ramdisk is a bit
>> faster than with the tmp-objdir indirection:
>>
>>         git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
>>         Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
>>           Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
>>           Range (min … max):   285.6 ms … 297.0 ms    10 runs
>>
>>         Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
>>           Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
>>           Range (min … max):   267.8 ms … 291.3 ms    10 runs
>>
>>         Summary
>>           'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
>>             1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
>>
>> And as before running that with "strace --summary-only" slows things
>> down a bit (probably mimicking slower I/O a bit). I then get:
>>
>>         Summary
>>           'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
>>             1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
>>
>> We also go from ~51k syscalls to ~39k, with ~2x the number of link()
>> and unlink() in ns/batched-fsync.
>>
>> In the process of doing this conversion we lost the "bulk" mode for
>> files added on the command-line. I don't think it's useful to optimize
>> that, but we could if anyone cared.
>>
>> We've also converted this to a string_list, we could walk with
>> getline_fn() and get one line "ahead" to see what we have left, but I
>> found that state machine a bit painful, and at least in my testing
>> buffering this doesn't harm things. But we could also change this to
>> stream again, at the cost of some getline_fn() twiddling.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/update-index.c | 31 +++++++++++++++++++++++++++----
>>  1 file changed, 27 insertions(+), 4 deletions(-)
>>
>> diff --git a/builtin/update-index.c b/builtin/update-index.c
>> index af02ff39756..c7cbfe1123b 100644
>> --- a/builtin/update-index.c
>> +++ b/builtin/update-index.c
>> @@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>>         }
>>
>>         if (read_from_stdin) {
>> +               struct string_list list = STRING_LIST_INIT_NODUP;
>>                 struct strbuf line = STRBUF_INIT;
>>                 struct strbuf unquoted = STRBUF_INIT;
>> +               size_t i, nr;
>> +               unsigned oflags;
>>
>>                 setup_work_tree();
>> -               while (getline_fn(&line, stdin) != EOF)
>> -                       line_from_stdin(&line, &unquoted, prefix, prefix_length,
>> -                                       nul_term_line, set_executable_bit, 0);
>> +               while (getline_fn(&line, stdin) != EOF) {
>> +                       size_t len = line.len;
>> +                       char *str = strbuf_detach(&line, NULL);
>> +
>> +                       string_list_append_nodup(&list, str)->util = (void *)len;
>> +               }
>> +
>> +               nr = list.nr;
>> +               oflags = nr > 1 ? HASH_N_OBJECTS : 0;
>> +               for (i = 0; i < nr; i++) {
>> +                       size_t nth = i + 1;
>> +                       unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
>> +                                 nr == nth ? HASH_N_OBJECTS_LAST : 0;
>> +                       struct strbuf buf = STRBUF_INIT;
>> +                       struct string_list_item *item = list.items + i;
>> +                       const size_t len = (size_t)item->util;
>> +
>> +                       strbuf_attach(&buf, item->string, len, len);
>> +                       line_from_stdin(&buf, &unquoted, prefix, prefix_length,
>> +                                       nul_term_line, set_executable_bit,
>> +                                       oflags | f);
>> +                       strbuf_release(&buf);
>> +               }
>>                 strbuf_release(&unquoted);
>> -               strbuf_release(&line);
>> +               string_list_clear(&list, 0);
>>         }
>>
>>         if (split_index > 0) {
>> --
>> 2.35.1.1428.g1c1a0152d61
>>
>
> This buffering introduces the same potential risk of the
> "stdin-feeder" process not being able to see objects right away as my
> version had. I'm planning to mitigate the issue by unplugging the bulk
> checkin when issuing a verbose report so that anyone who's using that
> output to synchronize can still see what they're expecting.

I was rather terse in the commit message; I meant (but dropped some
words) "doesn't harm things for performance [in the above test]". But
converting this to a string_list is clearly a regression that
shouldn't be kept.

I just wanted to demonstrate a method of doing this by passing down
the HASH_* flags, and found that writing the state machine to "buffer
ahead" by one line so that we'd eventually know in the loop whether
we're on the "last" line was tedious, so I came up with this POC. But
we clearly shouldn't lose the "streaming" aspect.

But anyway, now that I look at this again the smart thing here
(surely?) is to keep the simple getline() loop and to never issue a
HASH_N_OBJECTS_LAST for the Nth item; instead we should in this case
do the "checkpoint fsync" at the point where we write the actual
index.

Because an existing redundancy in your series is that you'll do the
fsync() the same way for "git unpack-objects" as for "git
{update-index,add}".

I.e. in the former case adding the N objects is all we're doing, so the
"last object" is the point at which we need to flush the previous N to
disk.

But for "update-index/add" you'll do at least 2 fsync()'s in the bulk
mode, when it should be one. I.e. the equivalent of (leaving aside the
tmp-objdir migration part of it), if writing objects A && B:

    ## METHOD ONE
    # A
    write(objects/A.tmp)
    bulk_fsync(objects/A.tmp)
    rename(objects/A.tmp, objects/A)
    # B
    write(objects/B.tmp)
    bulk_fsync(objects/B.tmp)
    rename(objects/B.tmp, objects/B)
    # "cookie"
    write(bulk_fsync_XXXXXX)
    fsync(bulk_fsync_XXXXXX)
    # ref
    write(INDEX.tmp, $(git rev-parse B))
    fsync(INDEX.tmp)
    rename(INDEX.tmp, INDEX)

This series on top changes that so we know that we're doing N, so we
don't need the separate "cookie"; we can just use the B object as the
cookie, since we know it comes last:

    ## METHOD TWO
    # A -- SAME as above
    write(objects/A.tmp)
    bulk_fsync(objects/A.tmp)
    rename(objects/A.tmp, objects/A)
    # B -- SAME as above, with s/bulk_fsync/fsync/
    write(objects/B.tmp)
    fsync(objects/B.tmp)
    rename(objects/B.tmp, objects/B)
    # "cookie" -- GONE!
    # ref -- SAME
    write(INDEX.tmp, $(git rev-parse B))
    fsync(INDEX.tmp)
    rename(INDEX.tmp, INDEX)

But really, we should instead realize that we're not doing
"unpack-objects", but have a "ref update" at the end (whether that's a
ref, or an index etc.) and do:

    ## METHOD THREE
    # A -- SAME as above
    write(objects/A.tmp)
    bulk_fsync(objects/A.tmp)
    rename(objects/A.tmp, objects/A)
    # B -- SAME as the first
    write(objects/B.tmp)
    bulk_fsync(objects/B.tmp)
    rename(objects/B.tmp, objects/B)
    # ref -- SAME
    write(INDEX.tmp, $(git rev-parse B))
    fsync(INDEX.tmp)
    rename(INDEX.tmp, INDEX)

Which cuts our number of fsync() operations down from 2 to 1, in
addition to removing the need for the "cookie", which is only there
because we didn't keep track of where we were in the sequence, as my
2/7 and 5/7 do.

And it would be the same for tmp-objdir; the rename dance is a bit
different, but we'd do the "full" fsync() on the INDEX.tmp, then
migrate() the tmp-objdir, and once that's done do the final:

    rename(INDEX.tmp, INDEX)

I.e. we'd fsync() the content once, and only have the rename() or
link() operations left. For POSIX we'd need a few more fsync() calls
for the metadata, but this (i.e. your) series already makes the hard
assumption that we don't need to do that for rename().

> I think the code you've presented here is a lot of diff to accomplish
> the same thing that my series does, where this specific update-index
> caller has been roto-tilled to provide the needed
> begin/end-transaction points.

Any caller of these APIs will need the "unsigned oflags" sooner than
later anyway, as they need to pass down e.g. HASH_WRITE_OBJECT. We just
do it slightly earlier.

And because of that it's really not the same in the general case; I
think it's a better approach. You've already got a bug in yours of
needlessly setting up the bulk checkin for !HASH_WRITE_OBJECT in
update-index, which this neatly solves by deferring the "bulk"
mechanism until the codepath that's past that check and into the
"real" object writing.

We can also die() or error out in the object writing before ever
getting to write the object; in that case your version does some
setup that it then needs to tear down again, which we avoid by
deferring the setup until the last moment...

> And I think there will be a lot of
> complexity in supporting the same hints for command-line additions
> (which is roughly equivalent to the git-add workflow).

I left that out due to Junio's comment in
https://lore.kernel.org/git/xmqqzgljyz34.fsf@gitster.g/; i.e. I don't
see why we'd find it worthwhile to optimize that case, but we easily
could (especially per the "just sync the INDEX.tmp" above).

But even if we don't do "THREE" above I think it's still easy: for
"TWO" we already have a parse_options() state machine to parse argv
as it comes in. Doing the fsync() on the last object is just a matter
of "looking ahead" there.

> Every caller
> that wants batch treatment will have to either implement a state
> machine or implement a buffering mechanism in order to figure out the
> begin-end points. Having a separate plug/unplug call eliminates this
> complexity on each caller.

This is subjective, but I really think that's rather easy to do, and
much easier to reason about than the on-the-side global state your
method demands: to avoid modifying these callers it has them all
consult singleton state via bulk-checkin.c and cache.h.

That API also currently assumes single-threaded writers; if we start
writing some of this in parallel in e.g. "unpack-objects" we'd need
mutexes in bulk-checkin.[ch]. Isn't that a lot easier when the caller
instead knows something about the special nature of the transaction
it's interacting with, and that the 1st and last items are important
(for a "BEGIN" and a "FLUSH")?

> Btw, I'm planning in a future series to reduce the system calls
> involved in renaming a file by taking advantage of the renameat2
> system call and equivalents on other platforms.  There's a pretty
> strong motivation to do that on Windows.

What do you have in mind for renameat2() specifically?  I.e. which of
the 3x flags it implements will benefit us? RENAME_NOREPLACE to "move"
the tmp_OBJ to an eventual OBJ?

Generally: There's some low-hanging fruit there. E.g. for tmp-objdir
we slavishly go through the motions of writing a tmp_OBJ, writing
(and possibly syncing) it, then renaming that tmp_OBJ to OBJ.

We could clearly just avoid that in some/all cases that use
tmp-objdir. I.e. we're writing to a temporary store anyway, so why the
tmp_OBJ files? We could just write to the final destinations instead,
they're not reachable (by ref or OID lookup) from anyone else yet.

But even then I don't see how you'd get away with reducing some
classes of syscalls past the 2x increase for some (leading to an
overall increase, but not a ~2x overall increase, as noted in
https://lore.kernel.org/git/RFC-patch-7.7-481f1d771cb-20220323T033928Z-avarab@gmail.com/)
as long as you use the tmp-objdir API. It's always going to have to
write tmpdir/OBJ and link()/rename() that to OBJ.

Now, I do think there's an easy way to do it by extending the API use
I've introduced in this RFC. I.e. we'd just do:

    ## METHOD FOUR
    # A -- SAME as THREE, except no rename()
    write(objects/A.tmp)
    bulk_fsync(objects/A.tmp)
    # B -- SAME as THREE, except no rename()
    write(objects/B.tmp)
    bulk_fsync(objects/B.tmp)
    # ref -- SAME
    write(INDEX.tmp, $(git rev-parse B))
    fsync(INDEX.tmp)
    # NEW: do all the renames at the end:
    rename(objects/A.tmp, objects/A)
    rename(objects/B.tmp, objects/B)
    rename(INDEX.tmp, INDEX)

That seems like an obvious win to me in any case. I.e. the tmp-objdir
API isn't really a close fit for what we *really* want to do in this
case.

I.e. the reason it does everything this way is because it was explicitly
designed for 722ff7f876c (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), where it's the right trade-off,
because we'd like to cheaply "rm -rf" the whole thing if e.g. the
"pre-receive" hook rejects the push.

*AND* because it's made for the case of other things concurrently
needing access to those objects. So pedantically you would need it for
some modes of "git update-index", but not e.g. "git unpack-objects"
where we really are expecting to keep all of them.

> Thanks for the concrete code,

...but no thanks? I.e. it would be useful to know explicitly whether
you're interested in or open to running with some of the approach in
this RFC.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
@ 2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
  2022-03-24  2:04       ` Neeraj Singh
  3 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 13:26 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
> [..
> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index 889522956e4..a3798dfc334 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -628,6 +628,13 @@ core.fsyncMethod::
>  * `writeout-only` issues pagecache writeback requests, but depending on the
>    filesystem and storage hardware, data added to the repository may not be
>    durable in the event of a system crash. This is the default mode on macOS.
> +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> +  updates in the disk writeback cache and then does a single full fsync of
> +  a dummy file to trigger the disk cache flush at the end of the operation.

I think adding a \n\n here would help make this more readable & break
up the flow a bit. I.e. just add a "+" on its own line, followed by
"Currently...".

> +  Currently `batch` mode only applies to loose-object files. Other repository
> +  data is made durable as if `fsync` was specified. This mode is expected to
> +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
> +  and on Windows for repos stored on NTFS or ReFS filesystems.
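
Concretely (my illustration of the suggestion, not part of the
patch), with the AsciiDoc list-continuation "+" the hunk would read:

```txt
* `batch` enables a mode that uses writeout-only flushes to stage multiple
  updates in the disk writeback cache and then does a single full fsync of
  a dummy file to trigger the disk cache flush at the end of the operation.
+
Currently `batch` mode only applies to loose-object files. Other repository
data is made durable as if `fsync` was specified. ...
```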

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c
  2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                     ` (6 preceding siblings ...)
  2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                   ` Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
                                       ` (6 more replies)
  7 siblings, 7 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Quite a bit less WIP-y but still RFC version of patches to
integrate/squash in some form into Neeraj's fsync() series at
https://lore.kernel.org/git/pull.1134.v2.git.1647760560.gitgitgadget@gmail.com/

As noted in v1 this starts (in 2/7) by removing the tmp-objdir part
of the "bulk checkin" as a POC. Clearly Neeraj wants to keep it, so
we should have it eventually. But this series argues, in both patch
organization and configurability (see the new 7/7!), that "do
quarantine" should be split out from "do bulk fsync" and made
optional.

The new documentation in 7/7 currently documents a vaporware setting,
but it's what we'd get if this were rebased early into Neeraj's
series, and we made the tmp-objdir part contingent on a configuration
setting.

Unlike v1 (I overzealously ripped out some unrelated bulk-checkin.c
code then) this doesn't fail any tests.

But most importantly the whole fsync() scheme here is *much better*
in terms of semantics. We still do away with the "cookie" placeholder
to force an fsync, but now, as can be seen in 4/7 and 5/7, we'll
"fsync()" by using the updated index at the end as our cookie.

I.e. there's no need to introduce a "bulk_fsync" cookie file to force
an fsync() if we can instead alter the relevant calling code to be
aware of the new "fsync() transaction". It can then do the "flush" by
doing the fsync() on the file it wanted to update anyway (now an
index, in the future a ref). So this implements the "METHOD THREE"
noted in [1].



1. https://lore.kernel.org/git/220323.86sfr9ndpr.gmgdl@evledraar.gmail.com/

Ævar Arnfjörð Bjarmason (7):
  unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  object-file: pass down unpack-objects.c flags for "bulk" checkin
  update-index: pass down skeleton "oflags" argument
  update-index: have the index fsync() flush the loose objects
  add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync()
  fsync docs: update for new syncing semantics
  fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old

 Documentation/config/core.txt | 101 +++++++++++++++++++++++++++++-----
 builtin/add.c                 |   6 +-
 builtin/unpack-objects.c      |  62 +++++++++++----------
 builtin/update-index.c        |  39 ++++++-------
 bulk-checkin.c                |  74 -------------------------
 bulk-checkin.h                |   3 -
 cache.h                       |  10 ++--
 object-file.c                 |  39 +++++++++----
 read-cache.c                  |  37 ++++++++++++-
 9 files changed, 214 insertions(+), 157 deletions(-)

Range-diff against v1:
1:  e03c119c784 < -:  ----------- write-or-die.c: remove unused fsync_component() function
2:  00dbffc2331 = 1:  98921aa2052 unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
3:  beda9f99529 ! 2:  c6f776fc2bc object-file: pass down unpack-objects.c flags for "bulk" checkin
    @@ Commit message
         the previous case of fsync_component_or_die(...)" could just be added
         to the existing "fsync_object_files > 0" branch.
     
    -    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Note: This commit reverts much of "core.fsyncmethod: batched disk
    +    flushes for loose-objects". We'll set up new structures to bring what
    +    it was doing back in a different way. I.e. to do the tmp-objdir
    +    plug-in in object-file.c
     
    - ## builtin/add.c ##
    -@@ builtin/add.c: int cmd_add(int argc, const char **argv, const char *prefix)
    - 		string_list_clear(&only_match_skip_worktree, 0);
    - 	}
    - 
    --	plug_bulk_checkin();
    --
    - 	if (add_renormalize)
    - 		exit_status |= renormalize_tracked_files(&pathspec, flags);
    - 	else
    -@@ builtin/add.c: int cmd_add(int argc, const char **argv, const char *prefix)
    - 
    - 	if (chmod_arg && pathspec.nr)
    - 		exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
    --	unplug_bulk_checkin();
    - 
    - finish:
    - 	if (write_locked_index(&the_index, &lock_file,
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/unpack-objects.c ##
     @@ builtin/unpack-objects.c: static void unpack_all(void)
    @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const
     
      ## bulk-checkin.c ##
     @@
    +  */
    + #include "cache.h"
    + #include "bulk-checkin.h"
    +-#include "lockfile.h"
    + #include "repository.h"
    + #include "csum-file.h"
      #include "pack.h"
      #include "strbuf.h"
    - #include "string-list.h"
    +-#include "string-list.h"
     -#include "tmp-objdir.h"
      #include "packfile.h"
      #include "object-store.h"
    @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
      		       int fd, size_t size, enum object_type type,
      		       const char *path, unsigned flags)
     @@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
    - 		finish_bulk_checkin(&bulk_checkin_state);
    - 	return status;
    - }
    --
    --void plug_bulk_checkin(void)
    --{
    --	assert(!bulk_checkin_plugged);
    + void plug_bulk_checkin(void)
    + {
    + 	assert(!bulk_checkin_plugged);
     -
     -	/*
     -	 * A temporary object directory is used to hold the files
    @@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
     -		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
     -	}
     -
    --	bulk_checkin_plugged = 1;
    --}
    --
    --void unplug_bulk_checkin(void)
    --{
    --	assert(bulk_checkin_plugged);
    --	bulk_checkin_plugged = 0;
    --	if (bulk_checkin_state.f)
    --		finish_bulk_checkin(&bulk_checkin_state);
    + 	bulk_checkin_plugged = 1;
    + }
    + 
    +@@ bulk-checkin.c: void unplug_bulk_checkin(void)
    + 	bulk_checkin_plugged = 0;
    + 	if (bulk_checkin_state.f)
    + 		finish_bulk_checkin(&bulk_checkin_state);
     -
     -	do_batch_fsync();
    --}
    + }
     
      ## bulk-checkin.h ##
     @@
    @@ bulk-checkin.h
      int index_bulk_checkin(struct object_id *oid,
      		       int fd, size_t size, enum object_type type,
      		       const char *path, unsigned flags);
    - 
    --void plug_bulk_checkin(void);
    --void unplug_bulk_checkin(void);
    --
    - #endif
     
      ## cache.h ##
    -@@ cache.h: void write_or_die(int fd, const void *buf, size_t count);
    - void fsync_or_die(int fd, const char *);
    +@@ cache.h: void fsync_or_die(int fd, const char *);
    + int fsync_component(enum fsync_component component, int fd);
      void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
      
     -static inline int batch_fsync_enabled(enum fsync_component component)
    @@ object-file.c: static int write_loose_object(const struct object_id *oid, char *
      
      	if (mtime) {
      		struct utimbuf utb;
    -
    - ## t/t1050-large.sh ##
    -@@ t/t1050-large.sh: test_description='adding and checking out large blobs'
    - 
    - . ./test-lib.sh
    - 
    -+skip_all='TODO: migrate the builtin/add.c code'
    -+test_done
    -+
    - test_expect_success setup '
    - 	# clone does not allow us to pass core.bigfilethreshold to
    - 	# new repos, so set core.bigfilethreshold globally
5:  a1474968991 ! 3:  4df8012100a update-index: pass down an "oflags" argument
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    update-index: pass down an "oflags" argument
    +    update-index: pass down skeleton "oflags" argument
     
    -    We do nothing with this yet, but will soon.
    +    As with a preceding change to "unpack-objects" add an "oflags" going
    +    from cmd_update_index() all the way down to the code in
    +    object-file.c. Note also how index_mem() will now call
    +    write_object_file_flags().
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ builtin/update-index.c: static int do_reupdate(int ac, const char **av,
      		free(path);
      		discard_cache_entry(old);
      		if (save_nr != active_nr)
    -@@ builtin/update-index.c: static enum parse_opt_result reupdate_callback(
    - 
    - static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
    - 			    const char *prefix, int prefix_length,
    --			    const int nul_term_line, const int set_executable_bit)
    -+			    const int nul_term_line, const int set_executable_bit,
    -+			    const unsigned oflags)
    - {
    - 	char *p;
    - 
    -@@ builtin/update-index.c: static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
    - 		strbuf_swap(buf, unquoted);
    - 	}
    - 	p = prefix_path(prefix, prefix_length, buf->buf);
    --	update_one(p);
    -+	update_one(p, oflags);
    - 	if (set_executable_bit)
    - 		chmod_path(set_executable_bit, p);
    - 	free(p);
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
      
      			setup_work_tree();
    @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const
      				chmod_path(set_executable_bit, p);
      			free(p);
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    - 		setup_work_tree();
    - 		while (getline_fn(&buf, stdin) != EOF)
    - 			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
    --					nul_term_line, set_executable_bit);
    -+					nul_term_line, set_executable_bit, 0);
    - 		strbuf_release(&unquoted);
    - 		strbuf_release(&buf);
    - 	}
    + 				strbuf_swap(&buf, &unquoted);
    + 			}
    + 			p = prefix_path(prefix, prefix_length, buf.buf);
    +-			update_one(p);
    ++			update_one(p, 0);
    + 			if (set_executable_bit)
    + 				chmod_path(set_executable_bit, p);
    + 			free(p);
     
      ## object-file.c ##
     @@ object-file.c: static int index_mem(struct index_state *istate,
7:  481f1d771cb ! 4:  61f4f3d7ef4 update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
    +    update-index: have the index fsync() flush the loose objects
     
         As with unpack-objects in a preceding commit have update-index.c make
         use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
    @@ Commit message
         Adding the t/* directory from git.git on a Linux ramdisk is a bit
         faster than with the tmp-objdir indirection:
     
    -            git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
    -            Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
    -              Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
    -              Range (min … max):   285.6 ms … 297.0 ms    10 runs
    +            $ git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' -p 'rm -rf repo/.git/objects/* repo/.git/index' './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' --warmup 1 -r 10
    +
    +            Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync
    +              Time (mean ± σ):     281.1 ms ±   2.6 ms    [User: 186.2 ms, System: 92.3 ms]
    +              Range (min … max):   278.3 ms … 287.0 ms    10 runs
     
    -            Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
    -              Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
    -              Range (min … max):   267.8 ms … 291.3 ms    10 runs
    +            Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD
    +              Time (mean ± σ):     265.9 ms ±   2.6 ms    [User: 181.7 ms, System: 82.1 ms]
    +              Range (min … max):   262.0 ms … 270.3 ms    10 runs
     
                 Summary
    -              'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
    -                1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
    +              './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
    +                1.06 ± 0.01 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'
     
         And as before running that with "strace --summary-only" slows things
         down a bit (probably mimicking slower I/O). I then get:
     
                 Summary
    -              'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
    -                1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
    +              'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
    +                1.19 ± 0.03 times faster than 'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'
    +
    +    This one has a twist though: instead of fsync()-ing on the last
    +    object we write, we defer the fsync() until we write the index
    +    itself. This is outlined in [1] (as "METHOD THREE").
    +
    +    Because of this, under FSYNC_METHOD_BATCH we'll mark the N
    +    objects (possibly only one, because we're lazy) with HASH_N_OBJECTS, and
    +    we now even support doing this via N arguments on the command-line.
    +
    +    Then we won't fsync() any of it, but we will rename it
    +    in-place (which, if we were still using the tmp-objdir, would leave it
    +    "staged" in the tmp-objdir).
    +
    +    We'll then have the fsync() for the index update "flush" that out, and
    +    thus avoid two fsync() calls when one will do.
    +
    +    Running this with the "git hyperfine" command mentioned in a preceding
    +    commit with "strace --summary-only" shows that we do 1 fsync() now
    +    instead of 2, and have one more sync_file_range(), as expected.
     
         We also go from ~51k syscalls to ~39k, with ~2x the number of link()
    -    and unlink() in ns/batched-fsync.
    +    and unlink() in ns/batched-fsync, and of course one fsync() instead of
    +    two.
     
    -    In the process of doing this conversion we lost the "bulk" mode for
    -    files added on the command-line. I don't think it's useful to optimize
    -    that, but we could if anyone cared.
    +    The flow of this code isn't quite set up for re-plugging the
    +    tmp-objdir back in. In particular we no longer pass
    +    HASH_N_OBJECTS_FIRST (but doing so would be trivial), and there's no
    +    HASH_N_OBJECTS_LAST.
     
    -    We've also converted this to a string_list, we could walk with
    -    getline_fn() and get one line "ahead" to see what we have left, but I
    -    found that state machine a bit painful, and at least in my testing
    -    buffering this doesn't harm things. But we could also change this to
    -    stream again, at the cost of some getline_fn() twiddling.
    +    So this and other callers would need some light transaction-y API, or
    +    to otherwise pass a "yes, I'd like to flush it" down to
    +    finalize_hashfile(), but doing so will be trivial.
    +
    +    And since we've started structuring it this way it'll become easy to
    +    do any arbitrary number of things down the line that would "bulk
    +    fsync" before the final fsync(). Now we write some objects and fsync()
    +    on the index, but between those two we could do any number of other
    +    things where we'd defer the fsync().
    +
    +    This sort of thing might be especially interesting for "git repack"
    +    when it writes e.g. a *.bitmap, *.rev, *.pack and *.idx. In that case
    +    we could skip the fsync() on all of those, and only do it on the *.idx
    +    before we renamed it in-place. I *think* nothing cares about a *.pack
    +    without an *.idx, but even then we could fsync *.idx, rename *.pack,
    +    rename *.idx and still safely do only one fsync(). See "git show
    +    --first-parent" on 62874602032 (Merge branch
    +    'tb/pack-finalize-ordering' into maint, 2021-10-12) for a good
    +    overview of the code involved in that.
    +
    +    1. https://lore.kernel.org/git/220323.86sfr9ndpr.gmgdl@evledraar.gmail.com/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/update-index.c ##
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    + 
    + 			setup_work_tree();
    + 			p = prefix_path(prefix, prefix_length, path);
    +-			update_one(p, 0);
    ++			update_one(p, HASH_N_OBJECTS);
    + 			if (set_executable_bit)
    + 				chmod_path(set_executable_bit, p);
    + 			free(p);
    +@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    + 				strbuf_swap(&buf, &unquoted);
    + 			}
    + 			p = prefix_path(prefix, prefix_length, buf.buf);
    +-			update_one(p, 0);
    ++			update_one(p, HASH_N_OBJECTS);
    + 			if (set_executable_bit)
    + 				chmod_path(set_executable_bit, p);
    + 			free(p);
    +@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    + 				exit(128);
    + 			unable_to_lock_die(get_index_file(), lock_error);
    + 		}
    +-		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
    ++		if (write_locked_index(&the_index, &lock_file,
    ++				       COMMIT_LOCK | WLI_NEED_LOOSE_FSYNC))
    + 			die("Unable to write new index file");
      	}
      
    - 	if (read_from_stdin) {
    -+		struct string_list list = STRING_LIST_INIT_NODUP;
    - 		struct strbuf line = STRBUF_INIT;
    - 		struct strbuf unquoted = STRBUF_INIT;
    -+		size_t i, nr;
    -+		unsigned oflags;
    +
    + ## cache.h ##
    +@@ cache.h: void ensure_full_index(struct index_state *istate);
    + /* For use with `write_locked_index()`. */
    + #define COMMIT_LOCK		(1 << 0)
    + #define SKIP_IF_UNCHANGED	(1 << 1)
    ++#define WLI_NEED_LOOSE_FSYNC	(1 << 2)
      
    - 		setup_work_tree();
    --		while (getline_fn(&line, stdin) != EOF)
    --			line_from_stdin(&line, &unquoted, prefix, prefix_length,
    --					nul_term_line, set_executable_bit, 0);
    -+		while (getline_fn(&line, stdin) != EOF) {
    -+			size_t len = line.len;
    -+			char *str = strbuf_detach(&line, NULL);
    -+
    -+			string_list_append_nodup(&list, str)->util = (void *)len;
    -+		}
    + /*
    +  * Write the index while holding an already-taken lock. Close the lock,
    +
    + ## read-cache.c ##
    +@@ read-cache.c: static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
    + 	int ieot_entries = 1;
    + 	struct index_entry_offset_table *ieot = NULL;
    + 	int nr, nr_threads;
    ++	unsigned int wflags = FSYNC_COMPONENT_INDEX;
     +
    -+		nr = list.nr;
    -+		oflags = nr > 1 ? HASH_N_OBJECTS : 0;
    -+		for (i = 0; i < nr; i++) {
    -+			size_t nth = i + 1;
    -+			unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
    -+				  nr == nth ? HASH_N_OBJECTS_LAST : 0;
    -+			struct strbuf buf = STRBUF_INIT;
    -+			struct string_list_item *item = list.items + i;
    -+			const size_t len = (size_t)item->util;
     +
    -+			strbuf_attach(&buf, item->string, len, len);
    -+			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
    -+					nul_term_line, set_executable_bit,
    -+					oflags | f);
    -+			strbuf_release(&buf);
    -+		}
    - 		strbuf_release(&unquoted);
    --		strbuf_release(&line);
    -+		string_list_clear(&list, 0);
    - 	}
    ++	/*
    ++	 * TODO: This is abuse of the API recently modified
    ++	 * finalize_hashfile() which reveals a shortcoming of its
    ++	 * "fsync" design.
    ++	 *
    ++	 * I.e. It expects a "enum fsync_component component" label,
    ++	 * but here we're passing it an OR of the two, knowing that
    ++	 * it'll call fsync_component_or_die() which (in
    ++	 * write-or-die.c) will do "(fsync_components & wflags)" (to
    ++	 * our "wflags" here).
    ++	 *
    ++	 * But the API really should be changed to explicitly take
    ++	 * such flags, because in this case we'd like to fsync() the
    ++	 * index if we're in the bulk mode, *even if* our
    ++	 * "core.fsync=index" isn't configured.
    ++	 *
    ++	 * That's because at this point we've been queuing up object
    ++	 * writes that we didn't fsync(), and are going to use this
    ++	 * fsync() to "flush" the whole thing. Doing it this way
    ++	 * avoids redundantly calling fsync() twice when once will do.
    ++	 */
    ++	if (fsync_method == FSYNC_METHOD_BATCH &&
    ++	    (flags & WLI_NEED_LOOSE_FSYNC))
    ++		wflags |= FSYNC_COMPONENT_LOOSE_OBJECT;
    + 
    + 	f = hashfd(tempfile->fd, tempfile->filename.buf);
    + 
    +@@ read-cache.c: static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
    + 	if (!alternate_index_output && (flags & COMMIT_LOCK))
    + 		csum_fsync_flag = CSUM_FSYNC;
    + 
    +-	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
    ++	finalize_hashfile(f, istate->oid.hash, wflags,
    + 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
      
    - 	if (split_index > 0) {
    + 	if (close_tempfile_gently(tempfile)) {
-:  ----------- > 5:  2bf14fd4946 add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync()
6:  4fad333e9a1 ! 6:  c20301d7967 update-index: rename "buf" to "line"
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    update-index: rename "buf" to "line"
    -
    -    This variable renaming makes a subsequent more meaningful change
    -    smaller.
    +    fsync docs: update for new syncing semantics
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## builtin/update-index.c ##
    -@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    - 	}
    - 
    - 	if (read_from_stdin) {
    --		struct strbuf buf = STRBUF_INIT;
    -+		struct strbuf line = STRBUF_INIT;
    - 		struct strbuf unquoted = STRBUF_INIT;
    - 
    - 		setup_work_tree();
    --		while (getline_fn(&buf, stdin) != EOF)
    --			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
    -+		while (getline_fn(&line, stdin) != EOF)
    -+			line_from_stdin(&line, &unquoted, prefix, prefix_length,
    - 					nul_term_line, set_executable_bit, 0);
    - 		strbuf_release(&unquoted);
    --		strbuf_release(&buf);
    -+		strbuf_release(&line);
    - 	}
    + ## Documentation/config/core.txt ##
    +@@ Documentation/config/core.txt: core.fsyncMethod::
    +   filesystem and storage hardware, data added to the repository may not be
    +   durable in the event of a system crash. This is the default mode on macOS.
    + * `batch` enables a mode that uses writeout-only flushes to stage multiple
    +-  updates in the disk writeback cache and then does a single full fsync of
    +-  a dummy file to trigger the disk cache flush at the end of the operation.
    +-  Currently `batch` mode only applies to loose-object files. Other repository
    +-  data is made durable as if `fsync` was specified. This mode is expected to
    +-  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
    +-  and on Windows for repos stored on NTFS or ReFS filesystems.
    ++  updates in the disk writeback cache, followed by a single full fsync()
    ++  on the "last" file to trigger the disk cache flush at the end of the
    ++  operation.
    +++
    ++Other repository data is made durable as if `fsync` was
    ++specified. This mode is expected to be as safe as `fsync` on macOS for
    ++repos stored on HFS+ or APFS filesystems and on Windows for repos
    ++stored on NTFS or ReFS filesystems.
    +++
    ++The `batch` mode currently only applies to loose-object files and will
    ++kick in when using the linkgit:git-unpack-objects[1] and
    ++linkgit:update-index[1] commands. Note that the "last" file to be
    ++synced may be the last object, as in the case of
    ++linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
    ++"ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
    ++batch syncing of the loose objects may be deferred until a subsequent
    ++fsync() to a file that makes them "active".
      
    - 	if (split_index > 0) {
    + core.fsyncObjectFiles::
    + 	This boolean will enable 'fsync()' when writing object files.
4:  2c5395a3716 ! 7:  a5951366c6e update-index: use a utility function for stdin consumption
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    update-index: use a utility function for stdin consumption
    +    fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old
    +
    +    Add a new fsyncMethod.batch.quarantine setting which defaults to
    +    "false". Preceding (RFC, and not meant to flip-flop like that
    +    eventually) commits ripped out the "tmp-objdir" part of the
    +    core.fsyncMethod=batch.
    +
    +    This documentation proposes to keep that as the default for the
    +    reasons discussed in it, while allowing users to set
    +    "fsyncMethod.batch.quarantine=true".
    +
    +    Furthermore update the discussion of "core.fsyncObjectFiles" with
    +    information about what it *really* does, why you probably shouldn't
    +    use it, and how to safely emulate most of what it gave users in the
    +    past in terms of performance benefit.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## builtin/update-index.c ##
    -@@ builtin/update-index.c: static enum parse_opt_result reupdate_callback(
    - 	return 0;
    - }
    + ## Documentation/config/core.txt ##
    +@@ Documentation/config/core.txt: stored on NTFS or ReFS filesystems.
    + +
    + The `batch` mode currently only applies to loose-object files and will
    + kick in when using the linkgit:git-unpack-objects[1] and
    +-linkgit:update-index[1] commands. Note that the "last" file to be
    ++linkgit:git-update-index[1] commands. Note that the "last" file to be
    + synced may be the last object, as in the case of
    + linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
    + "ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
    + batch syncing of the loose objects may be deferred until a subsequent
    + fsync() to a file that makes them "active".
      
    -+static void line_from_stdin(struct strbuf *buf, struct strbuf *unquoted,
    -+			    const char *prefix, int prefix_length,
    -+			    const int nul_term_line, const int set_executable_bit)
    -+{
    -+	char *p;
    -+
    -+	if (!nul_term_line && buf->buf[0] == '"') {
    -+		strbuf_reset(unquoted);
    -+		if (unquote_c_style(unquoted, buf->buf, NULL))
    -+			die("line is badly quoted");
    -+		strbuf_swap(buf, unquoted);
    -+	}
    -+	p = prefix_path(prefix, prefix_length, buf->buf);
    -+	update_one(p);
    -+	if (set_executable_bit)
    -+		chmod_path(set_executable_bit, p);
    -+	free(p);
    -+}
    ++fsyncMethod.batch.quarantine::
    ++	A boolean which if set to `true` will cause "batched" writes
    ++	to objects to be "quarantined" if
    ++	`core.fsyncMethod=batch`. This is `false` by default.
    +++
    ++The primary objective of these fsync() settings is to protect against
    ++repository corruption of things which are "reachable", i.e. reachable
    ++via references, the index etc., not merely objects that happen to be
    ++present in the object store.
    +++
    ++Historically setting `core.fsyncObjectFiles=false` assumed that on a
    ++filesystem where an fsync() would flush all preceding outstanding
    ++I/O we might end up with a corrupt loose object, but that was OK
    ++as long as no reference referred to it. We'd eventually remove the
    ++corrupt object with linkgit:git-gc[1], and linkgit:git-fsck[1] would
    ++only report it as a minor annoyance.
    +++
    ++Setting `fsyncMethod.batch.quarantine=true` takes the view that
    ++something like a corrupt *unreferenced* loose object in the object
    ++store is something we'd like to avoid, at the cost of reduced
    ++performance when using `core.fsyncMethod=batch`.
    +++
    ++Currently this uses the same mechanism described in the "QUARANTINE
    ++ENVIRONMENT" in the linkgit:git-receive-pack[1] documentation, but
    ++that's subject to change. The performance loss is because we need to
    ++"stage" the objects in that quarantine environment, fsync() it, and
    ++once that's done rename() or link() it in-place into the main object
    ++store, possibly with an fsync() of the index or ref at the end
    +++
    ++With `fsyncMethod.batch.quarantine=false` we'll "stage" things in the
    ++main object store, and then do one fsync() at the very end, either on
    ++the last object we write, or file (index or ref) that'll make it
    ++"reachable".
    +++
    ++The bad thing about setting this to `true` is lost performance, as
    ++well as not being able to access the objects as they're written (which
    ++e.g. consumers of linkgit:git-update-index[1]'s `--verbose` mode might
    ++want to do).
    +++
    ++The good thing is that you should be guaranteed not to get e.g. short
    ++or otherwise corrupt loose objects if you pull your power cord. In
    ++practice various git commands deal quite badly with discovering such a
    ++stray corrupt object (including perhaps assuming it's valid based on
    ++its existence, or hard dying on an error rather than replacing
    ++it). Repairing such "unreachable corruption" can require manual
    ++intervention.
     +
    - int cmd_update_index(int argc, const char **argv, const char *prefix)
    - {
    - 	int newfd, entries, has_errors = 0, nul_term_line = 0;
    -@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
    - 		struct strbuf unquoted = STRBUF_INIT;
    + core.fsyncObjectFiles::
    +-	This boolean will enable 'fsync()' when writing object files.
    +-	This setting is deprecated. Use core.fsync instead.
    +-+
    +-This setting affects data added to the Git repository in loose-object
    +-form. When set to true, Git will issue an fsync or similar system call
    +-to flush caches so that loose-objects remain consistent in the face
    +-of a unclean system shutdown.
    ++	This boolean will enable 'fsync()' when writing loose object
    ++	files.
    +++
    ++This setting is the historical fsync configuration setting. It's now
    ++*deprecated*, you should use `core.fsync` instead, perhaps in
    ++combination with `core.fsyncMethod=batch`.
    +++
    ++The `core.fsyncObjectFiles` was initially added based on integrity
    ++assumptions that early (pre-ext-4) versions of Linux's "ext"
    ++filesystems provided.
    +++
    ++I.e. that a write of file `A` without an `fsync()` followed by a write
    ++of file `B` with `fsync()` would implicitly guarantee that `A` would
    ++be `fsync()`'d by calling `fsync()` on `B`. This assumption is *not*
    ++backed up by any standard (e.g. POSIX), but worked in practice on some
    ++Linux setups.
    +++
    ++Nowadays you almost certainly want to use
    ++`core.fsync=loose-object` instead in combination with
    ++`core.fsyncMethod=batch`, and possibly with
    ++`fsyncMethod.batch.quarantine=true`, see above. On modern OSes (Linux,
    ++macOS, Windows) that gives you most of the performance benefit of
    ++`core.fsyncObjectFiles=false` with all of the safety of the old
    ++`core.fsyncObjectFiles=true`.
      
    - 		setup_work_tree();
    --		while (getline_fn(&buf, stdin) != EOF) {
    --			char *p;
    --			if (!nul_term_line && buf.buf[0] == '"') {
    --				strbuf_reset(&unquoted);
    --				if (unquote_c_style(&unquoted, buf.buf, NULL))
    --					die("line is badly quoted");
    --				strbuf_swap(&buf, &unquoted);
    --			}
    --			p = prefix_path(prefix, prefix_length, buf.buf);
    --			update_one(p);
    --			if (set_executable_bit)
    --				chmod_path(set_executable_bit, p);
    --			free(p);
    --		}
    -+		while (getline_fn(&buf, stdin) != EOF)
    -+			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
    -+					nul_term_line, set_executable_bit);
    - 		strbuf_release(&unquoted);
    - 		strbuf_release(&buf);
    - 	}
    + core.preloadIndex::
    + 	Enable parallel index preload for operations like 'git diff'
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 20:23                       ` Neeraj Singh
  2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
                                       ` (5 subsequent siblings)
  6 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

In preparation for making the bulk-checkin.c logic operate from
object-file.c itself in some common cases let's add
HASH_N_OBJECTS{,_{FIRST,LAST}} flags.

This will allow us to adjust for-loops that add N objects to just pass
down whether they have >1 objects (HASH_N_OBJECTS), as well as passing
down flags for whether we have the first or last object.

We'll thus be able to drive any sort of batch-object mechanism from
write_object_file_flags() directly, which until now didn't know if it
was doing one object, or some arbitrary N.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/unpack-objects.c | 60 +++++++++++++++++++++++-----------------
 cache.h                  |  3 ++
 2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index c55b6616aed..ec40c6fd966 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -233,7 +233,8 @@ static void write_rest(void)
 }
 
 static void added_object(unsigned nr, enum object_type type,
-			 void *data, unsigned long size);
+			 void *data, unsigned long size,
+			 unsigned oflags);
 
 /*
  * Write out nr-th object from the list, now we know the contents
@@ -241,21 +242,21 @@ static void added_object(unsigned nr, enum object_type type,
  * to be checked at the end.
  */
 static void write_object(unsigned nr, enum object_type type,
-			 void *buf, unsigned long size)
+			 void *buf, unsigned long size, unsigned oflags)
 {
 	if (!strict) {
-		if (write_object_file(buf, size, type,
-				      &obj_list[nr].oid) < 0)
+		if (write_object_file_flags(buf, size, type,
+				      &obj_list[nr].oid, oflags) < 0)
 			die("failed to write object");
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		free(buf);
 		obj_list[nr].obj = NULL;
 	} else if (type == OBJ_BLOB) {
 		struct blob *blob;
-		if (write_object_file(buf, size, type,
-				      &obj_list[nr].oid) < 0)
+		if (write_object_file_flags(buf, size, type,
+					    &obj_list[nr].oid, oflags) < 0)
 			die("failed to write object");
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		free(buf);
 
 		blob = lookup_blob(the_repository, &obj_list[nr].oid);
@@ -269,7 +270,7 @@ static void write_object(unsigned nr, enum object_type type,
 		int eaten;
 		hash_object_file(the_hash_algo, buf, size, type,
 				 &obj_list[nr].oid);
-		added_object(nr, type, buf, size);
+		added_object(nr, type, buf, size, oflags);
 		obj = parse_object_buffer(the_repository, &obj_list[nr].oid,
 					  type, size, buf,
 					  &eaten);
@@ -283,7 +284,7 @@ static void write_object(unsigned nr, enum object_type type,
 
 static void resolve_delta(unsigned nr, enum object_type type,
 			  void *base, unsigned long base_size,
-			  void *delta, unsigned long delta_size)
+			  void *delta, unsigned long delta_size, unsigned oflags)
 {
 	void *result;
 	unsigned long result_size;
@@ -294,7 +295,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
 	if (!result)
 		die("failed to apply delta");
 	free(delta);
-	write_object(nr, type, result, result_size);
+	write_object(nr, type, result, result_size, oflags);
 }
 
 /*
@@ -302,7 +303,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
  * resolve all the deltified objects that are based on it.
  */
 static void added_object(unsigned nr, enum object_type type,
-			 void *data, unsigned long size)
+			 void *data, unsigned long size, unsigned oflags)
 {
 	struct delta_info **p = &delta_list;
 	struct delta_info *info;
@@ -313,7 +314,7 @@ static void added_object(unsigned nr, enum object_type type,
 			*p = info->next;
 			p = &delta_list;
 			resolve_delta(info->nr, type, data, size,
-				      info->delta, info->size);
+				      info->delta, info->size, oflags);
 			free(info);
 			continue;
 		}
@@ -322,18 +323,19 @@ static void added_object(unsigned nr, enum object_type type,
 }
 
 static void unpack_non_delta_entry(enum object_type type, unsigned long size,
-				   unsigned nr)
+				   unsigned nr, unsigned oflags)
 {
 	void *buf = get_data(size);
 
 	if (!dry_run && buf)
-		write_object(nr, type, buf, size);
+		write_object(nr, type, buf, size, oflags);
 	else
 		free(buf);
 }
 
 static int resolve_against_held(unsigned nr, const struct object_id *base,
-				void *delta_data, unsigned long delta_size)
+				void *delta_data, unsigned long delta_size,
+				unsigned oflags)
 {
 	struct object *obj;
 	struct obj_buffer *obj_buffer;
@@ -344,12 +346,12 @@ static int resolve_against_held(unsigned nr, const struct object_id *base,
 	if (!obj_buffer)
 		return 0;
 	resolve_delta(nr, obj->type, obj_buffer->buffer,
-		      obj_buffer->size, delta_data, delta_size);
+		      obj_buffer->size, delta_data, delta_size, oflags);
 	return 1;
 }
 
 static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
-			       unsigned nr)
+			       unsigned nr, unsigned oflags)
 {
 	void *delta_data, *base;
 	unsigned long base_size;
@@ -366,7 +368,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		if (has_object_file(&base_oid))
 			; /* Ok we have this one */
 		else if (resolve_against_held(nr, &base_oid,
-					      delta_data, delta_size))
+					      delta_data, delta_size, oflags))
 			return; /* we are done */
 		else {
 			/* cannot resolve yet --- queue it */
@@ -428,7 +430,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		}
 	}
 
-	if (resolve_against_held(nr, &base_oid, delta_data, delta_size))
+	if (resolve_against_held(nr, &base_oid, delta_data, delta_size, oflags))
 		return;
 
 	base = read_object_file(&base_oid, &type, &base_size);
@@ -440,11 +442,11 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		has_errors = 1;
 		return;
 	}
-	resolve_delta(nr, type, base, base_size, delta_data, delta_size);
+	resolve_delta(nr, type, base, base_size, delta_data, delta_size, oflags);
 	free(base);
 }
 
-static void unpack_one(unsigned nr)
+static void unpack_one(unsigned nr, unsigned oflags)
 {
 	unsigned shift;
 	unsigned char *pack;
@@ -472,11 +474,11 @@ static void unpack_one(unsigned nr)
 	case OBJ_TREE:
 	case OBJ_BLOB:
 	case OBJ_TAG:
-		unpack_non_delta_entry(type, size, nr);
+		unpack_non_delta_entry(type, size, nr, oflags);
 		return;
 	case OBJ_REF_DELTA:
 	case OBJ_OFS_DELTA:
-		unpack_delta_entry(type, size, nr);
+		unpack_delta_entry(type, size, nr, oflags);
 		return;
 	default:
 		error("bad object type %d", type);
@@ -491,6 +493,7 @@ static void unpack_all(void)
 {
 	int i;
 	struct pack_header *hdr = fill(sizeof(struct pack_header));
+	unsigned oflags;
 
 	nr_objects = ntohl(hdr->hdr_entries);
 
@@ -505,9 +508,14 @@ static void unpack_all(void)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
 	plug_bulk_checkin();
+	oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
 	for (i = 0; i < nr_objects; i++) {
-		unpack_one(i);
-		display_progress(progress, i + 1);
+		int nth = i + 1;
+		unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
+			nr_objects == nth ? HASH_N_OBJECTS_LAST : 0;
+
+		unpack_one(i, oflags | f);
+		display_progress(progress, nth);
 	}
 	unplug_bulk_checkin();
 	stop_progress(&progress);
diff --git a/cache.h b/cache.h
index 84fafe2ed71..72c91c91286 100644
--- a/cache.h
+++ b/cache.h
@@ -896,6 +896,9 @@ int ie_modified(struct index_state *, const struct cache_entry *, struct stat *,
 #define HASH_FORMAT_CHECK 2
 #define HASH_RENORMALIZE  4
 #define HASH_SILENT 8
+#define HASH_N_OBJECTS (1 << 4)
+#define HASH_N_OBJECTS_FIRST (1 << 5)
+#define HASH_N_OBJECTS_LAST (1 << 6)
 int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
 int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
 
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 20:25                       ` Neeraj Singh
  2022-03-23 14:18                     ` [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument Ævar Arnfjörð Bjarmason
                                       ` (4 subsequent siblings)
  6 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Remove much of this as a POC for exploring some of what I mentioned in
https://lore.kernel.org/git/220322.86mthinxnn.gmgdl@evledraar.gmail.com/

This commit is obviously not what we *should* do as end-state, but
demonstrates what's needed (I think) for a bare-minimum implementation
of just the "bulk" syncing method for loose objects without the part
where we do the tmp-objdir.c dance.

Performance with this is already quite promising. Benchmarking with:

	git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' \
	    	-p 'rm -rf r.git && git init --bare r.git' \
		'./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack'

I.e. unpacking a small packfile (my dotfiles) yields, on a Linux
ramdisk:

	Benchmark 1: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync
	  Time (mean ± σ):     815.9 ms ±   8.2 ms    [User: 522.9 ms, System: 287.9 ms]
	  Range (min … max):   805.6 ms … 835.9 ms    10 runs

	Benchmark 2: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD
	  Time (mean ± σ):     779.4 ms ±  15.4 ms    [User: 505.7 ms, System: 270.2 ms]
	  Range (min … max):   763.1 ms … 813.9 ms    10 runs

	Summary
	  './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
	    1.05 ± 0.02 times faster than './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'

Doing the same with "strace --summary-only", which probably helps to
emulate cases with slower syscalls, shows that HEAD is ~15% faster
than using the tmp-objdir indirection:

	Summary
	  'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
	    1.16 ± 0.01 times faster than 'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'

Which makes sense in terms of syscalls. In my case HEAD makes ~101k
calls, while the parent topic makes ~129k calls, with around 2x the
number of unlink() and link() calls, as expected.

Of course some users will want to use the tmp-objdir.c method. So a
version of this commit could be rewritten to come earlier in the
series, with the "bulk" on top being optional.

It seems to me that it's a much better strategy to do this whole thing
in close_loose_object() after passing down the new HASH_N_OBJECTS /
HASH_N_OBJECTS_FIRST / HASH_N_OBJECTS_LAST flags.

Doing that for the "builtin/add.c" and "builtin/unpack-objects.c" code
having its {un,}plug_bulk_checkin() removed here is then just a matter
of passing down a similar set of flags indicating whether we're
dealing with N objects, and if so if we're dealing with the last one
or not.

As we'll see in subsequent commits, doing it this way also effortlessly
integrates with other HASH_* flags. E.g. for "update-index" the code
being rm'd here doesn't handle the interaction with
"HASH_WRITE_OBJECT" properly, but once we've moved all this sync
bootstrapping logic to close_loose_object() we'll never get to it if
we're not actually writing something.

This code currently doesn't use the HASH_N_OBJECTS_FIRST flag, but
that's what we'd use later to optionally call tmp_objdir_create().

Aside: This also changes logic that was a bit confusing and repetitive
in close_loose_object(). Previously we'd first call
batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT), which is just a
shorthand for:

	fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT &&
	fsync_method == FSYNC_METHOD_BATCH

We'd then proceed to call
fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT) later in the same
function, which is just a way of calling fsync_or_die() if:

	fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT

Now we instead just define a local "fsync_loose" variable by checking
"fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT", which shows us that
the previous call to fsync_component_or_die(...) could just be added
to the existing "fsync_object_files > 0" branch.

Note: This commit reverts much of "core.fsyncmethod: batched disk
flushes for loose-objects". We'll set up new structures to bring back
what it was doing in a different way, i.e. by doing the tmp-objdir
plugging in object-file.c.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/unpack-objects.c |  2 --
 builtin/update-index.c   |  4 ---
 bulk-checkin.c           | 74 ----------------------------------------
 bulk-checkin.h           |  3 --
 cache.h                  |  5 ---
 object-file.c            | 37 ++++++++++++++------
 6 files changed, 26 insertions(+), 99 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index ec40c6fd966..93da436581b 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -507,7 +507,6 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
-	plug_bulk_checkin();
 	oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
 	for (i = 0; i < nr_objects; i++) {
 		int nth = i + 1;
@@ -517,7 +516,6 @@ static void unpack_all(void)
 		unpack_one(i, oflags | f);
 		display_progress(progress, nth);
 	}
-	unplug_bulk_checkin();
 	stop_progress(&progress);
 
 	if (delta_list)
diff --git a/builtin/update-index.c b/builtin/update-index.c
index cbd2b0d633b..95ed3c47b2e 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1118,8 +1118,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	parse_options_start(&ctx, argc, argv, prefix,
 			    options, PARSE_OPT_STOP_AT_NON_OPTION);
 
-	/* optimize adding many objects to the object database */
-	plug_bulk_checkin();
 	while (ctx.argc) {
 		if (parseopt_state != PARSE_OPT_DONE)
 			parseopt_state = parse_options_step(&ctx, options,
@@ -1194,8 +1192,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
-	/* by now we must have added all of the new objects */
-	unplug_bulk_checkin();
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
diff --git a/bulk-checkin.c b/bulk-checkin.c
index a0dca79ba6a..577b135e39c 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,20 +3,15 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
-#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
-#include "string-list.h"
-#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
 
-static struct tmp_objdir *bulk_fsync_objdir;
-
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
@@ -85,40 +80,6 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
 	reprepare_packed_git(the_repository);
 }
 
-/*
- * Cleanup after batch-mode fsync_object_files.
- */
-static void do_batch_fsync(void)
-{
-	struct strbuf temp_path = STRBUF_INIT;
-	struct tempfile *temp;
-
-	if (!bulk_fsync_objdir)
-		return;
-
-	/*
-	 * Issue a full hardware flush against a temporary file to ensure
-	 * that all objects are durable before any renames occur. The code in
-	 * fsync_loose_object_bulk_checkin has already issued a writeout
-	 * request, but it has not flushed any writeback cache in the storage
-	 * hardware or any filesystem logs. This fsync call acts as a barrier
-	 * to ensure that the data in each new object file is durable before
-	 * the final name is visible.
-	 */
-	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
-	temp = xmks_tempfile(temp_path.buf);
-	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
-	delete_tempfile(&temp);
-	strbuf_release(&temp_path);
-
-	/*
-	 * Make the object files visible in the primary ODB after their data is
-	 * fully durable.
-	 */
-	tmp_objdir_migrate(bulk_fsync_objdir);
-	bulk_fsync_objdir = NULL;
-}
-
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -313,26 +274,6 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
-void prepare_loose_object_bulk_checkin(void)
-{
-	if (bulk_checkin_plugged && !bulk_fsync_objdir)
-		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
-}
-
-void fsync_loose_object_bulk_checkin(int fd, const char *filename)
-{
-	/*
-	 * If we have a plugged bulk checkin, we issue a call that
-	 * cleans the filesystem page cache but avoids a hardware flush
-	 * command. Later on we will issue a single hardware flush
-	 * before as part of do_batch_fsync.
-	 */
-	if (!bulk_fsync_objdir ||
-	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
-		fsync_or_die(fd, filename);
-	}
-}
-
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -347,19 +288,6 @@ int index_bulk_checkin(struct object_id *oid,
 void plug_bulk_checkin(void)
 {
 	assert(!bulk_checkin_plugged);
-
-	/*
-	 * A temporary object directory is used to hold the files
-	 * while they are not fsynced.
-	 */
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
-		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
-		if (!bulk_fsync_objdir)
-			die(_("Could not create temporary object directory for core.fsyncMethod=batch"));
-
-		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
-	}
-
 	bulk_checkin_plugged = 1;
 }
 
@@ -369,6 +297,4 @@ void unplug_bulk_checkin(void)
 	bulk_checkin_plugged = 0;
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
-
-	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index 181d3447ff9..b26f3dc3b74 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,9 +6,6 @@
 
 #include "cache.h"
 
-void prepare_loose_object_bulk_checkin(void);
-void fsync_loose_object_bulk_checkin(int fd, const char *filename);
-
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index 72c91c91286..2f3831fa853 100644
--- a/cache.h
+++ b/cache.h
@@ -1772,11 +1772,6 @@ void fsync_or_die(int fd, const char *);
 int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
-static inline int batch_fsync_enabled(enum fsync_component component)
-{
-	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
-}
-
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/object-file.c b/object-file.c
index cd0ddb49e4b..dbeb3df502d 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1886,19 +1886,37 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
 	hash_object_file_literally(algo, buf, len, type_name(type), oid);
 }
 
+static void sync_loose_object_batch(int fd, const char *filename,
+				    const unsigned oflags)
+{
+	const int last = oflags & HASH_N_OBJECTS_LAST;
+
+	/*
+	 * We're doing a sync_file_range() (or equivalent) for 1..N-1
+	 * objects, and then a "real" fsync() for N. On some OS's
+	 * enabling core.fsync=loose-object && core.fsyncMethod=batch
+	 * improves the performance by a lot.
+	 */
+	if (last || git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0)
+		fsync_or_die(fd, filename);
+}
+
 /* Finalize a file on disk, and close it. */
-static void close_loose_object(int fd, const char *filename)
+static void close_loose_object(int fd, const char *filename,
+			       const unsigned oflags)
 {
+	int fsync_loose;
+
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		fsync_loose_object_bulk_checkin(fd, filename);
-	else if (fsync_object_files > 0)
+	fsync_loose = fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT;
+
+	if (oflags & HASH_N_OBJECTS && fsync_loose &&
+	    fsync_method == FSYNC_METHOD_BATCH)
+		sync_loose_object_batch(fd, filename, oflags);
+	else if (fsync_object_files > 0 || fsync_loose)
 		fsync_or_die(fd, filename);
-	else
-		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
-				       filename);
 
 out:
 	if (close(fd) != 0)
@@ -1962,9 +1980,6 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	static struct strbuf filename = STRBUF_INIT;
 
-	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
-		prepare_loose_object_bulk_checkin();
-
 	loose_object_path(the_repository, &filename, oid);
 
 	fd = create_tmpfile(&tmp_file, filename.buf);
@@ -2015,7 +2030,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd, tmp_file.buf);
+	close_loose_object(fd, tmp_file.buf, flags);
 
 	if (mtime) {
 		struct utimbuf utb;
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
                                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

As with a preceding change to "unpack-objects", add an "oflags"
argument going from cmd_update_index() all the way down to the code in
object-file.c. Note also how index_mem() will now call
write_object_file_flags().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c | 32 ++++++++++++++++++--------------
 object-file.c          |  2 +-
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 95ed3c47b2e..34aaaa16c20 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -267,10 +267,12 @@ static int process_lstat_error(const char *path, int err)
 	return error("lstat(\"%s\"): %s", path, strerror(err));
 }
 
-static int add_one_path(const struct cache_entry *old, const char *path, int len, struct stat *st)
+static int add_one_path(const struct cache_entry *old, const char *path,
+			int len, struct stat *st, const unsigned oflags)
 {
 	int option;
 	struct cache_entry *ce;
+	unsigned f;
 
 	/* Was the old index entry already up-to-date? */
 	if (old && !ce_stage(old) && !ce_match_stat(old, st, 0))
@@ -283,8 +285,8 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
 	fill_stat_cache_info(&the_index, ce, st);
 	ce->ce_mode = ce_mode_from_stat(old, st->st_mode);
 
-	if (index_path(&the_index, &ce->oid, path, st,
-		       info_only ? 0 : HASH_WRITE_OBJECT)) {
+	f = oflags | (info_only ? 0 : HASH_WRITE_OBJECT);
+	if (index_path(&the_index, &ce->oid, path, st, f)) {
 		discard_cache_entry(ce);
 		return -1;
 	}
@@ -320,7 +322,8 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
  *  - it doesn't exist at all in the index, but it is a valid
  *    git directory, and it should be *added* as a gitlink.
  */
-static int process_directory(const char *path, int len, struct stat *st)
+static int process_directory(const char *path, int len, struct stat *st,
+			     const unsigned oflags)
 {
 	struct object_id oid;
 	int pos = cache_name_pos(path, len);
@@ -334,7 +337,7 @@ static int process_directory(const char *path, int len, struct stat *st)
 			if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
 				return 0;
 
-			return add_one_path(ce, path, len, st);
+			return add_one_path(ce, path, len, st, oflags);
 		}
 		/* Should this be an unconditional error? */
 		return remove_one_path(path);
@@ -358,13 +361,14 @@ static int process_directory(const char *path, int len, struct stat *st)
 
 	/* No match - should we add it as a gitlink? */
 	if (!resolve_gitlink_ref(path, "HEAD", &oid))
-		return add_one_path(NULL, path, len, st);
+		return add_one_path(NULL, path, len, st, oflags);
 
 	/* Error out. */
 	return error("%s: is a directory - add files inside instead", path);
 }
 
-static int process_path(const char *path, struct stat *st, int stat_errno)
+static int process_path(const char *path, struct stat *st, int stat_errno,
+			const unsigned oflags)
 {
 	int pos, len;
 	const struct cache_entry *ce;
@@ -395,9 +399,9 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 		return process_lstat_error(path, stat_errno);
 
 	if (S_ISDIR(st->st_mode))
-		return process_directory(path, len, st);
+		return process_directory(path, len, st, oflags);
 
-	return add_one_path(ce, path, len, st);
+	return add_one_path(ce, path, len, st, oflags);
 }
 
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
@@ -446,7 +450,7 @@ static void chmod_path(char flip, const char *path)
 	die("git update-index: cannot chmod %cx '%s'", flip, path);
 }
 
-static void update_one(const char *path)
+static void update_one(const char *path, const unsigned oflags)
 {
 	int stat_errno = 0;
 	struct stat st;
@@ -485,7 +489,7 @@ static void update_one(const char *path)
 		report("remove '%s'", path);
 		return;
 	}
-	if (process_path(path, &st, stat_errno))
+	if (process_path(path, &st, stat_errno, oflags))
 		die("Unable to process path %s", path);
 	report("add '%s'", path);
 }
@@ -776,7 +780,7 @@ static int do_reupdate(int ac, const char **av,
 		 */
 		save_nr = active_nr;
 		path = xstrdup(ce->name);
-		update_one(path);
+		update_one(path, 0);
 		free(path);
 		discard_cache_entry(old);
 		if (save_nr != active_nr)
@@ -1138,7 +1142,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 			setup_work_tree();
 			p = prefix_path(prefix, prefix_length, path);
-			update_one(p);
+			update_one(p, 0);
 			if (set_executable_bit)
 				chmod_path(set_executable_bit, p);
 			free(p);
@@ -1183,7 +1187,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				strbuf_swap(&buf, &unquoted);
 			}
 			p = prefix_path(prefix, prefix_length, buf.buf);
-			update_one(p);
+			update_one(p, 0);
 			if (set_executable_bit)
 				chmod_path(set_executable_bit, p);
 			free(p);
diff --git a/object-file.c b/object-file.c
index dbeb3df502d..8999fce2b15 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2211,7 +2211,7 @@ static int index_mem(struct index_state *istate,
 	}
 
 	if (write_object)
-		ret = write_object_file(buf, size, type, oid);
+		ret = write_object_file_flags(buf, size, type, oid, flags);
 	else
 		hash_object_file(the_hash_algo, buf, size, type, oid);
 	if (re_allocated)
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                       ` (2 preceding siblings ...)
  2022-03-23 14:18                     ` [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 20:30                       ` Neeraj Singh
  2022-03-23 14:18                     ` [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync() Ævar Arnfjörð Bjarmason
                                       ` (2 subsequent siblings)
  6 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

As with unpack-objects in a preceding commit, have update-index.c make
use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
mode again for "update-index".

Adding the t/* directory from git.git on a Linux ramdisk is a bit
faster than with the tmp-objdir indirection:

	$ git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' -p 'rm -rf repo/.git/objects/* repo/.git/index' './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' --warmup 1 -r 10

	Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync
	  Time (mean ± σ):     281.1 ms ±   2.6 ms    [User: 186.2 ms, System: 92.3 ms]
	  Range (min … max):   278.3 ms … 287.0 ms    10 runs

	Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD
	  Time (mean ± σ):     265.9 ms ±   2.6 ms    [User: 181.7 ms, System: 82.1 ms]
	  Range (min … max):   262.0 ms … 270.3 ms    10 runs

	Summary
	  './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
	    1.06 ± 0.01 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'

And as before, running that with "strace --summary-only" slows things
down somewhat (probably mimicking slower I/O). I then get:

	Summary
	  'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
	    1.19 ± 0.03 times faster than 'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'

This one has a twist, though: instead of fsync()-ing on the last
object we write, we'll defer the fsync() until we write the index
itself. This is outlined in [1] (as "METHOD THREE").

Because of this, under FSYNC_METHOD_BATCH we'll flag the N
objects (possibly only one, because we're lazy) with HASH_N_OBJECTS,
and we now even support doing this via N arguments on the
command-line.

Then we won't fsync() any of it, but we will rename it
in-place (which, if we were still using the tmp-objdir, would leave it
"staged" in the tmp-objdir).

We'll then have the fsync() for the index update "flush" that out, and
thus avoid two fsync() calls when one will do.

Running this with the "git hyperfine" command mentioned in a preceding
commit with "strace --summary-only" shows that we do 1 fsync() now
instead of 2, and have one more sync_file_range(), as expected.

We also go from ~51k syscalls to ~39k, with ~2x the number of link()
and unlink() in ns/batched-fsync, and of course one fsync() instead of
two.

The flow of this code isn't quite set up for re-plugging the
tmp-objdir back in. In particular we no longer pass
HASH_N_OBJECTS_FIRST (but doing so would be trivial), and there's no
HASH_N_OBJECTS_LAST.

So this and other callers would need some light transaction-y API, or
to otherwise pass a "yes, I'd like to flush it" flag down to
finalize_hashfile(), but doing so will be trivial.

And since we've started structuring it this way it'll become easy to
do any arbitrary number of things down the line that would "bulk
fsync" before the final fsync(). Now we write some objects and fsync()
on the index, but between those two could do any number of other
things where we'd defer the fsync().

This sort of thing might be especially interesting for "git repack"
when it writes e.g. a *.bitmap, *.rev, *.pack and *.idx. In that case
we could skip the fsync() on all of those, and only do it on the *.idx
before we renamed it in-place. I *think* nothing cares about a *.pack
without an *.idx, but even then we could fsync *.idx, rename *.pack,
rename *.idx and still safely do only one fsync(). See "git show
--first-parent" on 62874602032 (Merge branch
'tb/pack-finalize-ordering' into maint, 2021-10-12) for a good
overview of the code involved in that.

1. https://lore.kernel.org/git/220323.86sfr9ndpr.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/update-index.c |  7 ++++---
 cache.h                |  1 +
 read-cache.c           | 29 ++++++++++++++++++++++++++++-
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 34aaaa16c20..6cfec6efb38 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1142,7 +1142,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 			setup_work_tree();
 			p = prefix_path(prefix, prefix_length, path);
-			update_one(p, 0);
+			update_one(p, HASH_N_OBJECTS);
 			if (set_executable_bit)
 				chmod_path(set_executable_bit, p);
 			free(p);
@@ -1187,7 +1187,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				strbuf_swap(&buf, &unquoted);
 			}
 			p = prefix_path(prefix, prefix_length, buf.buf);
-			update_one(p, 0);
+			update_one(p, HASH_N_OBJECTS);
 			if (set_executable_bit)
 				chmod_path(set_executable_bit, p);
 			free(p);
@@ -1263,7 +1263,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				exit(128);
 			unable_to_lock_die(get_index_file(), lock_error);
 		}
-		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
+		if (write_locked_index(&the_index, &lock_file,
+				       COMMIT_LOCK | WLI_NEED_LOOSE_FSYNC))
 			die("Unable to write new index file");
 	}
 
diff --git a/cache.h b/cache.h
index 2f3831fa853..7542e009a34 100644
--- a/cache.h
+++ b/cache.h
@@ -751,6 +751,7 @@ void ensure_full_index(struct index_state *istate);
 /* For use with `write_locked_index()`. */
 #define COMMIT_LOCK		(1 << 0)
 #define SKIP_IF_UNCHANGED	(1 << 1)
+#define WLI_NEED_LOOSE_FSYNC	(1 << 2)
 
 /*
  * Write the index while holding an already-taken lock. Close the lock,
diff --git a/read-cache.c b/read-cache.c
index 3e0e7d41837..275f6308c32 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2860,6 +2860,33 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	int ieot_entries = 1;
 	struct index_entry_offset_table *ieot = NULL;
 	int nr, nr_threads;
+	unsigned int wflags = FSYNC_COMPONENT_INDEX;
+
+
+	/*
+	 * TODO: This is abuse of the recently modified
+	 * finalize_hashfile() API, which reveals a shortcoming of its
+	 * "fsync" design.
+	 *
+	 * I.e. it expects an "enum fsync_component component" label,
+	 * but here we're passing it an OR of the two, knowing that
+	 * it'll call fsync_component_or_die() which (in
+	 * write-or-die.c) will do "(fsync_components & wflags)" (to
+	 * our "wflags" here).
+	 *
+	 * But the API really should be changed to explicitly take
+	 * such flags, because in this case we'd like to fsync() the
+	 * index if we're in the bulk mode, *even if* our
+	 * "core.fsync=index" isn't configured.
+	 *
+	 * That's because at this point we've been queuing up object
+	 * writes that we didn't fsync(), and are going to use this
+	 * fsync() to "flush" the whole thing. Doing it this way
+	 * avoids redundantly calling fsync() twice when once will do.
+	 */
+	if (fsync_method == FSYNC_METHOD_BATCH &&
+	    flags & WLI_NEED_LOOSE_FSYNC)
+		wflags |= FSYNC_COMPONENT_LOOSE_OBJECT;
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
@@ -3094,7 +3121,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	if (!alternate_index_output && (flags & COMMIT_LOCK))
 		csum_fsync_flag = CSUM_FSYNC;
 
-	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
+	finalize_hashfile(f, istate->oid.hash, wflags,
 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
 
 	if (close_tempfile_gently(tempfile)) {
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync()
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                       ` (3 preceding siblings ...)
  2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

We can now bring "bulk" syncing back to "git add" using the mechanism
discussed in the preceding commit, where we fsync() on the index, not
on the last object we write.

On a ramdisk:

	$ git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/' -p 'rm -rf repo/.git/objects/* repo/.git/index' './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' --warmup 1
	Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'ns/batched-fsync
	  Time (mean ± σ):     299.5 ms ±   1.6 ms    [User: 193.4 ms, System: 103.7 ms]
	  Range (min … max):   296.6 ms … 301.6 ms    10 runs

	Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'HEAD
	  Time (mean ± σ):     282.8 ms ±   2.1 ms    [User: 193.8 ms, System: 86.6 ms]
	  Range (min … max):   279.1 ms … 285.6 ms    10 runs

	Summary
	  './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'HEAD' ran
	    1.06 ± 0.01 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'ns/batched-fsync'

My times on my spinning disk are too fuzzy to quote with confidence,
but I have seen it go as much as 15-30% faster. FWIW doing "strace
--summary-only" on the ramdisk is ~20% faster:

	$ git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/' -p 'rm -rf repo/.git/objects/* repo/.git/index' 'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' --warmup 1
	Benchmark 1: strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'ns/batched-fsync
	  Time (mean ± σ):     917.4 ms ±  18.8 ms    [User: 388.7 ms, System: 672.1 ms]
	  Range (min … max):   885.3 ms … 948.1 ms    10 runs

	Benchmark 2: strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'HEAD
	  Time (mean ± σ):     769.0 ms ±   9.2 ms    [User: 358.2 ms, System: 521.2 ms]
	  Range (min … max):   760.7 ms … 792.6 ms    10 runs

	Summary
	  'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'HEAD' ran
	    1.19 ± 0.03 times faster than 'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'ns/batched-fsync'

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/add.c | 6 ++++--
 cache.h       | 1 +
 read-cache.c  | 8 ++++++++
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 3ffb86a4338..6ef18b6246c 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -580,7 +580,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		 (intent_to_add ? ADD_CACHE_INTENT : 0) |
 		 (ignore_add_errors ? ADD_CACHE_IGNORE_ERRORS : 0) |
 		 (!(addremove || take_worktree_changes)
-		  ? ADD_CACHE_IGNORE_REMOVAL : 0));
+		  ? ADD_CACHE_IGNORE_REMOVAL : 0)) |
+		ADD_CACHE_HASH_N_OBJECTS;
 
 	if (read_cache_preload(&pathspec) < 0)
 		die(_("index file corrupt"));
@@ -686,7 +687,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 finish:
 	if (write_locked_index(&the_index, &lock_file,
-			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
+			       COMMIT_LOCK | SKIP_IF_UNCHANGED |
+			       WLI_NEED_LOOSE_FSYNC))
 		die(_("Unable to write new index file"));
 
 	dir_clear(&dir);
diff --git a/cache.h b/cache.h
index 7542e009a34..d57af938cbc 100644
--- a/cache.h
+++ b/cache.h
@@ -857,6 +857,7 @@ int remove_file_from_index(struct index_state *, const char *path);
 #define ADD_CACHE_IGNORE_ERRORS	4
 #define ADD_CACHE_IGNORE_REMOVAL 8
 #define ADD_CACHE_INTENT 16
+#define ADD_CACHE_HASH_N_OBJECTS 32
 /*
  * These two are used to add the contents of the file at path
  * to the index, marking the working tree up-to-date by storing
diff --git a/read-cache.c b/read-cache.c
index 275f6308c32..788423b6dde 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -755,6 +755,14 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	unsigned hash_flags = pretend ? 0 : HASH_WRITE_OBJECT;
 	struct object_id oid;
 
+	/*
+	 * TODO: Can't we also set HASH_N_OBJECTS_FIRST as a function
+	 * of !(ce->ce_flags & CE_ADDED) or something? I'm not too
+	 * familiar with the cache API...
+	 */
+	if (flags & ADD_CACHE_HASH_N_OBJECTS)
+		hash_flags |= HASH_N_OBJECTS;
+
 	if (flags & ADD_CACHE_RENORMALIZE)
 		hash_flags |= HASH_RENORMALIZE;
 
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                       ` (4 preceding siblings ...)
  2022-03-23 14:18                     ` [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync() Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/core.txt | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index cf0e9b8b088..f598925b597 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -596,12 +596,23 @@ core.fsyncMethod::
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
 * `batch` enables a mode that uses writeout-only flushes to stage multiple
-  updates in the disk writeback cache and then does a single full fsync of
-  a dummy file to trigger the disk cache flush at the end of the operation.
-  Currently `batch` mode only applies to loose-object files. Other repository
-  data is made durable as if `fsync` was specified. This mode is expected to
-  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
-  and on Windows for repos stored on NTFS or ReFS filesystems.
+  updates in the disk writeback cache and then does a full fsync() on
+  the "last" file to trigger the disk cache flush at the end of the
+  operation.
++
+Other repository data is made durable as if `fsync` was
+specified. This mode is expected to be as safe as `fsync` on macOS for
+repos stored on HFS+ or APFS filesystems and on Windows for repos
+stored on NTFS or ReFS filesystems.
++
+The `batch` mode currently only applies to loose-object files and will
+kick in when using the linkgit:git-unpack-objects[1] and
+linkgit:update-index[1] commands. Note that the "last" file to be
+synced may be the last object, as in the case of
+linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
+"ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
+batch syncing of the loose objects may be deferred until a subsequent
+fsync() to a file that makes them "active".
 
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old
  2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
                                       ` (5 preceding siblings ...)
  2022-03-23 14:18                     ` [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics Ævar Arnfjörð Bjarmason
@ 2022-03-23 14:18                     ` Ævar Arnfjörð Bjarmason
  2022-03-23 21:08                       ` Neeraj Singh
  6 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-23 14:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Neeraj Singh, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh,
	Ævar Arnfjörð Bjarmason

Add a new fsyncMethod.batch.quarantine setting which defaults to
"false". Preceding commits (RFC-only; the end state isn't meant to
flip-flop like this) ripped out the "tmp-objdir" part of
core.fsyncMethod=batch.

This documentation proposes to keep that as the default for the
reasons discussed in it, while allowing users to set
"fsyncMethod.batch.quarantine=true".

Furthermore update the discussion of "core.fsyncObjectFiles" with
information about what it *really* does, why you probably shouldn't
use it, and how to safely emulate most of what it gave users in the
past in terms of performance benefit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/core.txt | 80 +++++++++++++++++++++++++++++++----
 1 file changed, 72 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index f598925b597..365a12dc7ae 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -607,21 +607,85 @@ stored on NTFS or ReFS filesystems.
 +
 The `batch` mode currently only applies to loose-object files and will
 kick in when using the linkgit:git-unpack-objects[1] and
-linkgit:update-index[1] commands. Note that the "last" file to be
+linkgit:git-update-index[1] commands. Note that the "last" file to be
 synced may be the last object, as in the case of
 linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
 "ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
 batch syncing of the loose objects may be deferred until a subsequent
 fsync() to a file that makes them "active".
 
+fsyncMethod.batch.quarantine::
+	A boolean which if set to `true` will cause "batched" writes
+	to objects to be "quarantined" if
+	`core.fsyncMethod=batch`. This is `false` by default.
++
+The primary object of these fsync() settings is to protect against
+repository corruption of things which are "reachable", i.e. reachable
+via references, the index etc., not merely objects that were present
+in the object store.
++
+Historically setting `core.fsyncObjectFiles=false` assumed that on a
+filesystem where an fsync() would flush all preceding outstanding
+I/O we might end up with a corrupt loose object, but that was OK
+as long as no reference referred to it. We'd eventually remove the
+corrupt object with linkgit:git-gc[1], and linkgit:git-fsck[1] would
+only report it as a minor annoyance.
++
+Setting `fsyncMethod.batch.quarantine=true` takes the view that
+something like a corrupt *unreferenced* loose object in the object
+store is something we'd like to avoid, at the cost of reduced
+performance when using `core.fsyncMethod=batch`.
++
+Currently this uses the same mechanism described in the "QUARANTINE
+ENVIRONMENT" section of the linkgit:git-receive-pack[1] documentation,
+but that's subject to change. The performance loss is because we need
+to "stage" the objects in that quarantine environment, fsync() it, and
+once that's done rename() or link() it in-place into the main object
+store, possibly with an fsync() of the index or ref at the end.
++
+With `fsyncMethod.batch.quarantine=false` we'll "stage" things in the
+main object store, and then do one fsync() at the very end, either on
+the last object we write, or file (index or ref) that'll make it
+"reachable".
++
+The bad thing about setting this to `true` is lost performance, as
+well as not being able to access the objects as they're written (which
+e.g. consumers of linkgit:git-update-index[1]'s `--verbose` mode might
+want to do).
++
+The good thing is that you should be guaranteed not to get e.g. short
+or otherwise corrupt loose objects if you pull your power cord. In
+practice various git commands deal quite badly with discovering such a
+stray corrupt object (including perhaps assuming it's valid based on
+its existence, or hard dying on an error rather than replacing
+it). Repairing such "unreachable corruption" can require manual
+intervention.
+
 core.fsyncObjectFiles::
-	This boolean will enable 'fsync()' when writing object files.
-	This setting is deprecated. Use core.fsync instead.
-+
-This setting affects data added to the Git repository in loose-object
-form. When set to true, Git will issue an fsync or similar system call
-to flush caches so that loose-objects remain consistent in the face
-of a unclean system shutdown.
+	This boolean will enable 'fsync()' when writing loose object
+	files.
++
+This setting is the historical fsync configuration setting. It's now
+*deprecated*, you should use `core.fsync` instead, perhaps in
+combination with `core.fsyncMethod=batch`.
++
+The `core.fsyncObjectFiles` was initially added based on integrity
+assumptions that early (pre-ext-4) versions of Linux's "ext"
+filesystems provided.
++
+I.e. that a write of file `A` without an `fsync()` followed by a write
+of file `B` with `fsync()` would implicitly guarantee that `A` would
+be `fsync()`'d by calling `fsync()` on `B`. This assumption is *not*
+backed up by any standard (e.g. POSIX), but worked in practice on some
+Linux setups.
++
+Nowadays you almost certainly want to use
+`core.fsync=loose-object` instead in combination with
+`core.fsyncMethod=batch`, and possibly with
+`fsyncMethod.batch.quarantine=true`, see above. On modern OS's (Linux,
+OSX, Windows) that gives you most of the performance benefit of
+`core.fsyncObjectFiles=false` with all of the safety of the old
+`core.fsyncObjectFiles=true`.
 
 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'
-- 
2.35.1.1428.g1c1a0152d61


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
@ 2022-03-23 20:19                         ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23 20:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

I'm going to respond in more detail to your individual patches,
(expect the last mail to contain a comment at the end "LAST MAIL").

On Wed, Mar 23, 2022 at 3:52 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Mar 22 2022, Neeraj Singh wrote:
>
> > On Tue, Mar 22, 2022 at 8:48 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> As with unpack-objects in a preceding commit have update-index.c make
> >> use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
> >> mode again for "update-index".
> >>
> >> Adding the t/* directory from git.git on a Linux ramdisk is a bit
> >> faster than with the tmp-objdir indirection:
> >>
> >>         git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' -p 'rm -rf repo && git init repo && cp -R t repo/' 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' --warmup 1 -r 10
> >>         Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
> >>           Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
> >>           Range (min … max):   285.6 ms … 297.0 ms    10 runs
> >>
> >>         Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
> >>           Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
> >>           Range (min … max):   267.8 ms … 291.3 ms    10 runs
> >>
> >>         Summary
> >>           'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
> >>             1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
> >>
> >> And as before running that with "strace --summary-only" slows things
> >> down a bit (probably mimicking slower I/O a bit). I then get:
> >>
> >>         Summary
> >>           'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
> >>             1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'
> >>
> >> We also go from ~51k syscalls to ~39k, with ~2x the number of link()
> >> and unlink() in ns/batched-fsync.
> >>
> >> In the process of doing this conversion we lost the "bulk" mode for
> >> files added on the command-line. I don't think it's useful to optimize
> >> that, but we could if anyone cared.
> >>
> >> We've also converted this to a string_list, we could walk with
> >> getline_fn() and get one line "ahead" to see what we have left, but I
> >> found that state machine a bit painful, and at least in my testing
> >> buffering this doesn't harm things. But we could also change this to
> >> stream again, at the cost of some getline_fn() twiddling.
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >> ---
> >>  builtin/update-index.c | 31 +++++++++++++++++++++++++++----
> >>  1 file changed, 27 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/builtin/update-index.c b/builtin/update-index.c
> >> index af02ff39756..c7cbfe1123b 100644
> >> --- a/builtin/update-index.c
> >> +++ b/builtin/update-index.c
> >> @@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >>         }
> >>
> >>         if (read_from_stdin) {
> >> +               struct string_list list = STRING_LIST_INIT_NODUP;
> >>                 struct strbuf line = STRBUF_INIT;
> >>                 struct strbuf unquoted = STRBUF_INIT;
> >> +               size_t i, nr;
> >> +               unsigned oflags;
> >>
> >>                 setup_work_tree();
> >> -               while (getline_fn(&line, stdin) != EOF)
> >> -                       line_from_stdin(&line, &unquoted, prefix, prefix_length,
> >> -                                       nul_term_line, set_executable_bit, 0);
> >> +               while (getline_fn(&line, stdin) != EOF) {
> >> +                       size_t len = line.len;
> >> +                       char *str = strbuf_detach(&line, NULL);
> >> +
> >> +                       string_list_append_nodup(&list, str)->util = (void *)len;
> >> +               }
> >> +
> >> +               nr = list.nr;
> >> +               oflags = nr > 1 ? HASH_N_OBJECTS : 0;
> >> +               for (i = 0; i < nr; i++) {
> >> +                       size_t nth = i + 1;
> >> +                       unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
> >> +                                 nr == nth ? HASH_N_OBJECTS_LAST : 0;
> >> +                       struct strbuf buf = STRBUF_INIT;
> >> +                       struct string_list_item *item = list.items + i;
> >> +                       const size_t len = (size_t)item->util;
> >> +
> >> +                       strbuf_attach(&buf, item->string, len, len);
> >> +                       line_from_stdin(&buf, &unquoted, prefix, prefix_length,
> >> +                                       nul_term_line, set_executable_bit,
> >> +                                       oflags | f);
> >> +                       strbuf_release(&buf);
> >> +               }
> >>                 strbuf_release(&unquoted);
> >> -               strbuf_release(&line);
> >> +               string_list_clear(&list, 0);
> >>         }
> >>
> >>         if (split_index > 0) {
> >> --
> >> 2.35.1.1428.g1c1a0152d61
> >>
> >
> > This buffering introduces the same potential risk of the
> > "stdin-feeder" process not being able to see objects right away as my
> > version had. I'm planning to mitigate the issue by unplugging the bulk
> > checkin when issuing a verbose report so that anyone who's using that
> > output to synchronize can still see what they're expecting.
>
> I was rather terse in the commit message, I meant (but forgot some
> words) "doesn't harm thing for performance [in the above test]", but
> converting this to a string_list is clearly & regression that shouldn't
> be kept.
>
> I just wanted to demonstrate method of doing this by passing down the
> HASH_* flags, and found that writing the state-machine to "buffer ahead"
> by one line so that we can eventually know in the loop if we're in the
> "last" line or not was tedious, so I came up with this POC. But we
> clearly shouldn't lose the "streaming" aspect.
>

From my experience working on several state machines in the Windows
OS, they are notoriously difficult to understand and extend.  I
wouldn't want every top-level command that does something interesting
to have to deal with that.

> But anyway, now that I look at this again the smart thing here (surely?)
> is to keep the simple getline() loop and not ever issue a
> HASH_N_OBJECTS_LAST for the Nth item, instead we should in this case do
> the "checkpoint fsync" at the point that we write the actual index.
>
> Because an existing redundancy in your series is that you'll do the
> fsync() the same way for "git unpack-objects" as for "git
> {update-index,add}".
>
> I.e. in the former case adding the N objects is all we're doing, so the
> "last object" is the point at which we need to flush the previous N to
> disk.
>
> But for "update-index/add" you'll do at least 2 fsync()'s in the bulk
> mode, when it should be one. I.e. the equivalent of (leaving aside the
> tmp-objdir migration part of it), if writing objects A && B:
>
>     ## METHOD ONE
>     # A
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # "cookie"
>     write(bulk_fsync_XXXXXX)
>     fsync(bulk_fsync_XXXXXX)
>     # ref
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> This series on top changes that so we know that we're doing N, so we
> don't need the seperate "cookie", we can just use the B object as the
> cookie, as we know it comes last;
>
>     ## METHOD TWO
>     # A -- SAME as above
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B -- SAME as above, with s/bulk_fsync/fsync/
>     write(objects/B.tmp)
>     fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # "cookie" -- GONE!
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> But really, we should instead realize that we're not doing
> "unpack-objects", but have a "ref update" at the end (whether that's a
> ref, or an index etc.) and do:
>
>     ## METHOD THREE
>     # A -- SAME as above
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     rename(objects/A.tmp, objects/A)
>     # B -- SAME as the first
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     rename(objects/B.tmp, objects/B)
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     rename(INDEX.tmp, INDEX)
>
> Which cuts our number of fsync() operations down from 2 to 1, in
> addition to removing the need for the "cookie", which is only there
> because we didn't keep track of where we were in the sequence as in my
> 2/7 and 5/7.
>

I agree that this is a great direction to go in as an extension to
this work (i.e. a subsequent patch).  I saw in one of your mails on v2
of your rfc series that you mentioned a "lightweight transaction-y
thing".  I've been thinking along the same lines myself, but wanted to
treat that as a separable concern.  In my ideal world, we'd just use a
real database for loose objects, the index, and refs and let that
handle the transaction management.  But in lieu of that, having a
transaction that looks across the ODB, index, and refs would let us
batch syncs optimally.

> And it would be the same for tmp-objdir, the rename dance is a bit
> different, but we'd do the "full" fsync() while on the INDEX.tmp, then
> migrate() the tmp-objdir, and once that's done do the final:
>
>     rename(INDEX.tmp, INDEX)
>
> I.e. we'd fsync() the content once, and only have the rename() or link()
> operations left. For POSIX we'd need a few more fsync() for the
> metadata, but this (i.e. your) series already makes the hard assumption
> that we don't need to do that for rename().
>
> > I think the code you've presented here is a lot of diff to accomplish
> > the same thing that my series does, where this specific update-index
> > caller has been roto-tilled to provide the needed
> > begin/end-transaction points.
>
> Any caller of these APIs will need the "unsigned oflags" sooner than
> later anyway, as they need to pass down e.g. HASH_WRITE_OBJECT. We just
> do it slightly earlier.
>
> And because of that in the general case it's really not the same, I
> think it's a better approach. You've already got the bug in yours of
> needlessly setting up the bulk checkin for !HASH_WRITE_OBJECT in
> update-index, which this neatly solves by deferring the "bulk" mechanism
> until the codepath that's past that and into the "real" object writing.
>
> We can also die() or error out in the object writing before ever getting
> to writing the object, in which case we'd do some setup that we'd need
> to tear down again, by deferring it until the last moment...
>

I'll be submitting a new version to the list which sets up the tmp
objdir lazily on first actual write, so the concern about writing to
the ODB needlessly should go away.

> > And I think there will be a lot of
> > complexity in supporting the same hints for command-line additions
> > (which is roughly equivalent to the git-add workflow).
>
> I left that out due to Junio's comment in
> https://lore.kernel.org/git/xmqqzgljyz34.fsf@gitster.g/; i.e. I don't
> see why we'd find it worthwhile to optimize that case, but we easily
> could (especially per the "just sync the INDEX.tmp" above).
>
> But even if we don't do "THREE" above I think it's still easy, for "TWO"
> we already have as parse_options() state machine to parse argv as it
> comes in. Doing the fsync() on the last object is just a matter of
> "looking ahead" there).
>
> > Every caller
> > that wants batch treatment will have to either implement a state
> > machine or implement a buffering mechanism in order to figure out the
> > begin-end points. Having a separate plug/unplug call eliminates this
> > complexity on each caller.
>
> This is subjective, but I really think that's rather easy to do, and
> much easier to reason about than the global state on the side via
> singletons that your method of avoiding modifying these callers and
> instead having them all consult global state via bulk-checkin.c and
> cache.h demands.

The nice thing about having the ODB handle the batch stuff internally
is that it can present a nice minimal interface to all of the callers.
Yes, it has a complex implementation internally, but that complexity
backs a rather simple API surface:
1. Begin/end transaction (plug/unplug checkin).
2. Find-object by SHA
3. Add object if it doesn't exist
4. Get the SHA without adding anything.

The ODB work is implemented once and callers can easily adopt the
transaction API without having to implement their own stuff on the
side.  Future series can make the transaction span nicely across the
ODB, index, and refs.

> That API also currently assumes single-threaded writers, if we start
> writing some of this in parallel in e.g. "unpack-objects" we'd need
> mutexes in bulk-object.[ch]. Isn't that a lot easier when the caller
> would instead know something about the special nature of the transaction
> they're interacting with, and that the 1st and last item are important
> (for a "BEGIN" and "FLUSH").
>

The API as sketched above doesn't deeply assume single-threadedness
for the "find object by SHA" or "add object if it doesn't exist".
There is a single-threaded assumption for begin/end-transaction.  The
implementation can use pthread_once to handle anything that needs to
be done lazily when adding objects.

> > Btw, I'm planning in a future series to reduce the system calls
> > involved in renaming a file by taking advantage of the renameat2
> > system call and equivalents on other platforms.  There's a pretty
> > strong motivation to do that on Windows.
>
> What do you have in mind for renameat2() specifically?  I.e. which of
> the 3x flags it implements will benefit us? RENAME_NOREPLACE to "move"
> the tmp_OBJ to an eventual OBJ?
>

Yes RENAME_NOREPLACE.  I'd want to introduce a helper called
git_rename_noreplace and use it instead of the link dance.

> Generally: There's some low-hanging fruit there. E.g. for tmp-objdir we
> slavishly go through the motion of writing an tmp_OBJ, writing (and
> possibly syncing it), then renaming that tmp_OBJ to OBJ.
>
> We could clearly just avoid that in some/all cases that use
> tmp-objdir. I.e. we're writing to a temporary store anyway, so why the
> tmp_OBJ files? We could just write to the final destinations instead,
> they're not reachable (by ref or OID lookup) from anyone else yet.
>

We were thinking before that there could be some concurrency in the
tmp_objdir, though I personally don't believe it's possible for the
typical bulk checkin case.  Using the final name in the tmp objdir
would be a nice optimization, but I think that it's a separable
concern that shouldn't block the bigger win from eliminating the cache
flushes.

> But even then I don't see how you'd get away with reducing some classes
> of syscalls past the 2x increase for some (leading an overall increase,
> but not a ~2x overall increase as noted in:
> https://lore.kernel.org/git/RFC-patch-7.7-481f1d771cb-20220323T033928Z-avarab@gmail.com/)
> as long as you use the tmp-objdir API. It's always going to have to
> write tmpdir/OBJ and link()/rename() that to OBJ.
>
> Now, I do think there's an easy way by extending the API use I've
> introduced in this RFC to do it. I.e. we'd just do:
>
>     ## METHOD FOUR
>     # A -- SAME as THREE, except no rename()
>     write(objects/A.tmp)
>     bulk_fsync(objects/A.tmp)
>     # B -- SAME as THREE, except no rename()
>     write(objects/B.tmp)
>     bulk_fsync(objects/B.tmp)
>     # ref -- SAME
>     write(INDEX.tmp, $(git rev-parse B))
>     fsync(INDEX.tmp)
>     # NEW: do all the renames at the end:
>     rename(objects/A.tmp, objects/A)
>     rename(objects/B.tmp, objects/B)
>     rename(INDEX.tmp, INDEX)
>
> That seems like an obvious win to me in any case. I.e. the tmp-objdir
> API isn't really a close fit for what we *really* want to do in this
> case.
>

I think this is the right place to get to eventually.  I believe the
best way to get there is to keep the plug/unplug bulk checkin
functionality (rebranding it as an 'ODB transaction') and then make
that a sub-transaction of a larger 'git repo transaction.'

> I.e. the reason it does everything this way is because it was explicitly
> designed for 722ff7f876c (receive-pack: quarantine objects until
> pre-receive accepts, 2016-10-03), where it's the right trade-off,
> because we'd like to cheaply "rm -rf" the whole thing if e.g. the
> "pre-receive" hook rejects the push.
>
> *AND* because it's made for the case of other things concurrently
> needing access to those objects. So pedantically you would need it for
> some modes of "git update-index", but not e.g. "git unpack-objects"
> where we really are expecting to keep all of them.
>
> > Thanks for the concrete code,
>
> ..but no thanks? I.e. it would be useful to explicitly know if you're
> interested or open to running with some of the approach in this RFC.

I'm still at the point of arguing with you about your RFC, and I'm
_not_ currently leaning toward adopting your approach.  From a
separation-of-concerns perspective, we shouldn't make top-level git
commands work hard to track the first/last object.  The ODB should
handle that internally, as part of a higher-level transaction.
Consider cmd_add, which does its interesting add_file_to_index work
from the update_callback invoked by the diff machinery: I believe it
would be hopelessly complex, if not impossible, to do the tracking
required to pass a LAST_OF_N flag through a multiplexed write API.

We have a pretty clear example from the database world that
begin/end-transaction is the right way to design an API for the task
we want to accomplish.  It's also how many filesystems work
internally.  I don't want to reinvent the wheel here.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags
  2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
@ 2022-03-23 20:23                       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23 20:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Wed, Mar 23, 2022 at 7:18 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> In preparation for making the bulk-checkin.c logic operate from
> object-file.c itself in some common cases let's add
> HASH_N_OBJECTS{,_{FIRST,LAST}} flags.
>
> This will allow us to adjust for-loops that add N objects to just pass
> down whether they have >1 objects (HASH_N_OBJECTS), as well as passing
> down flags for whether we have the first or last object.
>
> We'll thus be able to drive any sort of batch-object mechanism from
> write_object_file_flags() directly, which until now didn't know if it
> was doing one object, or some arbitrary N.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/unpack-objects.c | 60 +++++++++++++++++++++++-----------------
>  cache.h                  |  3 ++
>  2 files changed, 37 insertions(+), 26 deletions(-)
>
> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index c55b6616aed..ec40c6fd966 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -233,7 +233,8 @@ static void write_rest(void)
>  }
>
>  static void added_object(unsigned nr, enum object_type type,
> -                        void *data, unsigned long size);
> +                        void *data, unsigned long size,
> +                        unsigned oflags);
>
>  /*
>   * Write out nr-th object from the list, now we know the contents
> @@ -241,21 +242,21 @@ static void added_object(unsigned nr, enum object_type type,
>   * to be checked at the end.
>   */
>  static void write_object(unsigned nr, enum object_type type,
> -                        void *buf, unsigned long size)
> +                        void *buf, unsigned long size, unsigned oflags)
>  {
>         if (!strict) {
> -               if (write_object_file(buf, size, type,
> -                                     &obj_list[nr].oid) < 0)
> +               if (write_object_file_flags(buf, size, type,
> +                                     &obj_list[nr].oid, oflags) < 0)
>                         die("failed to write object");
> -               added_object(nr, type, buf, size);
> +               added_object(nr, type, buf, size, oflags);
>                 free(buf);
>                 obj_list[nr].obj = NULL;
>         } else if (type == OBJ_BLOB) {
>                 struct blob *blob;
> -               if (write_object_file(buf, size, type,
> -                                     &obj_list[nr].oid) < 0)
> +               if (write_object_file_flags(buf, size, type,
> +                                           &obj_list[nr].oid, oflags) < 0)
>                         die("failed to write object");
> -               added_object(nr, type, buf, size);
> +               added_object(nr, type, buf, size, oflags);
>                 free(buf);
>
>                 blob = lookup_blob(the_repository, &obj_list[nr].oid);
> @@ -269,7 +270,7 @@ static void write_object(unsigned nr, enum object_type type,
>                 int eaten;
>                 hash_object_file(the_hash_algo, buf, size, type,
>                                  &obj_list[nr].oid);
> -               added_object(nr, type, buf, size);
> +               added_object(nr, type, buf, size, oflags);
>                 obj = parse_object_buffer(the_repository, &obj_list[nr].oid,
>                                           type, size, buf,
>                                           &eaten);
> @@ -283,7 +284,7 @@ static void write_object(unsigned nr, enum object_type type,
>
>  static void resolve_delta(unsigned nr, enum object_type type,
>                           void *base, unsigned long base_size,
> -                         void *delta, unsigned long delta_size)
> +                         void *delta, unsigned long delta_size, unsigned oflags)
>  {
>         void *result;
>         unsigned long result_size;
> @@ -294,7 +295,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
>         if (!result)
>                 die("failed to apply delta");
>         free(delta);
> -       write_object(nr, type, result, result_size);
> +       write_object(nr, type, result, result_size, oflags);
>  }
>
>  /*
> @@ -302,7 +303,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
>   * resolve all the deltified objects that are based on it.
>   */
>  static void added_object(unsigned nr, enum object_type type,
> -                        void *data, unsigned long size)
> +                        void *data, unsigned long size, unsigned oflags)
>  {
>         struct delta_info **p = &delta_list;
>         struct delta_info *info;
> @@ -313,7 +314,7 @@ static void added_object(unsigned nr, enum object_type type,
>                         *p = info->next;
>                         p = &delta_list;
>                         resolve_delta(info->nr, type, data, size,
> -                                     info->delta, info->size);
> +                                     info->delta, info->size, oflags);
>                         free(info);
>                         continue;
>                 }
> @@ -322,18 +323,19 @@ static void added_object(unsigned nr, enum object_type type,
>  }
>
>  static void unpack_non_delta_entry(enum object_type type, unsigned long size,
> -                                  unsigned nr)
> +                                  unsigned nr, unsigned oflags)
>  {
>         void *buf = get_data(size);
>
>         if (!dry_run && buf)
> -               write_object(nr, type, buf, size);
> +               write_object(nr, type, buf, size, oflags);
>         else
>                 free(buf);
>  }
>
>  static int resolve_against_held(unsigned nr, const struct object_id *base,
> -                               void *delta_data, unsigned long delta_size)
> +                               void *delta_data, unsigned long delta_size,
> +                               unsigned oflags)
>  {
>         struct object *obj;
>         struct obj_buffer *obj_buffer;
> @@ -344,12 +346,12 @@ static int resolve_against_held(unsigned nr, const struct object_id *base,
>         if (!obj_buffer)
>                 return 0;
>         resolve_delta(nr, obj->type, obj_buffer->buffer,
> -                     obj_buffer->size, delta_data, delta_size);
> +                     obj_buffer->size, delta_data, delta_size, oflags);
>         return 1;
>  }
>
>  static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
> -                              unsigned nr)
> +                              unsigned nr, unsigned oflags)
>  {
>         void *delta_data, *base;
>         unsigned long base_size;
> @@ -366,7 +368,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
>                 if (has_object_file(&base_oid))
>                         ; /* Ok we have this one */
>                 else if (resolve_against_held(nr, &base_oid,
> -                                             delta_data, delta_size))
> +                                             delta_data, delta_size, oflags))
>                         return; /* we are done */
>                 else {
>                         /* cannot resolve yet --- queue it */
> @@ -428,7 +430,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
>                 }
>         }
>
> -       if (resolve_against_held(nr, &base_oid, delta_data, delta_size))
> +       if (resolve_against_held(nr, &base_oid, delta_data, delta_size, oflags))
>                 return;
>
>         base = read_object_file(&base_oid, &type, &base_size);
> @@ -440,11 +442,11 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
>                 has_errors = 1;
>                 return;
>         }
> -       resolve_delta(nr, type, base, base_size, delta_data, delta_size);
> +       resolve_delta(nr, type, base, base_size, delta_data, delta_size, oflags);
>         free(base);
>  }
>
> -static void unpack_one(unsigned nr)
> +static void unpack_one(unsigned nr, unsigned oflags)
>  {
>         unsigned shift;
>         unsigned char *pack;
> @@ -472,11 +474,11 @@ static void unpack_one(unsigned nr)
>         case OBJ_TREE:
>         case OBJ_BLOB:
>         case OBJ_TAG:
> -               unpack_non_delta_entry(type, size, nr);
> +               unpack_non_delta_entry(type, size, nr, oflags);
>                 return;
>         case OBJ_REF_DELTA:
>         case OBJ_OFS_DELTA:
> -               unpack_delta_entry(type, size, nr);
> +               unpack_delta_entry(type, size, nr, oflags);
>                 return;
>         default:
>                 error("bad object type %d", type);
> @@ -491,6 +493,7 @@ static void unpack_all(void)
>  {
>         int i;
>         struct pack_header *hdr = fill(sizeof(struct pack_header));
> +       unsigned oflags;
>
>         nr_objects = ntohl(hdr->hdr_entries);
>
> @@ -505,9 +508,14 @@ static void unpack_all(void)
>                 progress = start_progress(_("Unpacking objects"), nr_objects);
>         CALLOC_ARRAY(obj_list, nr_objects);
>         plug_bulk_checkin();
> +       oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
>         for (i = 0; i < nr_objects; i++) {
> -               unpack_one(i);
> -               display_progress(progress, i + 1);
> +               int nth = i + 1;
> +               unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
> +                       nr_objects == nth ? HASH_N_OBJECTS_LAST : 0;
> +
> +               unpack_one(i, oflags | f);
> +               display_progress(progress, nth);
>         }
>         unplug_bulk_checkin();
>         stop_progress(&progress);
> diff --git a/cache.h b/cache.h
> index 84fafe2ed71..72c91c91286 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -896,6 +896,9 @@ int ie_modified(struct index_state *, const struct cache_entry *, struct stat *,
>  #define HASH_FORMAT_CHECK 2
>  #define HASH_RENORMALIZE  4
>  #define HASH_SILENT 8
> +#define HASH_N_OBJECTS 1<<4
> +#define HASH_N_OBJECTS_FIRST 1<<5
> +#define HASH_N_OBJECTS_LAST 1<<6
>  int index_fd(struct index_state *istate, struct object_id *oid, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
>  int index_path(struct index_state *istate, struct object_id *oid, const char *path, struct stat *st, unsigned flags);
>
> --
> 2.35.1.1428.g1c1a0152d61
>

This patch works out okay because unpack-objects is the easy case:
the pack header gives you a well-defined number of objects up front.
I'd be fine with your design if all cases were like this.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin
  2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
@ 2022-03-23 20:25                       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23 20:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Wed, Mar 23, 2022 at 7:18 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Remove much of this as a POC for exploring some of what I mentioned in
> https://lore.kernel.org/git/220322.86mthinxnn.gmgdl@evledraar.gmail.com/
>
> This commit is obviously not what we *should* do as end-state, but
> demonstrates what's needed (I think) for a bare-minimum implementation
> of just the "bulk" syncing method for loose objects without the part
> where we do the tmp-objdir.c dance.
>
> Performance with this is already quite promising. Benchmarking with:
>
>         git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' \
>                 -p 'rm -rf r.git && git init --bare r.git' \
>                 './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack'
>
> I.e. unpacking a small packfile (my dotfiles) yields, on a Linux
> ramdisk:
>
>         Benchmark 1: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync
>           Time (mean ± σ):     815.9 ms ±   8.2 ms    [User: 522.9 ms, System: 287.9 ms]
>           Range (min … max):   805.6 ms … 835.9 ms    10 runs
>
>         Benchmark 2: ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD
>           Time (mean ± σ):     779.4 ms ±  15.4 ms    [User: 505.7 ms, System: 270.2 ms]
>           Range (min … max):   763.1 ms … 813.9 ms    10 runs
>
>         Summary
>           './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
>             1.05 ± 0.02 times faster than './git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'
>
> Doing the same with "strace --summary-only", which probably helps to
> emulate cases with slower syscalls is ~15% faster than using the
> tmp-objdir indirection:
>
>         Summary
>           'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'HEAD' ran
>             1.16 ± 0.01 times faster than 'strace --summary-only ./git -C r.git -c core.fsync=loose-object -c core.fsyncMethod=batch unpack-objects </tmp/pack-dotfiles.pack' in 'ns/batched-fsync'
>
> Which makes sense in terms of syscalls. In my case HEAD has ~101k
> calls, and the parent topic is making ~129k calls, with around 2x the
> number of unlink(), link() as expected.
>
> Of course some users will want to use the tmp-objdir.c method. So a
> version of this commit could be rewritten to come earlier in the
> series, with the "bulk" on top being optional.
>
> It seems to me that it's a much better strategy to do this whole thing
> in close_loose_object() after passing down the new HASH_N_OBJECTS /
> HASH_N_OBJECTS_FIRST / HASH_N_OBJECTS_LAST flags.
>
> Doing that for the "builtin/add.c" and "builtin/unpack-objects.c" code
> having its {un,}plug_bulk_checkin() removed here is then just a matter
> of passing down a similar set of flags indicating whether we're
> dealing with N objects, and if so if we're dealing with the last one
> or not.
>
> As we'll see in subsequent commits doing it this way also effortlessly
> integrates with other HASH_* flags. E.g. for "update-index" the code
> being rm'd here doesn't handle the interaction with
> "HASH_WRITE_OBJECT" properly, but once we've moved all this sync
> bootstrapping logic to close_loose_object() we'll never get to it if
> we're not actually writing something.
>
> This code currently doesn't use the HASH_N_OBJECTS_FIRST flag, but
> that's what we'd use later to optionally call tmp_objdir_create().
>
> Aside: This also changes logic that was a bit confusing and repetitive
> in close_loose_object(). Previously we'd first call
> batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT) which is just as
> shorthand for:
>
>         fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT &&
>         fsync_method == FSYNC_METHOD_BATCH
>
> We'd then proceed to call
> fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT) later in the same
> function, which is just a way of calling fsync_or_die() if:
>
>         fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT
>
> Now we instead just define a local "fsync_loose" variable by checking
> "fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT", which shows us that
> the previous case of fsync_component_or_die(...)" could just be added
> to the existing "fsync_object_files > 0" branch.
>
> Note: This commit reverts much of "core.fsyncmethod: batched disk
> flushes for loose-objects". We'll set up new structures to bring what
> it was doing back in a different way. I.e. to do the tmp-objdir
> plug-in in object-file.c
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/unpack-objects.c |  2 --
>  builtin/update-index.c   |  4 ---
>  bulk-checkin.c           | 74 ----------------------------------------
>  bulk-checkin.h           |  3 --
>  cache.h                  |  5 ---
>  object-file.c            | 37 ++++++++++++++------
>  6 files changed, 26 insertions(+), 99 deletions(-)
>
> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index ec40c6fd966..93da436581b 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -507,7 +507,6 @@ static void unpack_all(void)
>         if (!quiet)
>                 progress = start_progress(_("Unpacking objects"), nr_objects);
>         CALLOC_ARRAY(obj_list, nr_objects);
> -       plug_bulk_checkin();
>         oflags = nr_objects > 1 ? HASH_N_OBJECTS : 0;
>         for (i = 0; i < nr_objects; i++) {
>                 int nth = i + 1;
> @@ -517,7 +516,6 @@ static void unpack_all(void)
>                 unpack_one(i, oflags | f);
>                 display_progress(progress, nth);
>         }
> -       unplug_bulk_checkin();
>         stop_progress(&progress);
>
>         if (delta_list)
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index cbd2b0d633b..95ed3c47b2e 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -1118,8 +1118,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>         parse_options_start(&ctx, argc, argv, prefix,
>                             options, PARSE_OPT_STOP_AT_NON_OPTION);
>
> -       /* optimize adding many objects to the object database */
> -       plug_bulk_checkin();
>         while (ctx.argc) {
>                 if (parseopt_state != PARSE_OPT_DONE)
>                         parseopt_state = parse_options_step(&ctx, options,
> @@ -1194,8 +1192,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>                 strbuf_release(&buf);
>         }
>
> -       /* by now we must have added all of the new objects */
> -       unplug_bulk_checkin();
>         if (split_index > 0) {
>                 if (git_config_get_split_index() == 0)
>                         warning(_("core.splitIndex is set to false; "
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index a0dca79ba6a..577b135e39c 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -3,20 +3,15 @@
>   */
>  #include "cache.h"
>  #include "bulk-checkin.h"
> -#include "lockfile.h"
>  #include "repository.h"
>  #include "csum-file.h"
>  #include "pack.h"
>  #include "strbuf.h"
> -#include "string-list.h"
> -#include "tmp-objdir.h"
>  #include "packfile.h"
>  #include "object-store.h"
>
>  static int bulk_checkin_plugged;
>
> -static struct tmp_objdir *bulk_fsync_objdir;
> -
>  static struct bulk_checkin_state {
>         char *pack_tmp_name;
>         struct hashfile *f;
> @@ -85,40 +80,6 @@ static void finish_bulk_checkin(struct bulk_checkin_state *state)
>         reprepare_packed_git(the_repository);
>  }
>
> -/*
> - * Cleanup after batch-mode fsync_object_files.
> - */
> -static void do_batch_fsync(void)
> -{
> -       struct strbuf temp_path = STRBUF_INIT;
> -       struct tempfile *temp;
> -
> -       if (!bulk_fsync_objdir)
> -               return;
> -
> -       /*
> -        * Issue a full hardware flush against a temporary file to ensure
> -        * that all objects are durable before any renames occur. The code in
> -        * fsync_loose_object_bulk_checkin has already issued a writeout
> -        * request, but it has not flushed any writeback cache in the storage
> -        * hardware or any filesystem logs. This fsync call acts as a barrier
> -        * to ensure that the data in each new object file is durable before
> -        * the final name is visible.
> -        */
> -       strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
> -       temp = xmks_tempfile(temp_path.buf);
> -       fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
> -       delete_tempfile(&temp);
> -       strbuf_release(&temp_path);
> -
> -       /*
> -        * Make the object files visible in the primary ODB after their data is
> -        * fully durable.
> -        */
> -       tmp_objdir_migrate(bulk_fsync_objdir);
> -       bulk_fsync_objdir = NULL;
> -}
> -
>  static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
>  {
>         int i;
> @@ -313,26 +274,6 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
>         return 0;
>  }
>
> -void prepare_loose_object_bulk_checkin(void)
> -{
> -       if (bulk_checkin_plugged && !bulk_fsync_objdir)
> -               bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> -}
> -
> -void fsync_loose_object_bulk_checkin(int fd, const char *filename)
> -{
> -       /*
> -        * If we have a plugged bulk checkin, we issue a call that
> -        * cleans the filesystem page cache but avoids a hardware flush
> -        * command. Later on we will issue a single hardware flush
> -        * before as part of do_batch_fsync.
> -        */
> -       if (!bulk_fsync_objdir ||
> -           git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
> -               fsync_or_die(fd, filename);
> -       }
> -}
> -
>  int index_bulk_checkin(struct object_id *oid,
>                        int fd, size_t size, enum object_type type,
>                        const char *path, unsigned flags)
> @@ -347,19 +288,6 @@ int index_bulk_checkin(struct object_id *oid,
>  void plug_bulk_checkin(void)
>  {
>         assert(!bulk_checkin_plugged);
> -
> -       /*
> -        * A temporary object directory is used to hold the files
> -        * while they are not fsynced.
> -        */
> -       if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
> -               bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
> -               if (!bulk_fsync_objdir)
> -                       die(_("Could not create temporary object directory for core.fsyncMethod=batch"));
> -
> -               tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
> -       }
> -
>         bulk_checkin_plugged = 1;
>  }
>
> @@ -369,6 +297,4 @@ void unplug_bulk_checkin(void)
>         bulk_checkin_plugged = 0;
>         if (bulk_checkin_state.f)
>                 finish_bulk_checkin(&bulk_checkin_state);
> -
> -       do_batch_fsync();
>  }
> diff --git a/bulk-checkin.h b/bulk-checkin.h
> index 181d3447ff9..b26f3dc3b74 100644
> --- a/bulk-checkin.h
> +++ b/bulk-checkin.h
> @@ -6,9 +6,6 @@
>
>  #include "cache.h"
>
> -void prepare_loose_object_bulk_checkin(void);
> -void fsync_loose_object_bulk_checkin(int fd, const char *filename);
> -
>  int index_bulk_checkin(struct object_id *oid,
>                        int fd, size_t size, enum object_type type,
>                        const char *path, unsigned flags);
> diff --git a/cache.h b/cache.h
> index 72c91c91286..2f3831fa853 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1772,11 +1772,6 @@ void fsync_or_die(int fd, const char *);
>  int fsync_component(enum fsync_component component, int fd);
>  void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
>
> -static inline int batch_fsync_enabled(enum fsync_component component)
> -{
> -       return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
> -}
> -
>  ssize_t read_in_full(int fd, void *buf, size_t count);
>  ssize_t write_in_full(int fd, const void *buf, size_t count);
>  ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
> diff --git a/object-file.c b/object-file.c
> index cd0ddb49e4b..dbeb3df502d 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1886,19 +1886,37 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
>         hash_object_file_literally(algo, buf, len, type_name(type), oid);
>  }
>
> +static void sync_loose_object_batch(int fd, const char *filename,
> +                                   const unsigned oflags)
> +{
> +       const int last = oflags & HASH_N_OBJECTS_LAST;
> +
> +       /*
> +        * We're doing a sync_file_range() (or equivalent) for 1..N-1
> +        * objects, and then a "real" fsync() for N. On some OS's
> +        * enabling core.fsync=loose-object && core.fsyncMethod=batch
> +        * improves the performance by a lot.
> +        */
> +       if (last || (!last && git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0))
> +               fsync_or_die(fd, filename);
> +}
> +
>  /* Finalize a file on disk, and close it. */
> -static void close_loose_object(int fd, const char *filename)
> +static void close_loose_object(int fd, const char *filename,
> +                              const unsigned oflags)
>  {
> +       int fsync_loose;
> +
>         if (the_repository->objects->odb->will_destroy)
>                 goto out;
>
> -       if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> -               fsync_loose_object_bulk_checkin(fd, filename);
> -       else if (fsync_object_files > 0)
> +       fsync_loose = fsync_components & FSYNC_COMPONENT_LOOSE_OBJECT;
> +
> +       if (oflags & HASH_N_OBJECTS && fsync_loose &&
> +           fsync_method == FSYNC_METHOD_BATCH)
> +               sync_loose_object_batch(fd, filename, oflags);
> +       else if (fsync_object_files > 0 || fsync_loose)
>                 fsync_or_die(fd, filename);
> -       else
> -               fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
> -                                      filename);
>
>  out:
>         if (close(fd) != 0)
> @@ -1962,9 +1980,6 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
>         static struct strbuf tmp_file = STRBUF_INIT;
>         static struct strbuf filename = STRBUF_INIT;
>
> -       if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
> -               prepare_loose_object_bulk_checkin();
> -
>         loose_object_path(the_repository, &filename, oid);
>
>         fd = create_tmpfile(&tmp_file, filename.buf);
> @@ -2015,7 +2030,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
>                 die(_("confused by unstable object source data for %s"),
>                     oid_to_hex(oid));
>
> -       close_loose_object(fd, tmp_file.buf);
> +       close_loose_object(fd, tmp_file.buf, flags);
>
>         if (mtime) {
>                 struct utimbuf utb;
> --
> 2.35.1.1428.g1c1a0152d61
>

Fine.  If we do this patch series as a non-RFC, we could start from a
point prior to my fsyncMethod=batch series.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects
  2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
@ 2022-03-23 20:30                       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23 20:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Wed, Mar 23, 2022 at 7:18 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> As with unpack-objects in a preceding commit have update-index.c make
> use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
> mode again for "update-index".
>
> Adding the t/* directory from git.git on a Linux ramdisk is a bit
> faster than with the tmp-objdir indirection:
>
>         $ git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' -p 'rm -rf repo/.git/objects/* repo/.git/index' './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' --warmup 1 -r 10
>
>         Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync
>           Time (mean ± σ):     281.1 ms ±   2.6 ms    [User: 186.2 ms, System: 92.3 ms]
>           Range (min … max):   278.3 ms … 287.0 ms    10 runs
>
>         Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD
>           Time (mean ± σ):     265.9 ms ±   2.6 ms    [User: 181.7 ms, System: 82.1 ms]
>           Range (min … max):   262.0 ms … 270.3 ms    10 runs
>
>         Summary
>           './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
>             1.06 ± 0.01 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'
>
> And as before running that with "strace --summary-only" slows things
> down a bit (probably mimicking slower I/O a bit). I then get:
>
>         Summary
>           'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'HEAD' ran
>             1.19 ± 0.03 times faster than 'strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'ns/batched-fsync'
>
> This one has a twist though, instead of fsync()-ing on the last object
> we write we'll not do that, and instead defer the fsync() until we
> write the index itself. This is outlined in [1] (as "METHOD THREE").
>
> Because of this under FSYNC_METHOD_BATCH we'll do the N
> objects (possibly only one, because we're lazy) as HASH_N_OBJECTS, and
> we'll even now support doing this via N arguments on the command-line.
>
> Then we won't fsync() any of it, but we will rename it
> in-place (which, if we were still using the tmp-objdir, would leave it
> "staged" in the tmp-objdir).
>
> We'll then have the fsync() for the index update "flush" that out, and
> thus avoid two fsync() calls when one will do.
>
> Running this with the "git hyperfine" command mentioned in a preceding
> commit with "strace --summary-only" shows that we do 1 fsync() now
> instead of 2, and have one more sync_file_range(), as expected.
>
> We also go from ~51k syscalls to ~39k, with ~2x the number of link()
> and unlink() in ns/batched-fsync, and of course one fsync() instead of
> two.
>
> The flow of this code isn't quite set up for re-plugging the
> tmp-objdir back in. In particular we no longer pass
> HASH_N_OBJECTS_FIRST (but doing so would be trivial), and there's no
> HASH_N_OBJECTS_LAST.
>
> So this and other callers would need some light transaction-y API, or
> to otherwise pass a "yes, I'd like to flush it" flag down to
> finalize_hashfile(), but doing so will be trivial.
>
> And since we've started structuring it this way it'll become easy to
> do any arbitrary number of things down the line that would "bulk
> fsync" before the final fsync(). Now we write some objects and fsync()
> on the index, but between those two could do any number of other
> things where we'd defer the fsync().
>
> This sort of thing might be especially interesting for "git repack"
> when it writes e.g. a *.bitmap, *.rev, *.pack and *.idx. In that case
> we could skip the fsync() on all of those, and only do it on the *.idx
> before we renamed it in-place. I *think* nothing cares about a *.pack
> without an *.idx, but even then we could fsync *.idx, rename *.pack,
> rename *.idx and still safely do only one fsync(). See "git show
> --first-parent" on 62874602032 (Merge branch
> 'tb/pack-finalize-ordering' into maint, 2021-10-12) for a good
> overview of the code involved in that.
>
> 1. https://lore.kernel.org/git/220323.86sfr9ndpr.gmgdl@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/update-index.c |  7 ++++---
>  cache.h                |  1 +
>  read-cache.c           | 29 ++++++++++++++++++++++++++++-
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 34aaaa16c20..6cfec6efb38 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -1142,7 +1142,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>
>                         setup_work_tree();
>                         p = prefix_path(prefix, prefix_length, path);
> -                       update_one(p, 0);
> +                       update_one(p, HASH_N_OBJECTS);
>                         if (set_executable_bit)
>                                 chmod_path(set_executable_bit, p);
>                         free(p);
> @@ -1187,7 +1187,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>                                 strbuf_swap(&buf, &unquoted);
>                         }
>                         p = prefix_path(prefix, prefix_length, buf.buf);
> -                       update_one(p, 0);
> +                       update_one(p, HASH_N_OBJECTS);
>                         if (set_executable_bit)
>                                 chmod_path(set_executable_bit, p);
>                         free(p);
> @@ -1263,7 +1263,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>                                 exit(128);
>                         unable_to_lock_die(get_index_file(), lock_error);
>                 }
> -               if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
> +               if (write_locked_index(&the_index, &lock_file,
> +                                      COMMIT_LOCK | WLI_NEED_LOOSE_FSYNC))
>                         die("Unable to write new index file");
>         }
>
> diff --git a/cache.h b/cache.h
> index 2f3831fa853..7542e009a34 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -751,6 +751,7 @@ void ensure_full_index(struct index_state *istate);
>  /* For use with `write_locked_index()`. */
>  #define COMMIT_LOCK            (1 << 0)
>  #define SKIP_IF_UNCHANGED      (1 << 1)
> +#define WLI_NEED_LOOSE_FSYNC   (1 << 2)
>
>  /*
>   * Write the index while holding an already-taken lock. Close the lock,
> diff --git a/read-cache.c b/read-cache.c
> index 3e0e7d41837..275f6308c32 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2860,6 +2860,33 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>         int ieot_entries = 1;
>         struct index_entry_offset_table *ieot = NULL;
>         int nr, nr_threads;
> +       unsigned int wflags = FSYNC_COMPONENT_INDEX;
> +
> +
> +       /*
> +        * TODO: This is abuse of the recently modified
> +        * finalize_hashfile() API, which reveals a shortcoming of its
> +        * "fsync" design.
> +        *
> +        * I.e. it expects an "enum fsync_component component" label,
> +        * but here we're passing it an OR of the two, knowing that
> +        * it'll call fsync_component_or_die() which (in
> +        * write-or-die.c) will do "(fsync_components & wflags)" (to
> +        * our "wflags" here).
> +        *
> +        * But the API really should be changed to explicitly take
> +        * such flags, because in this case we'd like to fsync() the
> +        * index if we're in the bulk mode, *even if* our
> +        * "core.fsync=index" isn't configured.
> +        *
> +        * That's because at this point we've been queuing up object
> +        * writes that we didn't fsync(), and are going to use this
> +        * fsync() to "flush" the whole thing. Doing it this way
> +        * avoids redundantly calling fsync() twice when once will do.
> +        */
> +       if (fsync_method == FSYNC_METHOD_BATCH &&
> +           flags & WLI_NEED_LOOSE_FSYNC)
> +               wflags |= FSYNC_COMPONENT_LOOSE_OBJECT;
>
>         f = hashfd(tempfile->fd, tempfile->filename.buf);
>
> @@ -3094,7 +3121,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>         if (!alternate_index_output && (flags & COMMIT_LOCK))
>                 csum_fsync_flag = CSUM_FSYNC;
>
> -       finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
> +       finalize_hashfile(f, istate->oid.hash, wflags,
>                           CSUM_HASH_IN_STREAM | csum_fsync_flag);
>
>         if (close_tempfile_gently(tempfile)) {
> --
> 2.35.1.1428.g1c1a0152d61
>

In the long run, we should attach the "need to fsync the index" to an
ongoing 'repo-transaction' so that we can composably sync at the best
point regardless of what the top-level git operation does.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old
  2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
@ 2022-03-23 21:08                       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-23 21:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Wed, Mar 23, 2022 at 7:18 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Add a new fsyncMethod.batch.quarantine setting which defaults to
> "false". Preceding (RFC, and not meant to flip-flop like that
> eventually) commits ripped out the "tmp-objdir" part of the
> core.fsyncMethod=batch.
>
> This documentation proposes to keep that as the default for the
> reasons discussed in it, while allowing users to set
> "fsyncMethod.batch.quarantine=true".
>
> Furthermore update the discussion of "core.fsyncObjectFiles" with
> information about what it *really* does, why you probably shouldn't
> use it, and how to safely emulate most of what it gave users in the
> past in terms of performance benefit.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  Documentation/config/core.txt | 80 +++++++++++++++++++++++++++++++----
>  1 file changed, 72 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index f598925b597..365a12dc7ae 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -607,21 +607,85 @@ stored on NTFS or ReFS filesystems.
>  +
>  The `batch` mode currently only applies to loose-object files and will
>  kick in when using the linkgit:git-unpack-objects[1] and
> -linkgit:update-index[1] commands. Note that the "last" file to be
> +linkgit:git-update-index[1] commands. Note that the "last" file to be
>  synced may be the last object, as in the case of
>  linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
>  "ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
>  batch syncing of the loose objects may be deferred until a subsequent
>  fsync() to a file that makes them "active".
>
> +fsyncMethod.batch.quarantine::
> +       A boolean which if set to `true` will cause "batched" writes
> +       to objects to be "quarantined" if
> +       `core.fsyncMethod=batch`. This is `false` by default.
> ++
> +The primary object of these fsync() settings is to protect against
> +repository corruption of things which are "reachable", i.e. reachable
> +via references, the index etc., not merely objects that were present
> +in the object store.
> ++
> +Historically setting `core.fsyncObjectFiles=false` assumed that on a
> +filesystem where an fsync() would flush all preceding outstanding
> +I/O we might end up with a corrupt loose object, but that was OK as
> +long as no reference referred to it. We'd eventually remove the
> +corrupt object with linkgit:git-gc[1], and linkgit:git-fsck[1] would
> +only report it as a minor annoyance.
> ++
> +Setting `fsyncMethod.batch.quarantine=true` takes the view that
> +something like a corrupt *unreferenced* loose object in the object
> +store is something we'd like to avoid, at the cost of reduced
> +performance when using `core.fsyncMethod=batch`.
> ++
> +Currently this uses the same mechanism described in the "QUARANTINE
> +ENVIRONMENT" in the linkgit:git-receive-pack[1] documentation, but
> +that's subject to change. The performance loss is because we need to
> +"stage" the objects in that quarantine environment, fsync() it, and
> +once that's done rename() or link() it in-place into the main object
> +store, possibly with an fsync() of the index or ref at the end
> ++
> +With `fsyncMethod.batch.quarantine=false` we'll "stage" things in the
> +main object store, and then do one fsync() at the very end, either on
> +the last object we write, or file (index or ref) that'll make it
> +"reachable".
> ++
> +The bad thing about setting this to `true` is lost performance, as
> +well as not being able to access the objects as they're written (which
> +e.g. consumers of linkgit:git-update-index[1]'s `--verbose` mode might
> +want to do).

I wasn't able to understand clearly from your performance numbers.
What did you measure as the additional cost from quarantine=true
versus quarantine=false? Just if you have the numbers handy...

> ++
> +The good thing is that you should be guaranteed not to get e.g. short
> +or otherwise corrupt loose objects if you pull your power cord. In
> +practice various git commands deal quite badly with discovering such a
> +stray corrupt object (including perhaps assuming it's valid based on
> +its existence, or hard dying on an error rather than replacing
> +it). Repairing such "unreachable corruption" can require manual
> +intervention.
> +
>  core.fsyncObjectFiles::
> -       This boolean will enable 'fsync()' when writing object files.
> -       This setting is deprecated. Use core.fsync instead.
> -+
> -This setting affects data added to the Git repository in loose-object
> -form. When set to true, Git will issue an fsync or similar system call
> -to flush caches so that loose-objects remain consistent in the face
> -of a unclean system shutdown.
> +       This boolean will enable 'fsync()' when writing loose object
> +       files.
> ++
> +This setting is the historical fsync configuration setting. It's now
> +*deprecated*, you should use `core.fsync` instead, perhaps in
> +combination with `core.fsyncMethod=batch`.
> ++
> +The `core.fsyncObjectFiles` was initially added based on integrity
> +assumptions that early (pre-ext-4) versions of Linux's "ext"
> +filesystems provided.
> ++
> +I.e. that a write of file A without an `fsync()` followed by a write
> +of file `B` with `fsync()` would implicitly guarantee that `A` would
> +be `fsync()`'d by calling `fsync()` on `B`. This assumption is *not*
> +backed up by any standard (e.g. POSIX), but worked in practice on some
> +Linux setups.
> ++
> +Nowadays you should almost certainly want to use
> +`core.fsync=loose-object` instead in combination with
> +`core.fsyncMethod=batch`, and possibly with
> +`fsyncMethod.batch.quarantine=true`, see above. On modern OS's (Linux,
> +OSX, Windows) that gives you most of the performance benefit of
> +`core.fsyncObjectFiles=false` with all of the safety of the old
> +`core.fsyncObjectFiles=true`.
>
>  core.preloadIndex::
>         Enable parallel index preload for operations like 'git diff'
> --
> 2.35.1.1428.g1c1a0152d61
>

I think the notion of minimizing fsyncs across the whole repository is
a great one.  However, your implementation isn't clean from an API
perspective, since people modifying the top-level commands need to
reason about the full set of operations to avoid silently breaking the
fsync requirements.  I think we should phrase this as a "transaction"
that the top level command can begin and end. Subcomponents of the
repo can "enlist" in the transaction and do the right thing optimally
when the overall transaction commits or aborts.

In the end, I think the optimal solution should be layered on top of
the final form of my current patch series as an incremental
improvement.  I'm going to start the rebranding of
plug/unplug_bulk_checkin in V3 of the patch series.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
@ 2022-03-24  2:04       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24  2:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Wed, Mar 23, 2022 at 6:27 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Sun, Mar 20 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> > [..
> > diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> > index 889522956e4..a3798dfc334 100644
> > --- a/Documentation/config/core.txt
> > +++ b/Documentation/config/core.txt
> > @@ -628,6 +628,13 @@ core.fsyncMethod::
> >  * `writeout-only` issues pagecache writeback requests, but depending on the
> >    filesystem and storage hardware, data added to the repository may not be
> >    durable in the event of a system crash. This is the default mode on macOS.
> > +* `batch` enables a mode that uses writeout-only flushes to stage multiple
> > +  updates in the disk writeback cache and then does a single full fsync of
> > +  a dummy file to trigger the disk cache flush at the end of the operation.
>
> I think adding a \n\n here would help make this more readable & break
> the flow a bit. I.e. just add a "+" on its own line, followed by
> "Currently...
>
> > +  Currently `batch` mode only applies to loose-object files. Other repository
> > +  data is made durable as if `fsync` was specified. This mode is expected to
> > +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
> > +  and on Windows for repos stored on NTFS or ReFS filesystems.

Thanks, will fix.

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
                     ` (7 preceding siblings ...)
  2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
@ 2022-03-24  4:58   ` Neeraj K. Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
                       ` (12 more replies)
  8 siblings, 13 replies; 175+ messages in thread
From: Neeraj K. Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

V3 changes:

 * Rebrand plug/unplug-bulk-checkin to "begin_odb_transaction" and
   "end_odb_transaction"
 * Add a patch to pass filenames to fsync_or_die, rather than the string
   "loose object"
 * Update the commit description for "core.fsyncmethod" to explain why we do
   not directly expose objects until an fsync occurs.
 * Also explain in the commit description why we're using a dummy file for
   the fsync.
 * Create the bulk-fsync tmp-objdir lazily the first time a loose object is
   added. We now do fsync iff that objdir exists.
 * Do batch fsync if core.fsyncMethod=batch and core.fsync contains
   loose-object, regardless of the core.fsyncObjectFiles setting.
 * Mitigate the risk in update-index of an object not being visible due to
   bulk checkin.
 * Add a perf comment to justify the unpack-objects usage of bulk-checkin.
 * Add a new patch to create helpers for parsing OIDs from git commands.
 * Add a comment to the lib-unique-files.sh helper about uniqueness only
   within a repo.
 * Fix style and add '&&' chaining to test helpers.
 * Comment on some magic numbers in tests.
 * Take the object list as an argument in
   ./t5300-pack-object.sh:check_unpack ()
 * Drop accidental change to t/perf/perf-lib.sh

V2 changes:

 * Change doc to indicate that only some repo updates are batched
 * Null and zero out control variables in do_batch_fsync under
   unplug_bulk_checkin
 * Make batch mode default on Windows.
 * Update the description for the initial patch that cleans up the
   bulk-checkin infrastructure.
 * Rebase onto 'seen' at 0cac37f38f9.

--Original definition-- When core.fsync includes loose-object, we issue an
fsync after every written object. For a 'git-add' or similar command that
adds a lot of files to the repo, the costs of these fsyncs adds up. One
major factor in this cost is the time it takes for the physical storage
controller to flush its caches to durable media.

This series takes advantage of the writeout-only mode of git_fsync to issue
OS cache writebacks for all of the objects being added to the repository
followed by a single fsync to a dummy file, which should trigger a
filesystem log flush and storage controller cache flush. This mechanism is
known to be safe on common Windows filesystems and expected to be safe on
macOS. Some linux filesystems, such as XFS, will probably do the right thing
as well. See [1] for previous discussion on the predecessor of this patch
series.

This series is important on Windows, where loose-objects are included in the
fsync set by default in Git-For-Windows. In this series, I'm also setting
the default mode for Windows to turn on loose object fsyncing with batch
mode, so that we can get CI coverage of the actual git-for-windows
configuration upstream. We still don't actually issue fsyncs for the test
suite since GIT_TEST_FSYNC is set to 0, but we exercise all of the
surrounding batch mode code.

This work is based on 'seen' at . It's dependent on ns/core-fsyncmethod.

[1]
https://lore.kernel.org/git/2c1ddef6057157d85da74a7274e03eacf0374e45.1629856293.git.gitgitgadget@gmail.com/

Neeraj Singh (11):
  bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  object-file: pass filename to fsync_or_die
  core.fsyncmethod: batched disk flushes for loose-objects
  update-index: use the bulk-checkin infrastructure
  unpack-objects: use the bulk-checkin infrastructure
  core.fsync: use batch mode and sync loose objects by default on
    Windows
  test-lib-functions: add parsing helpers for ls-files and ls-tree
  core.fsyncmethod: tests for batch mode
  core.fsyncmethod: performance tests for add and stash
  core.fsyncmethod: correctly camel-case warning message

 Documentation/config/core.txt          |  8 +++
 builtin/add.c                          |  4 +-
 builtin/unpack-objects.c               |  3 +
 builtin/update-index.c                 | 33 +++++++++
 bulk-checkin.c                         | 97 ++++++++++++++++++++++----
 bulk-checkin.h                         | 17 ++++-
 cache.h                                | 12 +++-
 compat/mingw.h                         |  3 +
 config.c                               |  6 +-
 git-compat-util.h                      |  2 +
 object-file.c                          | 15 ++--
 t/lib-unique-files.sh                  | 32 +++++++++
 t/perf/p3700-add.sh                    | 59 ++++++++++++++++
 t/perf/p3900-stash.sh                  | 62 ++++++++++++++++
 t/t3700-add.sh                         | 28 ++++++++
 t/t3903-stash.sh                       | 20 ++++++
 t/t5300-pack-object.sh                 | 41 +++++++----
 t/t5317-pack-objects-filter-objects.sh | 91 ++++++++++++------------
 t/test-lib-functions.sh                | 10 +++
 19 files changed, 458 insertions(+), 85 deletions(-)
 create mode 100644 t/lib-unique-files.sh
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh


base-commit: c54b8eb302ffb72f31e73a26044c8a864e2cb307
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1134%2Fneerajsi-msft%2Fns%2Fbatched-fsync-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1134/neerajsi-msft/ns/batched-fsync-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1134

Range-diff vs v2:

  -:  ----------- >  1:  53261f0099d bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  1:  9c2abd12bbb !  2:  b2d9766a662 bulk-checkin: rename 'state' variable and separate 'plugged' boolean
     @@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
       	return status;
       }
       
     - void plug_bulk_checkin(void)
     + void begin_odb_transaction(void)
       {
      -	state.plugged = 1;
      +	assert(!bulk_checkin_plugged);
      +	bulk_checkin_plugged = 1;
       }
       
     - void unplug_bulk_checkin(void)
     + void end_odb_transaction(void)
       {
      -	state.plugged = 0;
      -	if (state.f)
  -:  ----------- >  3:  26ce5b8fdda object-file: pass filename to fsync_or_die
  2:  3ed1dcd9b9b !  4:  52638326790 core.fsyncmethod: batched disk flushes for loose-objects
     @@ Commit message
          One major source of the cost of fsync is the implied flush of the
          hardware writeback cache within the disk drive. This commit introduces
          a new `core.fsyncMethod=batch` option that batches up hardware flushes.
     -    It hooks into the bulk-checkin plugging and unplugging functionality,
     -    takes advantage of tmp-objdir, and uses the writeout-only support code.
     +    It hooks into the bulk-checkin odb-transaction functionality, takes
     +    advantage of tmp-objdir, and uses the writeout-only support code.
      
          When the new mode is enabled, we do the following for each new object:
     -    1. Create the object in a tmp-objdir.
     -    2. Issue a pagecache writeback request and wait for it to complete.
     +    1a. Create the object in a tmp-objdir.
     +    2a. Issue a pagecache writeback request and wait for it to complete.
      
          At the end of the entire transaction when unplugging bulk checkin:
     -    1. Issue an fsync against a dummy file to flush the hardware writeback
     -       cache, which should by now have seen the tmp-objdir writes.
     -    2. Rename all of the tmp-objdir files to their final names.
     -    3. When updating the index and/or refs, we assume that Git will issue
     +    1b. Issue an fsync against a dummy file to flush the log and hardware
     +       writeback cache, which should by now have seen the tmp-objdir writes.
     +    2b. Rename all of the tmp-objdir files to their final names.
     +    3b. When updating the index and/or refs, we assume that Git will issue
             another fsync internal to that operation. This is not the default
             today, but the user now has the option of syncing the index and there
             is a separate patch series to implement syncing of refs.
     @@ Commit message
          operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
          we would expect the fsync to trigger a journal writeout so that this
          sequence is enough to ensure that the user's data is durable by the time
     -    the git command returns.
     +    the git command returns. This sequence also ensures that no object files
     +    appear in the main object store unless they are fsync-durable.
      
     -    Batch mode is only enabled if core.fsyncObjectFiles is false or unset.
     +    Batch mode is only enabled if core.fsync includes loose-objects. If
     +    the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
     +    not include loose-objects, we will use file-by-file fsyncing.
     +
     +    In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
     +    work if no loose objects are ever added to the ODB. We use a tmp-objdir
     +    to maintain the invariant that no loose-objects are visible in the main
     +    ODB unless they are properly fsync-durable. This is important since
     +    future ODB operations that try to create an object with specific
     +    contents will silently drop the new data if an object with the target
     +    hash exists without checking that the loose-object contents match the
     +    hash. Only a full git-fsck would restore the ODB to a functional state
     +    where dataloss doesn't occur.
     +
     +    In step (1b) of the sequence, we issue a fsync against a dummy file
     +    created specifically for the purpose. This method has a little higher
     +    cost than using one of the input object files, but makes adding new
     +    callers of this mechanism easier, since we don't need to figure out
     +    which object file is "last" or risk sharing violations by caching the fd
     +    of the last object file.
      
          _Performance numbers_:
      
     @@ Documentation/config/core.txt: core.fsyncMethod::
      +* `batch` enables a mode that uses writeout-only flushes to stage multiple
      +  updates in the disk writeback cache and then does a single full fsync of
      +  a dummy file to trigger the disk cache flush at the end of the operation.
     +++
      +  Currently `batch` mode only applies to loose-object files. Other repository
      +  data is made durable as if `fsync` was specified. This mode is expected to
      +  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
     @@ bulk-checkin.c
       #include "object-store.h"
       
       static int bulk_checkin_plugged;
     -+static int needs_batch_fsync;
     -+
     -+static struct tmp_objdir *bulk_fsync_objdir;
       
     ++static struct tmp_objdir *bulk_fsync_objdir;
     ++
       static struct bulk_checkin_state {
       	char *pack_tmp_name;
     + 	struct hashfile *f;
      @@ bulk-checkin.c: clear_exit:
       	reprepare_packed_git(the_repository);
       }
     @@ bulk-checkin.c: clear_exit:
      + */
      +static void do_batch_fsync(void)
      +{
     ++	struct strbuf temp_path = STRBUF_INIT;
     ++	struct tempfile *temp;
     ++
     ++	if (!bulk_fsync_objdir)
     ++		return;
     ++
      +	/*
      +	 * Issue a full hardware flush against a temporary file to ensure
     -+	 * that all objects are durable before any renames occur.  The code in
     ++	 * that all objects are durable before any renames occur. The code in
      +	 * fsync_loose_object_bulk_checkin has already issued a writeout
      +	 * request, but it has not flushed any writeback cache in the storage
     -+	 * hardware.
     ++	 * hardware or any filesystem logs. This fsync call acts as a barrier
     ++	 * to ensure that the data in each new object file is durable before
     ++	 * the final name is visible.
      +	 */
     ++	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
     ++	temp = xmks_tempfile(temp_path.buf);
     ++	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
     ++	delete_tempfile(&temp);
     ++	strbuf_release(&temp_path);
      +
     -+	if (needs_batch_fsync) {
     -+		struct strbuf temp_path = STRBUF_INIT;
     -+		struct tempfile *temp;
     -+
     -+		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
     -+		temp = xmks_tempfile(temp_path.buf);
     -+		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
     -+		delete_tempfile(&temp);
     -+		strbuf_release(&temp_path);
     -+		needs_batch_fsync = 0;
     -+	}
     -+
     -+	if (bulk_fsync_objdir) {
     -+		tmp_objdir_migrate(bulk_fsync_objdir);
     -+		bulk_fsync_objdir = NULL;
     -+	}
     ++	/*
     ++	 * Make the object files visible in the primary ODB after their data is
     ++	 * fully durable.
     ++	 */
     ++	tmp_objdir_migrate(bulk_fsync_objdir);
     ++	bulk_fsync_objdir = NULL;
      +}
      +
       static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
     @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
       	return 0;
       }
       
     -+void fsync_loose_object_bulk_checkin(int fd)
     ++void prepare_loose_object_bulk_checkin(void)
     ++{
     ++	/*
     ++	 * We lazily create the temporary object directory
     ++	 * the first time an object might be added, since
     ++	 * callers may not know whether any objects will be
     ++	 * added at the time they call begin_odb_transaction.
     ++	 */
     ++	if (!bulk_checkin_plugged || bulk_fsync_objdir)
     ++		return;
     ++
     ++	bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
     ++	if (bulk_fsync_objdir)
     ++		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
     ++}
     ++
     ++void fsync_loose_object_bulk_checkin(int fd, const char *filename)
      +{
      +	/*
      +	 * If we have a plugged bulk checkin, we issue a call that
     @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
      +	 * command. Later on we will issue a single hardware flush
      +	 * before as part of do_batch_fsync.
      +	 */
     -+	if (bulk_checkin_plugged &&
     -+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
     -+		assert(bulk_fsync_objdir);
     -+		if (!needs_batch_fsync)
     -+			needs_batch_fsync = 1;
     -+	} else {
     -+		fsync_or_die(fd, "loose object file");
     ++	if (!bulk_fsync_objdir ||
     ++	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
     ++		fsync_or_die(fd, filename);
      +	}
      +}
      +
       int index_bulk_checkin(struct object_id *oid,
       		       int fd, size_t size, enum object_type type,
       		       const char *path, unsigned flags)
     -@@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
     - void plug_bulk_checkin(void)
     - {
     - 	assert(!bulk_checkin_plugged);
     -+
     -+	/*
     -+	 * A temporary object directory is used to hold the files
     -+	 * while they are not fsynced.
     -+	 */
     -+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) {
     -+		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
     -+		if (!bulk_fsync_objdir)
     -+			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
     -+
     -+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
     -+	}
     -+
     - 	bulk_checkin_plugged = 1;
     - }
     - 
     -@@ bulk-checkin.c: void unplug_bulk_checkin(void)
     +@@ bulk-checkin.c: void end_odb_transaction(void)
       	bulk_checkin_plugged = 0;
       	if (bulk_checkin_state.f)
       		finish_bulk_checkin(&bulk_checkin_state);
     @@ bulk-checkin.h
       
       #include "cache.h"
       
     -+void fsync_loose_object_bulk_checkin(int fd);
     ++void prepare_loose_object_bulk_checkin(void);
     ++void fsync_loose_object_bulk_checkin(int fd, const char *filename);
      +
       int index_bulk_checkin(struct object_id *oid,
       		       int fd, size_t size, enum object_type type,
     @@ cache.h: extern int use_fsync;
       	FSYNC_METHOD_FSYNC,
      -	FSYNC_METHOD_WRITEOUT_ONLY
      +	FSYNC_METHOD_WRITEOUT_ONLY,
     -+	FSYNC_METHOD_BATCH
     ++	FSYNC_METHOD_BATCH,
       };
       
       extern enum fsync_method fsync_method;
     @@ config.c: static int git_default_core_config(const char *var, const char *value,
       
      
       ## object-file.c ##
     -@@ object-file.c: static void close_loose_object(int fd)
     +@@ object-file.c: static void close_loose_object(int fd, const char *filename)
     + 	if (the_repository->objects->odb->will_destroy)
     + 		goto out;
       
     - 	if (fsync_object_files > 0)
     - 		fsync_or_die(fd, "loose object file");
     -+	else if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
     -+		fsync_loose_object_bulk_checkin(fd);
     +-	if (fsync_object_files > 0)
     ++	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
     ++		fsync_loose_object_bulk_checkin(fd, filename);
     ++	else if (fsync_object_files > 0)
     + 		fsync_or_die(fd, filename);
       	else
       		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
     - 				       "loose object file");
     +@@ object-file.c: static int write_loose_object(const struct object_id *oid, char *hdr,
     + 	static struct strbuf tmp_file = STRBUF_INIT;
     + 	static struct strbuf filename = STRBUF_INIT;
     + 
     ++	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
     ++		prepare_loose_object_bulk_checkin();
     ++
     + 	loose_object_path(the_repository, &filename, oid);
     + 
     + 	fd = create_tmpfile(&tmp_file, filename.buf);
  3:  54797dbc520 !  5:  913ce1b3df9 update-index: use the bulk-checkin infrastructure
     @@ Commit message
          The update-index functionality is used internally by 'git stash push' to
          setup the internal stashed commit.
      
     -    This change enables bulk-checkin for update-index infrastructure to
     +    This change enables odb-transactions for update-index infrastructure to
          speed up adding new objects to the object database by leveraging the
          batch fsync functionality.
      
          There is some risk with this change, since under batch fsync, the object
     -    files will be in a tmp-objdir until update-index is complete.  This
     -    usage is unlikely, since any tool invoking update-index and expecting to
     -    see objects would have to synchronize with the update-index process
     -    after passing it a file path.
     +    files will be in a tmp-objdir until update-index is complete, so callers
     +    using the --stdin option will not see them until update-index is done.
     +    This risk is mitigated by unplugging the batch when reporting verbose
     +    output, which is the only way a --stdin caller might synchronize with
     +    the addition of an object.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ builtin/update-index.c
       #include "config.h"
       #include "lockfile.h"
       #include "quote.h"
     -@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
     +@@ builtin/update-index.c: static int allow_replace;
     + static int info_only;
     + static int force_remove;
     + static int verbose;
     ++static int odb_transaction_active;
     + static int mark_valid_only;
     + static int mark_skip_worktree_only;
     + static int mark_fsmonitor_only;
     +@@ builtin/update-index.c: enum uc_mode {
     + 	UC_FORCE
     + };
       
     - 	the_index.updated_skipworktree = 1;
     ++static void end_odb_transaction_if_active(void)
     ++{
     ++	if (!odb_transaction_active)
     ++		return;
     ++
     ++	end_odb_transaction();
     ++	odb_transaction_active = 0;
     ++}
     ++
     + __attribute__((format (printf, 1, 2)))
     + static void report(const char *fmt, ...)
     + {
     +@@ builtin/update-index.c: static void report(const char *fmt, ...)
     + 	if (!verbose)
     + 		return;
       
     -+	/* we might be adding many objects to the object database */
     -+	plug_bulk_checkin();
     ++	/*
     ++	 * It is possible, though unlikely, that a caller
     ++	 * could use the verbose output to synchronize with
     ++	 * addition of objects to the object database, so
     ++	 * unplug bulk checkin to make sure that future objects
     ++	 * are immediately visible.
     ++	 */
     ++
     ++	end_odb_transaction_if_active();
      +
     - 	/*
     - 	 * Custom copy of parse_options() because we want to handle
     - 	 * filename arguments as they come.
     + 	va_start(vp, fmt);
     + 	vprintf(fmt, vp);
     + 	putchar('\n');
     +@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
     + 	 */
     + 	parse_options_start(&ctx, argc, argv, prefix,
     + 			    options, PARSE_OPT_STOP_AT_NON_OPTION);
     ++
     ++	/*
     ++	 * Allow the object layer to optimize adding multiple objects in
     ++	 * a batch.
     ++	 */
     ++	begin_odb_transaction();
     ++	odb_transaction_active = 1;
     + 	while (ctx.argc) {
     + 		if (parseopt_state != PARSE_OPT_DONE)
     + 			parseopt_state = parse_options_step(&ctx, options,
      @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
       		strbuf_release(&buf);
       	}
       
     -+	/* by now we must have added all of the new objects */
     -+	unplug_bulk_checkin();
     ++	/*
     ++	 * By now we have added all of the new objects
     ++	 */
     ++	end_odb_transaction_if_active();
     ++
       	if (split_index > 0) {
       		if (git_config_get_split_index() == 0)
       			warning(_("core.splitIndex is set to false; "
  4:  6662e2dae0f !  6:  84fd144ef18 unpack-objects: use the bulk-checkin infrastructure
     @@ Commit message
          to turn the transferred data into object database entries when there are
          fewer objects than the 'unpacklimit' setting.
      
     -    By enabling bulk-checkin when unpacking objects, we can take advantage
     +    By enabling an odb-transaction when unpacking objects, we can take advantage
          of batched fsyncs.
      
     +    Here are some performance numbers to justify batch mode for
     +    unpack-objects, collected on a WSL2 Ubuntu VM.
     +
     +    Fsync Mode | Time for 90 objects (ms)
     +    -------------------------------------
     +           Off | 170
     +      On,fsync | 760
     +      On,batch | 230
     +
     +    Note that the default unpackLimit is 100 objects, so there's a 3x
     +    benefit in the worst case. The non-batch mode fsync scales linearly
     +    with the number of objects, so there are significant benefits even with
     +    smaller numbers of objects.
     +
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
       ## builtin/unpack-objects.c ##
     @@ builtin/unpack-objects.c: static void unpack_all(void)
       	if (!quiet)
       		progress = start_progress(_("Unpacking objects"), nr_objects);
       	CALLOC_ARRAY(obj_list, nr_objects);
     -+	plug_bulk_checkin();
     ++	begin_odb_transaction();
       	for (i = 0; i < nr_objects; i++) {
       		unpack_one(i);
       		display_progress(progress, i + 1);
       	}
     -+	unplug_bulk_checkin();
     ++	end_odb_transaction();
       	stop_progress(&progress);
       
       	if (delta_list)
  5:  03bf591742a !  7:  447263e8ef1 core.fsync: use batch mode and sync loose objects by default on Windows
     @@ Commit message
          in upstream Git so that we can get broad coverage of the new code
          upstream.
      
     -    We don't actually do fsyncs in the test suite, since GIT_TEST_FSYNC is
     -    set to 0. However, we do exercise all of the surrounding batch mode code
     -    since GIT_TEST_FSYNC merely makes the maybe_fsync wrapper always appear
     -    to succeed.
     +    We don't actually do fsyncs in most of the test suite, since
     +    GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
     +    surrounding batch mode code since GIT_TEST_FSYNC merely makes the
     +    maybe_fsync wrapper always appear to succeed.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
  -:  ----------- >  8:  8f1b01c9ca0 test-lib-functions: add parsing helpers for ls-files and ls-tree
  6:  1937746df47 !  9:  b5f371e97fe core.fsyncmethod: tests for batch mode
     @@ Commit message
          In this change we introduce a new test helper lib-unique-files.sh. The
          goal of this library is to create a tree of files that have different
          oids from any other files that may have been created in the current test
     -    repo. This helps us avoid missing validation of an object being added due
     -    to it already being in the repo.
     -
     -    We aren't actually issuing any fsyncs in these tests, since
     -    GIT_TEST_FSYNC is 0, but we still exercise all of the tmp_objdir logic
     -    in bulk-checkin.
     +    repo. This helps us avoid missing validation of an object being added
     +    due to it already being in the repo.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ t/lib-unique-files.sh (new)
      @@
      +# Helper to create files with unique contents
      +
     -+
     -+# Create multiple files with unique contents. Takes the number of
     -+# directories, the number of files in each directory, and the base
     ++# Create multiple files with unique contents within this test run. Takes the
     ++# number of directories, the number of files in each directory, and the base
      +# directory.
      +#
      +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
     -+#					 each in my_dir, all with unique
     -+#					 contents.
     ++#					 each in my_dir, all with contents
     ++#					 different from previous invocations
     ++#					 of this command in this run.
      +
     -+test_create_unique_files() {
     ++test_create_unique_files () {
      +	test "$#" -ne 3 && BUG "3 param"
      +
     -+	local dirs=$1
     -+	local files=$2
     -+	local basedir=$3
     -+	local counter=0
     -+	test_tick
     -+	local basedata=$test_tick
     -+
     -+
     -+	rm -rf $basedir
     -+
     ++	local dirs="$1" &&
     ++	local files="$2" &&
     ++	local basedir="$3" &&
     ++	local counter=0 &&
     ++	test_tick &&
     ++	local basedata=$basedir$test_tick &&
     ++	rm -rf "$basedir" &&
      +	for i in $(test_seq $dirs)
      +	do
     -+		local dir=$basedir/dir$i
     -+
     -+		mkdir -p "$dir"
     ++		local dir=$basedir/dir$i &&
     ++		mkdir -p "$dir" &&
      +		for j in $(test_seq $files)
      +		do
     -+			counter=$((counter + 1))
     -+			echo "$basedata.$counter"  >"$dir/file$j.txt"
     ++			counter=$((counter + 1)) &&
     ++			echo "$basedata.$counter">"$dir/file$j.txt"
      +		done
      +	done
      +}
     @@ t/t3700-add.sh: test_expect_success \
      +BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
      +
      +test_expect_success 'git add: core.fsyncmethod=batch' "
     -+	test_create_unique_files 2 4 fsync-files &&
     -+	git $BATCH_CONFIGURATION add -- ./fsync-files/ &&
     -+	rm -f fsynced_files &&
     -+	git ls-files --stage fsync-files/ > fsynced_files &&
     -+	test_line_count = 8 fsynced_files &&
     -+	awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e
     ++	test_create_unique_files 2 4 files_base_dir1 &&
     ++	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION add -- ./files_base_dir1/ &&
     ++	git ls-files --stage files_base_dir1/ |
     ++	test_parse_ls_files_stage_oids >added_files_oids &&
     ++
     ++	# We created 2 subdirs with 4 files each (8 files total) above
     ++	test_line_count = 8 added_files_oids &&
     ++	git cat-file --batch-check='%(objectname)' <added_files_oids >added_files_actual &&
     ++	test_cmp added_files_oids added_files_actual
      +"
      +
      +test_expect_success 'git update-index: core.fsyncmethod=batch' "
     -+	test_create_unique_files 2 4 fsync-files2 &&
     -+	find fsync-files2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
     -+	rm -f fsynced_files2 &&
     -+	git ls-files --stage fsync-files2/ > fsynced_files2 &&
     -+	test_line_count = 8 fsynced_files2 &&
     -+	awk -- '{print \$2}' fsynced_files2 | xargs -n1 git cat-file -e
     ++	test_create_unique_files 2 4 files_base_dir2 &&
     ++	find files_base_dir2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
     ++	git ls-files --stage files_base_dir2 |
     ++	test_parse_ls_files_stage_oids >added_files2_oids &&
     ++
     ++	# We created 2 subdirs with 4 files each (8 files total) above
     ++	test_line_count = 8 added_files2_oids &&
     ++	git cat-file --batch-check='%(objectname)' <added_files2_oids >added_files2_actual &&
     ++	test_cmp added_files2_oids added_files2_actual
      +"
      +
       test_expect_success \
     @@ t/t3903-stash.sh: test_expect_success 'stash handles skip-worktree entries nicel
      +BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
      +
      +test_expect_success 'stash with core.fsyncmethod=batch' "
     -+	test_create_unique_files 2 4 fsync-files &&
     -+	git $BATCH_CONFIGURATION stash push -u -- ./fsync-files/ &&
     -+	rm -f fsynced_files &&
     ++	test_create_unique_files 2 4 files_base_dir &&
     ++	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION stash push -u -- ./files_base_dir/ &&
      +
      +	# The files were untracked, so use the third parent,
      +	# which contains the untracked files
     -+	git ls-tree -r stash^3 -- ./fsync-files/ > fsynced_files &&
     -+	test_line_count = 8 fsynced_files &&
     -+	awk -- '{print \$3}' fsynced_files | xargs -n1 git cat-file -e
     ++	git ls-tree -r stash^3 -- ./files_base_dir/ |
     ++	test_parse_ls_tree_oids >stashed_files_oids &&
     ++
     ++	# We created 2 dirs with 4 files each (8 files total) above
     ++	test_line_count = 8 stashed_files_oids &&
     ++	git cat-file --batch-check='%(objectname)' <stashed_files_oids >stashed_files_actual &&
     ++	test_cmp stashed_files_oids stashed_files_actual
      +"
      +
      +
     @@ t/t3903-stash.sh: test_expect_success 'stash handles skip-worktree entries nicel
      
       ## t/t5300-pack-object.sh ##
      @@ t/t5300-pack-object.sh: test_expect_success 'pack-objects with bogus arguments' '
     + '
       
       check_unpack () {
     ++	local packname="$1" &&
     ++	local object_list="$2" &&
     ++	local git_config="$3" &&
       	test_when_finished "rm -rf git2" &&
      -	git init --bare git2 &&
      -	git -C git2 unpack-objects -n <"$1".pack &&
     @@ t/t5300-pack-object.sh: test_expect_success 'pack-objects with bogus arguments'
      -			return 1
      -		}
      -	done
     -+	git $2 init --bare git2 &&
     ++	git $git_config init --bare git2 &&
      +	(
     -+		git $2 -C git2 unpack-objects -n <"$1".pack &&
     -+		git $2 -C git2 unpack-objects <"$1".pack &&
     -+		git $2 -C git2 cat-file --batch-check="%(objectname)"
     -+	) <obj-list >current &&
     -+	cmp obj-list current
     ++		git $git_config -C git2 unpack-objects -n <"$packname".pack &&
     ++		git $git_config -C git2 unpack-objects <"$packname".pack &&
     ++		git $git_config -C git2 cat-file --batch-check="%(objectname)"
     ++	) <"$object_list" >current &&
     ++	cmp "$object_list" current
       }
       
       test_expect_success 'unpack without delta' '
     - 	check_unpack test-1-${packname_1}
     - '
     - 
     +-	check_unpack test-1-${packname_1}
     ++	check_unpack test-1-${packname_1} obj-list
     ++'
     ++
      +BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
      +
      +test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
     -+	check_unpack test-1-${packname_1} "$BATCH_CONFIGURATION"
     -+'
     -+
     ++	check_unpack test-1-${packname_1} obj-list "$BATCH_CONFIGURATION"
     + '
     + 
       test_expect_success 'pack with REF_DELTA' '
     - 	packname_2=$(git pack-objects --progress test-2 <obj-list 2>stderr) &&
     - 	check_deltas stderr -gt 0
     -@@ t/t5300-pack-object.sh: test_expect_success 'unpack with REF_DELTA' '
     - 	check_unpack test-2-${packname_2}
     +@@ t/t5300-pack-object.sh: test_expect_success 'pack with REF_DELTA' '
       '
       
     -+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
     -+       check_unpack test-2-${packname_2} "$BATCH_CONFIGURATION"
     + test_expect_success 'unpack with REF_DELTA' '
     +-	check_unpack test-2-${packname_2}
     ++	check_unpack test-2-${packname_2} obj-list
      +'
      +
     ++test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
     ++       check_unpack test-2-${packname_2} obj-list "$BATCH_CONFIGURATION"
     + '
     + 
       test_expect_success 'pack with OFS_DELTA' '
     - 	packname_3=$(git pack-objects --progress --delta-base-offset test-3 \
     - 			<obj-list 2>stderr) &&
     -@@ t/t5300-pack-object.sh: test_expect_success 'unpack with OFS_DELTA' '
     - 	check_unpack test-3-${packname_3}
     +@@ t/t5300-pack-object.sh: test_expect_success 'pack with OFS_DELTA' '
       '
       
     -+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
     -+       check_unpack test-3-${packname_3} "$BATCH_CONFIGURATION"
     + test_expect_success 'unpack with OFS_DELTA' '
     +-	check_unpack test-3-${packname_3}
     ++	check_unpack test-3-${packname_3} obj-list
      +'
      +
     ++test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
     ++       check_unpack test-3-${packname_3} obj-list "$BATCH_CONFIGURATION"
     + '
     + 
       test_expect_success 'compare delta flavors' '
     - 	perl -e '\''
     - 		defined($_ = -s $_) or die for @ARGV;
  7:  624244078c7 ! 10:  b99b32a469c core.fsyncmethod: performance tests for add and stash
     @@ Commit message
      
          Add basic performance tests for "git add" and "git stash" of a lot of
          new objects with various fsync settings. This shows the benefit of batch
     -    mode relative to an ordinary stash command.
     +    mode relative to full fsync.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ t/perf/p3900-stash.sh (new)
      +done
      +
      +test_done
     -
     - ## t/perf/perf-lib.sh ##
     -@@ t/perf/perf-lib.sh: test_perf_create_repo_from () {
     - 	mkdir -p "$repo/.git"
     - 	(
     - 		cd "$source" &&
     --		{ cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null ||
     --			cp -R "$objects_dir" "$repo/.git/"; } &&
     -+		{ cp -Rl "$objects_dir" "$repo/.git/" ||
     -+			cp -R "$objects_dir" "$repo/.git/" 2>/dev/null;} &&
     - 
     - 		# common_dir must come first here, since we want source_git to
     - 		# take precedence and overwrite any overlapping files
  -:  ----------- > 11:  6b832e89bc4 core.fsyncmethod: correctly camel-case warning message

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
  2022-03-24  4:58     ` [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
                       ` (11 subsequent siblings)
  12 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Make it clearer in the naming and documentation of the plug_bulk_checkin
and unplug_bulk_checkin APIs that they can be thought of as
a "transaction" to optimize operations on the object database.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/add.c  |  4 ++--
 bulk-checkin.c |  4 ++--
 bulk-checkin.h | 14 ++++++++++++--
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 3ffb86a4338..9bf37ceae8e 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -670,7 +670,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		string_list_clear(&only_match_skip_worktree, 0);
 	}
 
-	plug_bulk_checkin();
+	begin_odb_transaction();
 
 	if (add_renormalize)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
@@ -682,7 +682,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (chmod_arg && pathspec.nr)
 		exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
-	unplug_bulk_checkin();
+	end_odb_transaction();
 
 finish:
 	if (write_locked_index(&the_index, &lock_file,
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6d6c37171c9..a16ae3c629d 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -285,12 +285,12 @@ int index_bulk_checkin(struct object_id *oid,
 	return status;
 }
 
-void plug_bulk_checkin(void)
+void begin_odb_transaction(void)
 {
 	state.plugged = 1;
 }
 
-void unplug_bulk_checkin(void)
+void end_odb_transaction(void)
 {
 	state.plugged = 0;
 	if (state.f)
diff --git a/bulk-checkin.h b/bulk-checkin.h
index b26f3dc3b74..69a94422ac7 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -10,7 +10,17 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
 
-void plug_bulk_checkin(void);
-void unplug_bulk_checkin(void);
+/*
+ * Tell the object database to optimize for adding
+ * multiple objects. end_odb_transaction must be called
+ * to make new objects visible.
+ */
+void begin_odb_transaction(void);
+
+/*
+ * Tell the object database to make any objects from the
+ * current transaction visible.
+ */
+void end_odb_transaction(void);
 
 #endif
-- 
gitgitgadget



* [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 03/11] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

This commit prepares for adding batch-fsync to the bulk-checkin
infrastructure.

The bulk-checkin infrastructure is currently used to batch up addition
of large blobs to a packfile. When a blob is larger than
big_file_threshold, we unconditionally add it to a pack. If bulk
checkins are 'plugged', we allow multiple large blobs to be added to a
single pack until we reach the packfile size limit; otherwise, we simply
make a new packfile for each large blob. The 'unplug' call tells us when
the series of blob additions is done so that we can finish the packfiles
and make their objects available to subsequent operations.

Stated another way, bulk-checkin allows callers to define a transaction
that adds multiple objects to the object database, where the object
database can optimize its internal operations within the transaction
boundary.

Batched fsync will fit into bulk-checkin by taking advantage of the
plug/unplug functionality to determine the appropriate time to fsync
and make newly-added objects available in the primary object database.

* Rename 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
  find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 bulk-checkin.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index a16ae3c629d..ffe142841b2 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,9 +10,9 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static struct bulk_checkin_state {
-	unsigned plugged:1;
+static int bulk_checkin_plugged;
 
+static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
 	off_t offset;
@@ -21,7 +21,7 @@ static struct bulk_checkin_state {
 	struct pack_idx_entry **written;
 	uint32_t alloc_written;
 	uint32_t nr_written;
-} state;
+} bulk_checkin_state;
 
 static void finish_tmp_packfile(struct strbuf *basename,
 				const char *pack_tmp_name,
@@ -278,21 +278,23 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&state, oid, fd, size, type,
+	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!state.plugged)
-		finish_bulk_checkin(&state);
+	if (!bulk_checkin_plugged)
+		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
 void begin_odb_transaction(void)
 {
-	state.plugged = 1;
+	assert(!bulk_checkin_plugged);
+	bulk_checkin_plugged = 1;
 }
 
 void end_odb_transaction(void)
 {
-	state.plugged = 0;
-	if (state.f)
-		finish_bulk_checkin(&state);
+	assert(bulk_checkin_plugged);
+	bulk_checkin_plugged = 0;
+	if (bulk_checkin_state.f)
+		finish_bulk_checkin(&bulk_checkin_state);
 }
-- 
gitgitgadget



* [PATCH v3 03/11] object-file: pass filename to fsync_or_die
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

If we die while trying to fsync a loose object file, pass the actual
filename we're trying to sync. This is likely to be more helpful for a
user trying to diagnose the cause of the failure than the former
'loose object file' string. It also sidesteps any concerns about
translating the die message differently for loose objects versus
something else that has a real path.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 object-file.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/object-file.c b/object-file.c
index b254bc50d70..5ffbf3d4fd4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1888,16 +1888,16 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_loose_object(int fd)
+static void close_loose_object(int fd, const char *filename)
 {
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
 	if (fsync_object_files > 0)
-		fsync_or_die(fd, "loose object file");
+		fsync_or_die(fd, filename);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
-				       "loose object file");
+				       filename);
 
 out:
 	if (close(fd) != 0)
@@ -2011,7 +2011,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd);
+	close_loose_object(fd, tmp_file.buf);
 
 	if (mtime) {
 		struct utimbuf utb;
-- 
gitgitgadget



* [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 03/11] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
                       ` (8 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin odb-transaction functionality, takes
advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1a. Create the object in a tmp-objdir.
2a. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1b. Issue an fsync against a dummy file to flush the log and hardware
   writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc.), such as NTFS, HFS+, or XFS,
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns. This sequence also ensures that no object files
appear in the main object store unless they are fsync-durable.

Batch mode is only enabled if core.fsync includes loose-objects. If
the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
not include loose-objects, we will use file-by-file fsyncing.

In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
work if no loose objects are ever added to the ODB. We use a tmp-objdir
to maintain the invariant that no loose objects are visible in the main
ODB unless they are properly fsync-durable. This is important since
future ODB operations that try to create an object with specific
contents will silently drop the new data if an object with the target
hash already exists, without checking that the existing loose-object
contents match the hash. Only a full git-fsck would restore the ODB to a
functional state where data loss doesn't occur.

In step (1b) of the sequence, we issue an fsync against a dummy file
created specifically for this purpose. This method has a slightly higher
cost than fsyncing one of the input object files, but it makes adding
new callers of this mechanism easier, since we don't need to figure out
which object file is "last" or risk sharing violations by caching the fd
of the last object file.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53
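A rough reproduction of the benchmark above might look like the following (an illustrative sketch; the file count and contents are arbitrary, and absolute times will vary by hardware and filesystem):

```shell
# Create a repo with 500 small files and time 'git add' under batch mode.
dir=$(mktemp -d)
git init -q "$dir"
cd "$dir"
for i in $(seq 1 500)
do
	echo "content $i" >"file$i.txt"
done
time git -c core.fsync=loose-object -c core.fsyncMethod=batch add .
```

Comparing against `-c core.fsyncMethod=fsync` (and against `-c core.fsync=none`) on the same files gives the three rows of the table.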

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 Documentation/config/core.txt |  8 ++++
 bulk-checkin.c                | 71 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  3 ++
 cache.h                       |  8 +++-
 config.c                      |  2 +
 object-file.c                 |  7 +++-
 6 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 9da3e5d88f6..3c90ba0b395 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -596,6 +596,14 @@ core.fsyncMethod::
 * `writeout-only` issues pagecache writeback requests, but depending on the
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
+* `batch` enables a mode that uses writeout-only flushes to stage multiple
+  updates in the disk writeback cache and then does a single full fsync of
+  a dummy file to trigger the disk cache flush at the end of the operation.
++
+  Currently `batch` mode only applies to loose-object files. Other repository
+  data is made durable as if `fsync` was specified. This mode is expected to
+  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
+  and on Windows for repos stored on NTFS or ReFS filesystems.
 
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
diff --git a/bulk-checkin.c b/bulk-checkin.c
index ffe142841b2..a5c40a08b8d 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,15 +3,20 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
+#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "string-list.h"
+#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
 
+static struct tmp_objdir *bulk_fsync_objdir;
+
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
@@ -80,6 +85,40 @@ clear_exit:
 	reprepare_packed_git(the_repository);
 }
 
+/*
+ * Cleanup after batch-mode fsync_object_files.
+ */
+static void do_batch_fsync(void)
+{
+	struct strbuf temp_path = STRBUF_INIT;
+	struct tempfile *temp;
+
+	if (!bulk_fsync_objdir)
+		return;
+
+	/*
+	 * Issue a full hardware flush against a temporary file to ensure
+	 * that all objects are durable before any renames occur. The code in
+	 * fsync_loose_object_bulk_checkin has already issued a writeout
+	 * request, but it has not flushed any writeback cache in the storage
+	 * hardware or any filesystem logs. This fsync call acts as a barrier
+	 * to ensure that the data in each new object file is durable before
+	 * the final name is visible.
+	 */
+	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
+	temp = xmks_tempfile(temp_path.buf);
+	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
+	delete_tempfile(&temp);
+	strbuf_release(&temp_path);
+
+	/*
+	 * Make the object files visible in the primary ODB after their data is
+	 * fully durable.
+	 */
+	tmp_objdir_migrate(bulk_fsync_objdir);
+	bulk_fsync_objdir = NULL;
+}
+
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -274,6 +313,36 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
+void prepare_loose_object_bulk_checkin(void)
+{
+	/*
+	 * We lazily create the temporary object directory
+	 * the first time an object might be added, since
+	 * callers may not know whether any objects will be
+	 * added at the time they call begin_odb_transaction.
+	 */
+	if (!bulk_checkin_plugged || bulk_fsync_objdir)
+		return;
+
+	bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
+	if (bulk_fsync_objdir)
+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
+}
+
+void fsync_loose_object_bulk_checkin(int fd, const char *filename)
+{
+	/*
+	 * If we have a plugged bulk checkin, we issue a call that
+	 * cleans the filesystem page cache but avoids a hardware flush
+	 * command. Later on we will issue a single hardware flush
+	 * as part of do_batch_fsync.
+	 */
+	if (!bulk_fsync_objdir ||
+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
+		fsync_or_die(fd, filename);
+	}
+}
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -297,4 +366,6 @@ void end_odb_transaction(void)
 	bulk_checkin_plugged = 0;
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
+
+	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index 69a94422ac7..70edf745be8 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,6 +6,9 @@
 
 #include "cache.h"
 
+void prepare_loose_object_bulk_checkin(void);
+void fsync_loose_object_bulk_checkin(int fd, const char *filename);
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index ef7d34b7a09..a5bf15a5131 100644
--- a/cache.h
+++ b/cache.h
@@ -1040,7 +1040,8 @@ extern int use_fsync;
 
 enum fsync_method {
 	FSYNC_METHOD_FSYNC,
-	FSYNC_METHOD_WRITEOUT_ONLY
+	FSYNC_METHOD_WRITEOUT_ONLY,
+	FSYNC_METHOD_BATCH,
 };
 
 extern enum fsync_method fsync_method;
@@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *);
 int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
+static inline int batch_fsync_enabled(enum fsync_component component)
+{
+	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/config.c b/config.c
index 3c9b6b589ab..511f4584eeb 100644
--- a/config.c
+++ b/config.c
@@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 			fsync_method = FSYNC_METHOD_FSYNC;
 		else if (!strcmp(value, "writeout-only"))
 			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
+		else if (!strcmp(value, "batch"))
+			fsync_method = FSYNC_METHOD_BATCH;
 		else
 			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
 
diff --git a/object-file.c b/object-file.c
index 5ffbf3d4fd4..d2e0c13198f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1893,7 +1893,9 @@ static void close_loose_object(int fd, const char *filename)
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
-	if (fsync_object_files > 0)
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		fsync_loose_object_bulk_checkin(fd, filename);
+	else if (fsync_object_files > 0)
 		fsync_or_die(fd, filename);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
@@ -1961,6 +1963,9 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	static struct strbuf filename = STRBUF_INIT;
 
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		prepare_loose_object_bulk_checkin();
+
 	loose_object_path(the_repository, &filename, oid);
 
 	fd = create_tmpfile(&tmp_file, filename.buf);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24 18:18       ` Junio C Hamano
  2022-03-24  4:58     ` [PATCH v3 06/11] unpack-objects: " Neeraj Singh via GitGitGadget
                       ` (7 subsequent siblings)
  12 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.

This change enables odb-transactions for the update-index
infrastructure to speed up adding new objects to the object database by
leveraging the batch fsync functionality.

There is some risk with this change: under batch fsync, the object
files remain in a tmp-objdir until update-index completes, so callers
using the --stdin option will not see them until then. This risk is
mitigated by unplugging the batch when reporting verbose output, which
is the only way a --stdin caller might synchronize with the addition of
an object.
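For example (a sketch; the repository location and file name are arbitrary), a caller feeding paths on stdin and relying on `--verbose` output would still see each object become visible as it is reported:

```shell
dir=$(mktemp -d)
git init -q "$dir"
cd "$dir"
echo hello >a.txt
# --verbose ends the odb transaction before each report, so the
# object for a.txt is visible in the ODB by the time the line
# "add 'a.txt'" is printed.
printf 'a.txt\n' | git update-index --add --verbose --stdin
```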

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/update-index.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index aafe7eeac2a..ae7887cfe37 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -5,6 +5,7 @@
  */
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "lockfile.h"
 #include "quote.h"
@@ -32,6 +33,7 @@ static int allow_replace;
 static int info_only;
 static int force_remove;
 static int verbose;
+static int odb_transaction_active;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
 static int mark_fsmonitor_only;
@@ -49,6 +51,15 @@ enum uc_mode {
 	UC_FORCE
 };
 
+static void end_odb_transaction_if_active(void)
+{
+	if (!odb_transaction_active)
+		return;
+
+	end_odb_transaction();
+	odb_transaction_active = 0;
+}
+
 __attribute__((format (printf, 1, 2)))
 static void report(const char *fmt, ...)
 {
@@ -57,6 +68,16 @@ static void report(const char *fmt, ...)
 	if (!verbose)
 		return;
 
+	/*
+	 * It is possible, though unlikely, that a caller
+	 * could use the verbose output to synchronize with
+	 * addition of objects to the object database, so
+	 * unplug bulk checkin to make sure that future objects
+	 * are immediately visible.
+	 */
+
+	end_odb_transaction_if_active();
+
 	va_start(vp, fmt);
 	vprintf(fmt, vp);
 	putchar('\n');
@@ -1116,6 +1137,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	 */
 	parse_options_start(&ctx, argc, argv, prefix,
 			    options, PARSE_OPT_STOP_AT_NON_OPTION);
+
+	/*
+	 * Allow the object layer to optimize adding multiple objects in
+	 * a batch.
+	 */
+	begin_odb_transaction();
+	odb_transaction_active = 1;
 	while (ctx.argc) {
 		if (parseopt_state != PARSE_OPT_DONE)
 			parseopt_state = parse_options_step(&ctx, options,
@@ -1190,6 +1218,11 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	/*
+	 * By now we have added all of the new objects
+	 */
+	end_odb_transaction_if_active();
+
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 06/11] unpack-objects: use the bulk-checkin infrastructure
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling an odb-transaction when unpacking objects, we can take advantage
of batched fsyncs.

Here are some performance numbers to justify batch mode for
unpack-objects, collected on a WSL2 Ubuntu VM.

Fsync Mode | Time for 90 objects (ms)
-------------------------------------
       Off | 170
  On,fsync | 760
  On,batch | 230

Note that the default unpackLimit is 100 objects, so there's a 3x
benefit in the worst case. The non-batch mode fsync scales linearly
with the number of objects, so there are significant benefits even with
smaller numbers of objects.
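The measurement can be approximated with something like the following (a sketch; the repository locations, identity values, and object count are illustrative, not the 90-object workload measured above):

```shell
# Build a small pack in one repo, then time unpacking it into another
# repo under batch mode.
src=$(mktemp -d)
dst=$(mktemp -d)
git init -q "$src"
git -C "$src" -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m seed
echo HEAD | git -C "$src" pack-objects --revs --stdout >small.pack
git init -q "$dst"
time git -C "$dst" -c core.fsync=loose-object -c core.fsyncMethod=batch \
	unpack-objects <small.pack
```

Swapping the method to `fsync` (or clearing `core.fsync`) on the same pack gives the other rows of the table.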

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dbeb0680a58..56d05e2725d 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	begin_odb_transaction();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	end_odb_transaction();
 	stop_progress(&progress);
 
 	if (delta_list)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 06/11] unpack-objects: " Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch mode
in upstream Git so that we can get broad coverage of the new code
upstream.

We don't actually do fsyncs in most of the test suite, since
GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
surrounding batch mode code since GIT_TEST_FSYNC merely makes the
maybe_fsync wrapper always appear to succeed.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 cache.h           | 4 ++++
 compat/mingw.h    | 3 +++
 config.c          | 2 +-
 git-compat-util.h | 2 ++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index a5bf15a5131..7f6cbb254b4 100644
--- a/cache.h
+++ b/cache.h
@@ -1031,6 +1031,10 @@ enum fsync_component {
 			      FSYNC_COMPONENT_INDEX | \
 			      FSYNC_COMPONENT_REFERENCE)
 
+#ifndef FSYNC_COMPONENTS_PLATFORM_DEFAULT
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT FSYNC_COMPONENTS_DEFAULT
+#endif
+
 /*
  * A bitmask indicating which components of the repo should be fsynced.
  */
diff --git a/compat/mingw.h b/compat/mingw.h
index 6074a3d3ced..afe30868c04 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -332,6 +332,9 @@ int mingw_getpagesize(void);
 int win32_fsync_no_flush(int fd);
 #define fsync_no_flush win32_fsync_no_flush
 
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT (FSYNC_COMPONENTS_DEFAULT | FSYNC_COMPONENT_LOOSE_OBJECT)
+#define FSYNC_METHOD_DEFAULT (FSYNC_METHOD_BATCH)
+
 struct rlimit {
 	unsigned int rlim_cur;
 };
diff --git a/config.c b/config.c
index 511f4584eeb..e9cac5f4707 100644
--- a/config.c
+++ b/config.c
@@ -1342,7 +1342,7 @@ static const struct fsync_component_name {
 
 static enum fsync_component parse_fsync_components(const char *var, const char *string)
 {
-	enum fsync_component current = FSYNC_COMPONENTS_DEFAULT;
+	enum fsync_component current = FSYNC_COMPONENTS_PLATFORM_DEFAULT;
 	enum fsync_component positive = 0, negative = 0;
 
 	while (string) {
diff --git a/git-compat-util.h b/git-compat-util.h
index 0892e209a2f..fffe42ce7c1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1257,11 +1257,13 @@ __attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...);
 #define BUG(...) BUG_fl(__FILE__, __LINE__, __VA_ARGS__)
 
+#ifndef FSYNC_METHOD_DEFAULT
 #ifdef __APPLE__
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_WRITEOUT_ONLY
 #else
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_FSYNC
 #endif
+#endif
 
 enum fsync_action {
 	FSYNC_WRITEOUT_ONLY,
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Several tests use awk to parse OIDs from the output of 'git ls-files
--stage' and 'git ls-tree'. Introduce helpers to centralize these uses
of awk.

Update t5317-pack-objects-filter-objects.sh to use the new ls-files
helper so that it has some usages to review. Other updates are left for
the future.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/t5317-pack-objects-filter-objects.sh | 91 +++++++++++++-------------
 t/test-lib-functions.sh                | 10 +++
 2 files changed, 54 insertions(+), 47 deletions(-)

diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index 33b740ce628..bb633c9b099 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -10,9 +10,6 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 # Test blob:none filter.
 
 test_expect_success 'setup r1' '
-	echo "{print \$1}" >print_1.awk &&
-	echo "{print \$2}" >print_2.awk &&
-
 	git init r1 &&
 	for n in 1 2 3 4 5
 	do
@@ -22,10 +19,13 @@ test_expect_success 'setup r1' '
 	done
 '
 
+parse_verify_pack_blob_oid () {
+	awk '{print $1}' -
+}
+
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r1 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -35,7 +35,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -54,12 +54,12 @@ test_expect_success 'verify blob:none packfile has no blobs' '
 test_expect_success 'verify normal and blob:none packfiles have same commits/trees' '
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >expected &&
 
 	git -C r1 verify-pack -v ../filter.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -123,8 +123,8 @@ test_expect_success 'setup r2' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -134,7 +134,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r2 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -161,8 +161,8 @@ test_expect_success 'verify blob:limit=1000' '
 '
 
 test_expect_success 'verify blob:limit=1001' '
-	git -C r2 ls-files -s large.1000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1001 >filter.pack <<-EOF &&
@@ -172,15 +172,15 @@ test_expect_success 'verify blob:limit=1001' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=10001' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=10001 >filter.pack <<-EOF &&
@@ -190,15 +190,15 @@ test_expect_success 'verify blob:limit=10001' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=1k' '
-	git -C r2 ls-files -s large.1000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1k >filter.pack <<-EOF &&
@@ -208,15 +208,15 @@ test_expect_success 'verify blob:limit=1k' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify explicitly specifying oversized blob in input' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	echo HEAD >objects &&
@@ -226,15 +226,15 @@ test_expect_success 'verify explicitly specifying oversized blob in input' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=1m' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1m >filter.pack <<-EOF &&
@@ -244,7 +244,7 @@ test_expect_success 'verify blob:limit=1m' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -253,12 +253,12 @@ test_expect_success 'verify blob:limit=1m' '
 test_expect_success 'verify normal and blob:limit packfiles have same commits/trees' '
 	git -C r2 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >expected &&
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -289,9 +289,8 @@ test_expect_success 'setup r3' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r3 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -301,7 +300,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r3 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -342,9 +341,8 @@ test_expect_success 'setup r4' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -354,19 +352,19 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r4 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify sparse:oid=OID' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 ls-files -s pattern >staged &&
-	oid=$(awk -f print_2.awk staged) &&
+	oid=$(test_parse_ls_files_stage_oids <staged) &&
 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=$oid >filter.pack <<-EOF &&
 	HEAD
 	EOF
@@ -374,15 +372,15 @@ test_expect_success 'verify sparse:oid=OID' '
 
 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify sparse:oid=oid-ish' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=main:pattern >filter.pack <<-EOF &&
@@ -392,7 +390,7 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 
 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -402,9 +400,8 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 # This models previously omitted objects that we did not receive.
 
 test_expect_success 'setup r1 - delete loose blobs' '
-	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	for id in `cat expected | sed "s|..|&/|"`
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index a027f0c409e..e6011409e2f 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1782,6 +1782,16 @@ test_oid_to_path () {
 	echo "${1%$basename}/$basename"
 }
 
+# Parse oids from git ls-files --staged output
+test_parse_ls_files_stage_oids () {
+	awk '{print $2}' -
+}
+
+# Parse oids from git ls-tree output
+test_parse_ls_tree_oids () {
+	awk '{print $3}' -
+}
+
 # Choose a port number based on the test script's number and store it in
 # the given variable name, unless that variable already contains a number.
 test_set_port () {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 09/11] core.fsyncmethod: tests for batch mode
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (7 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
  2022-03-24  4:58     ` [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
                       ` (3 subsequent siblings)
  12 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing the validation of an object being
added, which could happen if the object were already present in the repo.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/lib-unique-files.sh  | 32 ++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
 t/t3903-stash.sh       | 20 ++++++++++++++++++++
 t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
 4 files changed, 107 insertions(+), 14 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..74efca91dd7
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,32 @@
+# Helper to create files with unique contents
+
+# Create multiple files with unique contents within this test run. Takes the
+# number of directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with contents
+#					 different from previous invocations
+#					 of this command in this run.
+
+test_create_unique_files () {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs="$1" &&
+	local files="$2" &&
+	local basedir="$3" &&
+	local counter=0 &&
+	test_tick &&
+	local basedata=$basedir$test_tick &&
+	rm -rf "$basedir" &&
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i &&
+		mkdir -p "$dir" &&
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1)) &&
+			echo "$basedata.$counter">"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..8979c8a5f03 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,32 @@ test_expect_success \
     'Test that "git add -- -q" works' \
     'touch -- -q && git add -- -q'
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir1 &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION add -- ./files_base_dir1/ &&
+	git ls-files --stage files_base_dir1/ |
+	test_parse_ls_files_stage_oids >added_files_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files_oids >added_files_actual &&
+	test_cmp added_files_oids added_files_actual
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir2 &&
+	find files_base_dir2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	git ls-files --stage files_base_dir2 |
+	test_parse_ls_files_stage_oids >added_files2_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files2_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files2_oids >added_files2_actual &&
+	test_cmp added_files2_oids added_files2_actual
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 4abbc8fccae..20e94881964 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh
 
 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1410,6 +1411,25 @@ test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '
 
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION stash push -u -- ./files_base_dir/ &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./files_base_dir/ |
+	test_parse_ls_tree_oids >stashed_files_oids &&
+
+	# We created 2 dirs with 4 files each (8 files total) above
+	test_line_count = 8 stashed_files_oids &&
+	git cat-file --batch-check='%(objectname)' <stashed_files_oids >stashed_files_actual &&
+	test_cmp stashed_files_oids stashed_files_actual
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..f8a0f309e2d 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -161,22 +161,27 @@ test_expect_success 'pack-objects with bogus arguments' '
 '
 
 check_unpack () {
+	local packname="$1" &&
+	local object_list="$2" &&
+	local git_config="$3" &&
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $git_config init --bare git2 &&
+	(
+		git $git_config -C git2 unpack-objects -n <"$packname".pack &&
+		git $git_config -C git2 unpack-objects <"$packname".pack &&
+		git $git_config -C git2 cat-file --batch-check="%(objectname)"
+	) <"$object_list" >current &&
+	cmp "$object_list" current
 }
 
 test_expect_success 'unpack without delta' '
-	check_unpack test-1-${packname_1}
+	check_unpack test-1-${packname_1} obj-list
+'
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with REF_DELTA' '
@@ -185,7 +190,11 @@ test_expect_success 'pack with REF_DELTA' '
 '
 
 test_expect_success 'unpack with REF_DELTA' '
-	check_unpack test-2-${packname_2}
+	check_unpack test-2-${packname_2} obj-list
+'
+
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-2-${packname_2} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with OFS_DELTA' '
@@ -195,7 +204,11 @@ test_expect_success 'pack with OFS_DELTA' '
 '
 
 test_expect_success 'unpack with OFS_DELTA' '
-	check_unpack test-3-${packname_3}
+	check_unpack test-3-${packname_3} obj-list
+'
+
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-3-${packname_3} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'compare delta flavors' '
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread
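The helper introduced by the patch above can be exercised on its own. The following is a hedged, standalone sketch of its layout and uniqueness logic, using a date-based seed in place of test_tick and plain while loops in place of test-lib.sh's test_seq and BUG; it is illustrative, not the shipped helper:

```shell
#!/bin/sh
# Standalone sketch of test_create_unique_files' behavior: N dirs with
# M files each, every file's contents unique within this run.
create_unique_files () {
	dirs=$1 files=$2 basedir=$3
	# A per-run seed stands in for $test_tick from test-lib.sh.
	seed=$(date +%s)
	counter=0
	rm -rf "$basedir"
	i=1
	while [ "$i" -le "$dirs" ]
	do
		dir=$basedir/dir$i
		mkdir -p "$dir"
		j=1
		while [ "$j" -le "$files" ]
		do
			# The running counter guarantees distinct contents,
			# and therefore distinct oids once added to git.
			counter=$((counter + 1))
			echo "$basedir$seed.$counter" >"$dir/file$j.txt"
			j=$((j + 1))
		done
		i=$((i + 1))
	done
}

# Mirrors the usage in the t3700-add.sh hunk: 2 dirs, 3 files each.
create_unique_files 2 3 demo_unique
```

Because every file's oid is new to the repository, `git add` of the tree must actually write objects, which is what the fsync tests rely on.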

* [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (8 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24  4:58     ` [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add basic performance tests for "git add" and "git stash" of a lot of
new objects with various fsync settings. This shows the benefit of batch
mode relative to full fsync.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p3700-add.sh   | 59 ++++++++++++++++++++++++++++++++++++++++
 t/perf/p3900-stash.sh | 62 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 121 insertions(+)
 create mode 100755 t/perf/p3700-add.sh
 create mode 100755 t/perf/p3900-stash.sh

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
new file mode 100755
index 00000000000..2ea78c9449d
--- /dev/null
+++ b/t/perf/p3700-add.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of add"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	test_perf "add $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG add files
+	"
+done
+
+test_done
diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh
new file mode 100755
index 00000000000..3526f06cef4
--- /dev/null
+++ b/t/perf/p3900-stash.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+#
+# This test measures the performance of stashing new files into the object
+# database and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of stash"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+# We need to create the files each time we run the perf test, but
+# we do not want to measure the cost of creating the files, so run
+# the test once.
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
+then
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+for m in false true batch
+do
+	test_expect_success "create the files for object_fsyncing=$m" '
+		git reset --hard &&
+		# create files across directories
+		test_create_unique_files $dir_count $files_per_dir files
+	'
+
+	case $m in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	# We only stash files in the 'files' subdirectory since
+	# the perf test infrastructure creates files in the
+	# current working directory that need to be preserved
+	test_perf "stash $total_files files (object_fsyncing=$m)" "
+		git $FSYNC_CONFIG stash push -u -- files
+	"
+done
+
+test_done
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (9 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
@ 2022-03-24  4:58     ` Neeraj Singh via GitGitGadget
  2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
  12 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-24  4:58 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The warning for an unrecognized fsyncMethod was not
camel-cased.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index e9cac5f4707..ae819dee20b 100644
--- a/config.c
+++ b/config.c
@@ -1697,7 +1697,7 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "core.fsyncobjectfiles")) {
 		if (fsync_object_files < 0)
-			warning(_("core.fsyncobjectfiles is deprecated; use core.fsync instead"));
+			warning(_("core.fsyncObjectFiles is deprecated; use core.fsync instead"));
 		fsync_object_files = git_config_bool(var, value);
 		return 0;
 	}
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
@ 2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
  2022-03-24 17:52         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-24 16:10 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Thu, Mar 24 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> Make it clearer in the naming and documentation of the plug_bulk_checkin
> and unplug_bulk_checkin APIs that they can be thought of as
> a "transaction" to optimize operations on the object database.
>
> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  builtin/add.c  |  4 ++--
>  bulk-checkin.c |  4 ++--
>  bulk-checkin.h | 14 ++++++++++++--
>  3 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index 3ffb86a4338..9bf37ceae8e 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -670,7 +670,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  		string_list_clear(&only_match_skip_worktree, 0);
>  	}
>  
> -	plug_bulk_checkin();
> +	begin_odb_transaction();
>  
>  	if (add_renormalize)
>  		exit_status |= renormalize_tracked_files(&pathspec, flags);
> @@ -682,7 +682,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  
>  	if (chmod_arg && pathspec.nr)
>  		exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
> -	unplug_bulk_checkin();
> +	end_odb_transaction();

Aside from anything else we've (dis)agreed on, I found this part really
odd when hacking on my RFC-on-top, i.e. originally I (wrongly) thought
the plug_bulk_checkin() was something that originated with this series
which adds the "bulk" mode.

But no, on second inspection it's a thing Junio added a long time ago so
that in this case we "stream N objects to a pack" where we'd otherwise
objects.

Which, and I think Junio brought this up in an earlier round, but I
didn't fully understand that at the time makes this whole thing quite
odd to me.

So first, shouldn't we add this begin_odb_transaction() as a new thing?
I.e. surely wanting to do that object target redirection within a given
begin/end "scope" should be orthogonal to how fsync() happens within
that "scope", though in this case that happens to correspond.

And secondly, per the commit message and comment when it was added in
(568508e7657 (bulk-checkin: replace fast-import based implementation,
2011-10-28)) is it something we need *for that purpose* with the series
to unpack-objects without malloc()ing the size of the blob[1].

And, if so and orthogonal to that: If we know how to either stream N
objects to a PACK (as fast-import does), *and* we now (or SOON) know how
to stream loose objects without using size(blob) amounts of memory,
doesn't the "optimize fsync()" rather want to make use of the
stream-to-pack approach?

I.e. have you tried for the cases where we create say 1k objects for
"git stash" tried to stream those to a pack? How does that compare (both
with/without the fsync changes).

I.e. I do worry (also per [2]) that while the whole "bulk fsync" is neat
(and I think can use it in either case, to defer object syncs until the
"index" or "ref" sync, as my RFC does) I worry that we're adding a bunch
of configuration and complexity for something that:

 1. Ultimately isn't all that important, as already for part of it we
    can mostly configure it away. I.e. "git-unpack-objects" v.s. writing
    a pack, cf. transfer.unpackLimit)
 2. We don't have #1 for "add" and "update-index", but if we stream to
    packs there is there any remaining benefit in practice?

1. https://lore.kernel.org/git/cover-v11-0.8-00000000000-20220319T001411Z-avarab@gmail.com/
2. https://lore.kernel.org/git/220323.86fsn8ohg8.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v3 09/11] core.fsyncmethod: tests for batch mode
  2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
  2022-03-24 18:23         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-24 16:29 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, Bagas Sanjaya, Neeraj Singh


On Thu, Mar 24 2022, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@microsoft.com>
>
> Add test cases to exercise batch mode for:
>  * 'git add'
>  * 'git stash'
>  * 'git update-index'
>  * 'git unpack-objects'
>
> These tests ensure that the added data winds up in the object database.
>
> In this change we introduce a new test helper lib-unique-files.sh. The
> goal of this library is to create a tree of files that have different
> oids from any other files that may have been created in the current test
> repo. This helps us avoid missing validation of an object being added
> due to it already being in the repo.
>
> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  t/lib-unique-files.sh  | 32 ++++++++++++++++++++++++++++++++
>  t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
>  t/t3903-stash.sh       | 20 ++++++++++++++++++++
>  t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
>  4 files changed, 107 insertions(+), 14 deletions(-)
>  create mode 100644 t/lib-unique-files.sh
>
> diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
> new file mode 100644
> index 00000000000..74efca91dd7
> --- /dev/null
> +++ b/t/lib-unique-files.sh
> @@ -0,0 +1,32 @@
> +# Helper to create files with unique contents
> +
> +# Create multiple files with unique contents within this test run. Takes the
> +# number of directories, the number of files in each directory, and the base
> +# directory.
> +#
> +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> +#					 each in my_dir, all with contents
> +#					 different from previous invocations
> +#					 of this command in this run.
> +
> +test_create_unique_files () {
> +	test "$#" -ne 3 && BUG "3 param"
> +
> +	local dirs="$1" &&
> +	local files="$2" &&
> +	local basedir="$3" &&
> +	local counter=0 &&
> +	test_tick &&
> +	local basedata=$basedir$test_tick &&
> +	rm -rf "$basedir" &&
> +	for i in $(test_seq $dirs)
> +	do
> +		local dir=$basedir/dir$i &&
> +		mkdir -p "$dir" &&
> +		for j in $(test_seq $files)
> +		do
> +			counter=$((counter + 1)) &&
> +			echo "$basedata.$counter">"$dir/file$j.txt"
> +		done
> +	done
> +}

Having written my own perf tests for this series, I still don't get why
this is needed, at all.

tl;dr: the below: I think this whole workaround is because you missed
that "test_when_finished" exists, and how it excludes perf timings.

I.e. I get that if we ran this N times we'd want to wipe our repo
between tests, as for e.g. "git add" you want it to actually add the
objects.

It's what I do with the "hyperfine" command in
https://lore.kernel.org/git/RFC-patch-v2-4.7-61f4f3d7ef4-20220323T140753Z-avarab@gmail.com/
with the "-p" option.

I.e. hyperfine has a way to say "this is setup, but don't measure the
time", which is 1/2 of what you're working around here and in 10/11.

But as 10/11 shows you're limited to one run with t/perf because you
want to not include those "setup" numbers, and "test_perf" has no easy
way to avoid that (but more on that later).

Which b.t.w. I'm really skeptical of as an approach here in any case
(even if we couldn't exclude it from the numbers).

I.e. yes what "hyperfine" does would be preferable, but in exchange for
avoiding that you're comparing samples of 1 runs.

Surely we're better off with N runs (even if noisy). Given enough of them
the difference will shake out, and our estimated +/- will narrow.

But aside from that, why isn't this just:
	
	for cfg in true false blah
	do
		test_expect_success "setup for $cfg" '
			git init repo-$cfg &&
			for f in $(test_seq 1 100)
			do
				>repo-$cfg/$f
			done
		'
	
		test_perf "perf test for $cfg" '
			git -C repo-$cfg
		'
	done

Which surely is going to be more accurate in the context of our limited
t/perf environment because creating unique files is not sufficient at
all to ensure that your tests don't interfere with each other.

That's because in the first iteration we'll create N objects in
.git/objects/aa/* or whatever, which will *still be there* for your
second test, which will impact performance.

Whereas if you just make N repos you don't need unique files, and you
won't be introducing that as a confounding variable.

But anyway, reading perf-lib.sh again I haven't tested, but this whole
workaround seems truly unnecessary. I.e. in test_run_perf_ we do:
	
	test_run_perf_ () {
	        test_cleanup=:
	        test_export_="test_cleanup"
	        export test_cleanup test_export_
	        "$GTIME" -f "%E %U %S" -o test_time.$i "$TEST_SHELL_PATH" -c ' 
                	[... code we run and time ...]
		'
                [... later ...]
                test_eval_ "$test_cleanup"
	}

So can't you just avoid this whole glorious workaround for the low low
cost of approximately one shellscript string assignment? :)

I.e. if you do:

	setup_clean () {
		rm -rf repo
	}

	setup_first () {
		git init repo &&
		[make a bunch of files or whatever in repo]
	}

	setup_next () {
		test_when_finished "setup_clean" &&
		setup_first
	}

	test_expect_success 'setup initial stuff' '
		setup_first
	'

	test_perf 'my perf test' '
		test_when_finished "setup_next" &&
		[your perf test here]
	'

	test_expect_success 'cleanup' '
		# Not really needed, but just for completeness, we are
		# about to nuke the trash dir anyway...
		setup_clean
	'

I haven't tested (and need to run), but i'm pretty sure that does
exactly what you want without these workarounds, i.e. you'll get
"trampoline setup" without that setup being included in the perf
numbers.

Is it pretty? No, but it's a lot less complex than this unique file
business & workarounds, and will give you just the numbers you want, and
most importantly you can run it N times now for better samples.

I.e. "what you want", minus a *tiny* bit of noise, which is that we just
call a function to do:

    test_cleanup=setup_next

Which we'll then eval *after* we measure your numbers to setup the next
test.

^ permalink raw reply	[flat|nested] 175+ messages in thread
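The test_cleanup deferral described in the message above can be seen in a standalone sketch. The names here (run_timed, the repo files) are assumptions for illustration only, independent of the real perf-lib.sh: the timed body runs first, and the cleanup registered test_when_finished-style runs only after timing stops.

```shell
#!/bin/sh
# Minimal model of test_run_perf_: time the body, then eval the
# deferred $test_cleanup outside the timed region.
test_cleanup=:

run_timed () {
	start=$(date +%s)
	eval "$1"		# only this region would be measured
	end=$(date +%s)
	echo "timed region: $((end - start))s"
	eval "$test_cleanup"	# runs after timing, like test_when_finished
	test_cleanup=:
}

# Simulate one perf iteration: do work, register re-setup for the next round.
mkdir -p repo && : >repo/tracked

run_timed '
	test_cleanup="rm -rf repo && mkdir repo"
	: >repo/new-object
'
```

After the call, `repo` has been wiped and recreated by the deferred cleanup, yet none of that cleanup cost fell inside the timed region.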

* Re: [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (10 preceding siblings ...)
  2022-03-24  4:58     ` [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
@ 2022-03-24 17:44     ` Junio C Hamano
  2022-03-24 19:21       ` Neeraj Singh
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
  12 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-24 17:44 UTC (permalink / raw)
  To: Neeraj K. Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> V3 changes:
>
>  * Rebrand plug/unplug-bulk-checkin to "begin_odb_transaction" and
>    "end_odb_transaction"

OK.  Makes me wonder (not "object", more appropriate verb than
"object" being "be curious") how well "odb-transaction" meshes with
mechanisms to ensure that the bits hit the disk platter to protect
things outside the odb that you may or may not be covering in this
series (e.g. the index file, the refs, the working tree files).

> This work is based on 'seen' at . It's dependent on ns/core-fsyncmethod.

"at ."???


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
@ 2022-03-24 17:52         ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24 17:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Thu, Mar 24, 2022 at 9:24 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Thu, Mar 24 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > Make it clearer in the naming and documentation of the plug_bulk_checkin
> > and unplug_bulk_checkin APIs that they can be thought of as
> > a "transaction" to optimize operations on the object database.
> >
> > Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> > ---
> >  builtin/add.c  |  4 ++--
> >  bulk-checkin.c |  4 ++--
> >  bulk-checkin.h | 14 ++++++++++++--
> >  3 files changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/builtin/add.c b/builtin/add.c
> > index 3ffb86a4338..9bf37ceae8e 100644
> > --- a/builtin/add.c
> > +++ b/builtin/add.c
> > @@ -670,7 +670,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
> >               string_list_clear(&only_match_skip_worktree, 0);
> >       }
> >
> > -     plug_bulk_checkin();
> > +     begin_odb_transaction();
> >
> >       if (add_renormalize)
> >               exit_status |= renormalize_tracked_files(&pathspec, flags);
> > @@ -682,7 +682,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
> >
> >       if (chmod_arg && pathspec.nr)
> >               exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
> > -     unplug_bulk_checkin();
> > +     end_odb_transaction();
>
> Aside from anything else we've (dis)agreed on, I found this part really
> odd when hacking on my RFC-on-top, i.e. originally I (wrongly) thought
> the plug_bulk_checkin() was something that originated with this series
> which adds the "bulk" mode.
>
> But no, on second inspection it's a thing Junio added a long time ago so
> that in this case we "stream N objects to a pack" where we'd otherwise
> objects.
>
> Which, and I think Junio brought this up in an earlier round, but I
> didn't fully understand that at the time makes this whole thing quite
> odd to me.
>
> So first, shouldn't we add this begin_odb_transaction() as a new thing?
> I.e. surely wanting to do that object target redirection within a given
> begin/end "scope" should be orthogonal to how fsync() happens within
> that "scope", though in this case that happens to correspond.
>
> And secondly, per the commit message and comment when it was added in
> (568508e7657 (bulk-checkin: replace fast-import based implementation,
> 2011-10-28)) is it something we need *for that purpose* with the series
> to unpack-objects without malloc()ing the size of the blob[1].
>

The original change seems to be about optimizing addition of
successive large blobs to the ODB when we know we have a large batch.
It's a batch-mode optimization for the ODB, similar to my patch
series, just targeting large blobs rather than small blobs/trees.  It
also has the same property that the added data is "invisible" until
the transaction ends.

> And, if so and orthogonal to that: If we know how to either stream N
> objects to a PACK (as fast-import does), *and* we now (or SOON) know how
> to stream loose objects without using size(blob) amounts of memory,
> doesn't the "optimize fsync()" rather want to make use of the
> stream-to-pack approach?
>
> I.e. have you tried for the cases where we create say 1k objects for
> "git stash" tried to stream those to a pack? How does that compare (both
> with/without the fsync changes).
>
> I.e. I do worry (also per [2]) that while the whole "bulk fsync" is neat
> (and I think can use it in either case, to defer object syncs until the
> "index" or "ref" sync, as my RFC does) I worry that we're adding a bunch
> of configuration and complexity for something that:
>
>  1. Ultimately isn't all that important, as already for part of it we
>     can mostly configure it away. I.e. "git-unpack-objects" v.s. writing
>     a pack, cf. transfer.unpackLimit)
>  2. We don't have #1 for "add" and "update-index", but if we stream to
>     packs there is there any remaining benefit in practice?
>
> 1. https://lore.kernel.org/git/cover-v11-0.8-00000000000-20220319T001411Z-avarab@gmail.com/
> 2. https://lore.kernel.org/git/220323.86fsn8ohg8.gmgdl@evledraar.gmail.com/

Stream to pack is a good idea.  But I think we'd want a way to append
to the most recent pack so that we don't explode the number of packs,
which seems to impose a linear cost on ODB operations, at least to
load up the indexes.  I think this is orthogonal and we can always
change the meaning of batch mode to use a pack mechanism when such a
mechanism is ready.

Thanks,
Neeraj

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
  2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-24 18:18       ` Junio C Hamano
  2022-03-24 20:25         ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-24 18:18 UTC (permalink / raw)
  To: Neeraj Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, avarab, nksingh85, ps, Bagas Sanjaya,
	Neeraj K. Singh

"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +static void end_odb_transaction_if_active(void)
> +{
> +	if (!odb_transaction_active)
> +		return;
> +
> +	end_odb_transaction();
> +	odb_transaction_active = 0;
> +}

>  __attribute__((format (printf, 1, 2)))
>  static void report(const char *fmt, ...)
>  {
> @@ -57,6 +68,16 @@ static void report(const char *fmt, ...)
>  	if (!verbose)
>  		return;
>  
> +	/*
> +	 * It is possible, though unlikely, that a caller
> +	 * could use the verbose output to synchronize with
> +	 * addition of objects to the object database, so
> +	 * unplug bulk checkin to make sure that future objects
> +	 * are immediately visible.
> +	 */
> +
> +	end_odb_transaction_if_active();
> +
>  	va_start(vp, fmt);
>  	vprintf(fmt, vp);
>  	putchar('\n');
> @@ -1116,6 +1137,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  	 */
>  	parse_options_start(&ctx, argc, argv, prefix,
>  			    options, PARSE_OPT_STOP_AT_NON_OPTION);
> +
> +	/*
> +	 * Allow the object layer to optimize adding multiple objects in
> +	 * a batch.
> +	 */
> +	begin_odb_transaction();
> +	odb_transaction_active = 1;

This looks strange.  Shouldn't begin/end pair be responsible for
knowing if there is a transaction active already?  For that matter,
didn't the original unplug in the plug/unplug pair automatically turn
into a no-op when it is already unplugged?

IOW, I am not sure end_if_active() should exist in the first place.
Shouldn't end_transaction() do that instead?



* Re: [PATCH v3 09/11] core.fsyncmethod: tests for batch mode
  2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
@ 2022-03-24 18:23         ` Neeraj Singh
  2022-03-26 15:35           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24 18:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh

On Thu, Mar 24, 2022 at 9:53 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Thu, Mar 24 2022, Neeraj Singh via GitGitGadget wrote:
>
> > From: Neeraj Singh <neerajsi@microsoft.com>
> >
> > Add test cases to exercise batch mode for:
> >  * 'git add'
> >  * 'git stash'
> >  * 'git update-index'
> >  * 'git unpack-objects'
> >
> > These tests ensure that the added data winds up in the object database.
> >
> > In this change we introduce a new test helper lib-unique-files.sh. The
> > goal of this library is to create a tree of files that have different
> > oids from any other files that may have been created in the current test
> > repo. This helps us avoid missing validation of an object being added
> > due to it already being in the repo.
> >
> > Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> > ---
> >  t/lib-unique-files.sh  | 32 ++++++++++++++++++++++++++++++++
> >  t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
> >  t/t3903-stash.sh       | 20 ++++++++++++++++++++
> >  t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
> >  4 files changed, 107 insertions(+), 14 deletions(-)
> >  create mode 100644 t/lib-unique-files.sh
> >
> > diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
> > new file mode 100644
> > index 00000000000..74efca91dd7
> > --- /dev/null
> > +++ b/t/lib-unique-files.sh
> > @@ -0,0 +1,32 @@
> > +# Helper to create files with unique contents
> > +
> > +# Create multiple files with unique contents within this test run. Takes the
> > +# number of directories, the number of files in each directory, and the base
> > +# directory.
> > +#
> > +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> > +#                                     each in my_dir, all with contents
> > +#                                     different from previous invocations
> > +#                                     of this command in this run.
> > +
> > +test_create_unique_files () {
> > +     test "$#" -ne 3 && BUG "3 param"
> > +
> > +     local dirs="$1" &&
> > +     local files="$2" &&
> > +     local basedir="$3" &&
> > +     local counter=0 &&
> > +     test_tick &&
> > +     local basedata=$basedir$test_tick &&
> > +     rm -rf "$basedir" &&
> > +     for i in $(test_seq $dirs)
> > +     do
> > +             local dir=$basedir/dir$i &&
> > +             mkdir -p "$dir" &&
> > +             for j in $(test_seq $files)
> > +             do
> > +                     counter=$((counter + 1)) &&
> > +                     echo "$basedata.$counter">"$dir/file$j.txt"
> > +             done
> > +     done
> > +}
>
> Having written my own perf tests for this series, I still don't get why
> this is needed, at all.
>
> tl;dr: the below: I think this whole workaround is because you missed
> that "test_when_finished" exists, and how it excludes perf timings.
>

I actually noticed test_when_finished, but I didn't think of your
"setup the next round on cleanup of last" idea.  I was debating at the
time adding a "test_perf_setup" helper to do the setup work during
each perf iteration.  How about I do that and just create a new repo
in each test_perf_setup step?

> I.e. I get that if we ran this N times we'd want to wipe our repo
> between tests, as for e.g. "git add" you want it to actually add the
> objects.
>
> It's what I do with the "hyperfine" command in
> https://lore.kernel.org/git/RFC-patch-v2-4.7-61f4f3d7ef4-20220323T140753Z-avarab@gmail.com/
> with the "-p" option.
>
> I.e. hyperfine has a way to say "this is setup, but don't measure the
> time", which is 1/2 of what you're working around here and in 10/11.
>
> But as 10/11 shows you're limited to one run with t/perf because you
> want to not include those "setup" numbers, and "test_perf" has no easy
> way to avoid that (but more on that later).
>
> Which b.t.w. I'm really skeptical of as an approach here in any case
> (even if we couldn't exclude it from the numbers).
>
> I.e. yes, what "hyperfine" does would be preferable, but in exchange for
> avoiding that you're comparing samples of 1 run.
>
> Surely we're better off with N runs (even if noisy). Given enough of them
> the difference will shake out, and our estimated +/- will narrow.
>
> But aside from that, why isn't this just:
>
>         for cfg in true false blah
>         do
>                 test_expect_success "setup for $cfg" '
>                         git init repo-$cfg &&
>                         for f in $(test_seq 1 100)
>                         do
>                                 >repo-$cfg/$f
>                         done
>                 '
>
>                 test_perf "perf test for $cfg" '
>                         git -C repo-$cfg
>                 '
>         done
>
> Which surely is going to be more accurate in the context of our limited
> t/perf environment because creating unique files is not sufficient at
> all to ensure that your tests don't interfere with each other.
>
> That's because in the first iteration we'll create N objects in
> .git/objects/aa/* or whatever, which will *still be there* for your
> second test, which will impact performance.
>
> Whereas if you just make N repos you don't need unique files, and you
> won't be introducing that as a conflating variable.
>
> But anyway, reading perf-lib.sh again I haven't tested, but this whole
> workaround seems truly unnecessary. I.e. in test_run_perf_ we do:
>
>         test_run_perf_ () {
>                 test_cleanup=:
>                 test_export_="test_cleanup"
>                 export test_cleanup test_export_
>                 "$GTIME" -f "%E %U %S" -o test_time.$i "$TEST_SHELL_PATH" -c '
>                         [... code we run and time ...]
>                 '
>                 [... later ...]
>                 test_eval_ "$test_cleanup"
>         }
>
> So can't you just avoid this whole glorious workaround for the low low
> cost of approximately one shellscript string assignment? :)
>
> I.e. if you do:
>
>         setup_clean () {
>                 rm -rf repo
>         }
>
>         setup_first () {
>                 git init repo &&
>                 [make a bunch of files or whatever in repo]
>         }
>
>         setup_next () {
>                 test_when_finished "setup_clean" &&
>                 setup_first
>         }
>
>         test_expect_success 'setup initial stuff' '
>                 setup_first
>         '
>
>         test_perf 'my perf test' '
>                 test_when_finished "setup_next" &&
>                 [your perf test here]
>         '
>
>         test_expect_success 'cleanup' '
>                 # Not really needed, but just for completeness, we are
>                 # about to nuke the trash dir anyway...
>                 setup_clean
>         '
>
> I haven't tested (and need to run), but i'm pretty sure that does
> exactly what you want without these workarounds, i.e. you'll get
> "trampoline setup" without that setup being included in the perf
> numbers.
>
> Is it pretty? No, but it's a lot less complex than this unique file
> business & workarounds, and will give you just the numbers you want, and
> most importantly you can run it N times now for better samples.
>
> I.e. "what you want" sans a *tiny* bit of noise that we use to just call
> a function to do:
>
>     test_cleanup=setup_next
>
> Which we'll then eval *after* we measure your numbers to setup the next
> test.

How about I add a new test_perf_setup mechanism to make your idea work
in a straightforward way?

I still want the test_create_unique_files thing as a way to make
multiple files easily.  And for the non-perf tests it makes sense to
have differing contents within a test run.

Thanks,
Neeraj


* Re: [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
@ 2022-03-24 19:21       ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24 19:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj K. Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Thu, Mar 24, 2022 at 10:44 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj K. Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > V3 changes:
> >
> >  * Rebrand plug/unplug-bulk-checkin to "begin_odb_transaction" and
> >    "end_odb_transaction"
>
> OK.  Makes me wonder (not "object", more appropriate verb than
> "object" being "be curious") how well "odb-transaction" meshes with
> mechanisms to ensure that the bits hit the disk platter to protect
> things outside the odb that you may or may not be covering in this
> series (e.g. the index file, the refs, the working tree files).
>

As of this series, the odb-transaction will ensure that loose-objects
(and trivially packs as well, since they're currently eagerly-synced)
are efficiently made durable by the time the transaction ends.  Other
parts of the repo (index, refs, etc) need to be updated and synced
after the odb transaction ends.  Patrick's original ref syncing work
at [1] also contained a batch mode delimited by the existing ref
transactions.

I think larger transactions would be interesting to have, but I'd
argue that the current patch series is a worthwhile building block for
that world.  It solves the real-world multiplicative pain of adding N
objects to the ODB, where each one needs to be fsynced.  Patrick's
batch mode solves the real-world multiplicative pain of updating R
refs during a big mirror push.  Even talking just about the ODB, we
still have O(TreeSize) fsyncs for the updated trees and a few extra
fsyncs for commits.  We can add odb transactions around those things
too, which should be easy enough going forward.

[1] https://lore.kernel.org/git/d9aa96913b1730f1d0c238d7d52e27c20bc55390.1636544377.git.ps@pks.im/

> > This work is based on 'seen' at . It's dependent on ns/core-fsyncmethod.
>
> "at ."???
>

Sorry, to make GGG/Github happy, I had to rebase onto b9f5d0358d2,
which was the last non-merge commit that's present in next. Then I
could target next with the PR and get the right set of patches.
Basing on fd008b1442 didn't work because GGG doesn't want to see a
merge commit in the set of changes not in the target branch.


* Re: [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
  2022-03-24 18:18       ` Junio C Hamano
@ 2022-03-24 20:25         ` Neeraj Singh
  2022-03-24 21:34           ` Junio C Hamano
  0 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24 20:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Thu, Mar 24, 2022 at 11:18 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +static void end_odb_transaction_if_active(void)
> > +{
> > +     if (!odb_transaction_active)
> > +             return;
> > +
> > +     end_odb_transaction();
> > +     odb_transaction_active = 0;
> > +}
>
> >  __attribute__((format (printf, 1, 2)))
> >  static void report(const char *fmt, ...)
> >  {
> > @@ -57,6 +68,16 @@ static void report(const char *fmt, ...)
> >       if (!verbose)
> >               return;
> >
> > +     /*
> > +      * It is possible, though unlikely, that a caller
> > +      * could use the verbose output to synchronize with
> > +      * addition of objects to the object database, so
> > +      * unplug bulk checkin to make sure that future objects
> > +      * are immediately visible.
> > +      */
> > +
> > +     end_odb_transaction_if_active();
> > +
> >       va_start(vp, fmt);
> >       vprintf(fmt, vp);
> >       putchar('\n');
> > @@ -1116,6 +1137,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> >        */
> >       parse_options_start(&ctx, argc, argv, prefix,
> >                           options, PARSE_OPT_STOP_AT_NON_OPTION);
> > +
> > +     /*
> > +      * Allow the object layer to optimize adding multiple objects in
> > +      * a batch.
> > +      */
> > +     begin_odb_transaction();
> > +     odb_transaction_active = 1;
>
> This looks strange.  Shouldn't begin/end pair be responsible for
> knowing if there is a transaction active already?  For that matter,
> didn't the original unplug in the plug/unplug pair automatically turn
> into a no-op when it was already unplugged?
>
> IOW, I am not sure end_if_active() should exist in the first place.
> Shouldn't end_transaction() do that instead?
>

Today there's an "assert(bulk_checkin_plugged)" in
end_odb_transaction. In principle we could just drop the assert and
allow a transaction to be ended multiple times.  But maybe in the long
run for composability we'd like to have nested callers to begin/end
transaction (e.g. we could have a nested transaction around writing
the cache tree to the ODB to minimize fsyncs there).  In that world,
having a subsystem not maintain a balanced pairing could be a problem.
An alternative API here could be to have an "flush_odb_transaction"
call to make the objects visible at this point.  Lastly, I could take
your original suggested approach of adding a new flag to update-index.
I preferred the unplug-on-verbose approach since it would
automatically optimize most callers to update-index that might exist
in the wild, without users having to change anything.

Thanks,
Neeraj


* Re: [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
  2022-03-24 20:25         ` Neeraj Singh
@ 2022-03-24 21:34           ` Junio C Hamano
  2022-03-24 22:21             ` Neeraj Singh
  0 siblings, 1 reply; 175+ messages in thread
From: Junio C Hamano @ 2022-03-24 21:34 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

Neeraj Singh <nksingh85@gmail.com> writes:

>> IOW, I am not sure end_if_active() should exist in the first place.
>> Shouldn't end_transaction() do that instead?
>>
>
> Today there's an "assert(bulk_checkin_plugged)" in
> end_odb_transaction. In principle we could just drop the assert and
> allow a transaction to be ended multiple times.  But maybe in the long
> run for composability we'd like to have nested callers to begin/end
> transaction (e.g. we could have a nested transaction around writing
> the cache tree to the ODB to minimize fsyncs there).

I am not convinced that "transaction" is a good mental model for
this mechanism to begin with, in the sense that it is
not a bug or failure of the implementation if two or more operations
in the same <begin,end> bracket did not happen (or not happen)
atomically, or if 'begin' and 'end' were not properly nested.  With
the design getting more complex with things like tentative object
store that needs to be explicitly migrated after the outermost level
of end-transaction, we may end up _requiring_ that sufficient number
of 'end' must come once we issued 'begin', which I am not sure is
necessarily a good thing.

In any case, if we aspire/envision to have nested plug/unplug, I
think it is a good thing.  A helper for one subsystem may have its
large batch of operations inside a plug/unplug pair, another helper may
do the same, and the caller of these two helpers may want to say

	plug
		call helper A
			A does plug
			A does many things
			A does unplug
		call helper B
			B does plug
			B does many things
			B does unplug
	unplug

to "cancel" the unplugs that helpers A and B do.

> In that world,
> having a subsystem not maintain a balanced pairing could be a problem.

And in such a world, you never want to have end-if-active to
implement what you are doing here, as you may end up being not
properly nested:

	begin
		begin
			do many things
			if some condition
				end_if_active
			do more things
		end
	end

> An alternative API here could be to have an "flush_odb_transaction"
> call to make the objects visible at this point.

Yes, what you want is a forced-flush instead, I think.

So I suspect you'd want these three primitives, perhaps?

 * begin increments the nesting level
   - if outermost, you may have to do real "setup" things
   - otherwise, you may not have anything other than just counting
     the nesting level

 * flush implements unplug, fsync, etc. and does so immediately,
   even when plugged.

 * end decrements the nesting level
   - if outermost, you'd do "flush".
   - otherwise, you may only count the nesting level and do nothing else,
     but doing "flush" when you realize that you've queued too many
     is not a bug or a crime.



* Re: [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure
  2022-03-24 21:34           ` Junio C Hamano
@ 2022-03-24 22:21             ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-24 22:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Patrick Steinhardt,
	Bagas Sanjaya, Neeraj K. Singh

On Thu, Mar 24, 2022 at 2:34 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Neeraj Singh <nksingh85@gmail.com> writes:
>
> >> IOW, I am not sure end_if_active() should exist in the first place.
> >> Shouldn't end_transaction() do that instead?
> >>
> >
> > Today there's an "assert(bulk_checkin_plugged)" in
> > end_odb_transaction. In principle we could just drop the assert and
> > allow a transaction to be ended multiple times.  But maybe in the long
> > run for composability we'd like to have nested callers to begin/end
> > transaction (e.g. we could have a nested transaction around writing
> > the cache tree to the ODB to minimize fsyncs there).
>
> I am not convinced that "transaction" is a good mental model for
> this mechanism to begin with, in the sense that it is
> not a bug or failure of the implementation if two or more operations
> in the same <begin,end> bracket did not happen (or not happen)
> atomically, or if 'begin' and 'end' were not properly nested.  With
> the design getting more complex with things like tentative object
> store that needs to be explicitly migrated after the outermost level
> of end-transaction, we may end up _requiring_ that sufficient number
> of 'end' must come once we issued 'begin', which I am not sure is
> necessarily a good thing.

I don't love the tentative object store that keeps things invisible,
but that was the safest way to maintain the invariant that no
loose-object name appears in the ODB without durable contents.  I
think we want the "durability/ordering boundary" part of database
transactions without necessarily needing full abort/commit semantics.
As you say, we don't need full atomicity, but we do need ordering to
ensure that blobs are durable before the trees pointing to them, and so on up
the merkle chain.  The begin/end pairs help us defer the syncs
required for ordering to the end rather than pessimistically assuming
that every object write is the end.

> In any case, if we aspire/envision to have nested plug/unplug, I
> think it is a good thing.  A helper for one subsystem may have its
> large batch of operations inside a plug/unplug pair, another helper may
> do the same, and the caller of these two helpers may want to say
>
>         plug
>                 call helper A
>                         A does plug
>                         A does many things
>                         A does unplug
>                 call helper B
>                         B does plug
>                         B does many things
>                         B does unplug
>         unplug
>
> to "cancel" the unplugs that helpers A and B do.
>
> > In that world,
> > having a subsystem not maintain a balanced pairing could be a problem.
>
> And in such a world, you never want to have end-if-active to
> implement what you are doing here, as you may end up being not
> properly nested:
>
>         begin
>                 begin
>                         do many things
>                         if some condition
>                                 end_if_active
>                         do more things
>                 end
>         end
>
> > An alternative API here could be to have an "flush_odb_transaction"
> > call to make the objects visible at this point.
>
> Yes, what you want is a forced-flush instead, I think.
>
> So I suspect you'd want these three primitives, perhaps?
>
>  * begin increments the nesting level
>    - if outermost, you may have to do real "setup" things
>    - otherwise, you may not have anything other than just counting
>      the nesting level
>
>  * flush implements unplug, fsync, etc. and does so immediately,
>    even when plugged.
>
>  * end decrements the nesting level
>    - if outermost, you'd do "flush".
>    - otherwise, you may only count the nesting level and do nothing else,
>      but doing "flush" when you realize that you've queued too many
>      is not a bug or a crime.
>

Yes, I'll move in this direction. Thanks for the feedback.


* Re: [PATCH v3 09/11] core.fsyncmethod: tests for batch mode
  2022-03-24 18:23         ` Neeraj Singh
@ 2022-03-26 15:35           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-26 15:35 UTC (permalink / raw)
  To: Neeraj Singh
  Cc: Neeraj Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Bagas Sanjaya, Neeraj Singh


On Thu, Mar 24 2022, Neeraj Singh wrote:

> On Thu, Mar 24, 2022 at 9:53 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Thu, Mar 24 2022, Neeraj Singh via GitGitGadget wrote:
>>
>> > From: Neeraj Singh <neerajsi@microsoft.com>
>> >
>> > Add test cases to exercise batch mode for:
>> >  * 'git add'
>> >  * 'git stash'
>> >  * 'git update-index'
>> >  * 'git unpack-objects'
>> >
>> > These tests ensure that the added data winds up in the object database.
>> >
>> > In this change we introduce a new test helper lib-unique-files.sh. The
>> > goal of this library is to create a tree of files that have different
>> > oids from any other files that may have been created in the current test
>> > repo. This helps us avoid missing validation of an object being added
>> > due to it already being in the repo.
>> >
>> > Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
>> > ---
>> >  t/lib-unique-files.sh  | 32 ++++++++++++++++++++++++++++++++
>> >  t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
>> >  t/t3903-stash.sh       | 20 ++++++++++++++++++++
>> >  t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
>> >  4 files changed, 107 insertions(+), 14 deletions(-)
>> >  create mode 100644 t/lib-unique-files.sh
>> >
>> > diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
>> > new file mode 100644
>> > index 00000000000..74efca91dd7
>> > --- /dev/null
>> > +++ b/t/lib-unique-files.sh
>> > @@ -0,0 +1,32 @@
>> > +# Helper to create files with unique contents
>> > +
>> > +# Create multiple files with unique contents within this test run. Takes the
>> > +# number of directories, the number of files in each directory, and the base
>> > +# directory.
>> > +#
>> > +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
>> > +#                                     each in my_dir, all with contents
>> > +#                                     different from previous invocations
>> > +#                                     of this command in this run.
>> > +
>> > +test_create_unique_files () {
>> > +     test "$#" -ne 3 && BUG "3 param"
>> > +
>> > +     local dirs="$1" &&
>> > +     local files="$2" &&
>> > +     local basedir="$3" &&
>> > +     local counter=0 &&
>> > +     test_tick &&
>> > +     local basedata=$basedir$test_tick &&
>> > +     rm -rf "$basedir" &&
>> > +     for i in $(test_seq $dirs)
>> > +     do
>> > +             local dir=$basedir/dir$i &&
>> > +             mkdir -p "$dir" &&
>> > +             for j in $(test_seq $files)
>> > +             do
>> > +                     counter=$((counter + 1)) &&
>> > +                     echo "$basedata.$counter">"$dir/file$j.txt"
>> > +             done
>> > +     done
>> > +}
>>
>> Having written my own perf tests for this series, I still don't get why
>> this is needed, at all.
>>
>> tl;dr: the below: I think this whole workaround is because you missed
>> that "test_when_finished" exists, and how it excludes perf timings.
>>
>
> I actually noticed test_when_finished, but I didn't think of your
> "setup the next round on cleanup of last" idea.  I was debating at the
> time adding a "test_perf_setup" helper to do the setup work during
> each perf iteration.  How about I do that and just create a new repo
> in each test_perf_setup step?
>
>> I.e. I get that if we ran this N times we'd want to wipe our repo
>> between tests, as for e.g. "git add" you want it to actually add the
>> objects.
>>
>> It's what I do with the "hyperfine" command in
>> https://lore.kernel.org/git/RFC-patch-v2-4.7-61f4f3d7ef4-20220323T140753Z-avarab@gmail.com/
>> with the "-p" option.
>>
>> I.e. hyperfine has a way to say "this is setup, but don't measure the
>> time", which is 1/2 of what you're working around here and in 10/11.
>>
>> But as 10/11 shows you're limited to one run with t/perf because you
>> want to not include those "setup" numbers, and "test_perf" has no easy
>> way to avoid that (but more on that later).
>>
>> Which b.t.w. I'm really skeptical of as an approach here in any case
>> (even if we couldn't exclude it from the numbers).
>>
>> I.e. yes, what "hyperfine" does would be preferable, but in exchange for
>> avoiding that you're comparing samples of 1 run.
>>
>> Surely we're better off with N runs (even if noisy). Given enough of them
>> the difference will shake out, and our estimated +/- will narrow.
>>
>> But aside from that, why isn't this just:
>>
>>         for cfg in true false blah
>>         do
>>                 test_expect_success "setup for $cfg" '
>>                         git init repo-$cfg &&
>>                         for f in $(test_seq 1 100)
>>                         do
>>                                 >repo-$cfg/$f
>>                         done
>>                 '
>>
>>                 test_perf "perf test for $cfg" '
>>                         git -C repo-$cfg
>>                 '
>>         done
>>
>> Which surely is going to be more accurate in the context of our limited
>> t/perf environment because creating unique files is not sufficient at
>> all to ensure that your tests don't interfere with each other.
>>
>> That's because in the first iteration we'll create N objects in
>> .git/objects/aa/* or whatever, which will *still be there* for your
>> second test, which will impact performance.
>>
>> Whereas if you just make N repos you don't need unique files, and you
>> won't be introducing that as a conflating variable.
>>
>> But anyway, reading perf-lib.sh again I haven't tested, but this whole
>> workaround seems truly unnecessary. I.e. in test_run_perf_ we do:
>>
>>         test_run_perf_ () {
>>                 test_cleanup=:
>>                 test_export_="test_cleanup"
>>                 export test_cleanup test_export_
>>                 "$GTIME" -f "%E %U %S" -o test_time.$i "$TEST_SHELL_PATH" -c '
>>                         [... code we run and time ...]
>>                 '
>>                 [... later ...]
>>                 test_eval_ "$test_cleanup"
>>         }
>>
>> So can't you just avoid this whole glorious workaround for the low low
>> cost of approximately one shellscript string assignment? :)
>>
>> I.e. if you do:
>>
>>         setup_clean () {
>>                 rm -rf repo
>>         }
>>
>>         setup_first () {
>>                 git init repo &&
>>                 [make a bunch of files or whatever in repo]
>>         }
>>
>>         setup_next () {
>>                 test_when_finished "setup_clean" &&
>>                 setup_first
>>         }
>>
>>         test_expect_success 'setup initial stuff' '
>>                 setup_first
>>         '
>>
>>         test_perf 'my perf test' '
>>                 test_when_finished "setup_next" &&
>>                 [your perf test here]
>>         '
>>
>>         test_expect_success 'cleanup' '
>>                 # Not really needed, but just for completeness, we are
>>                 # about to nuke the trash dir anyway...
>>                 setup_clean
>>         '
>>
>> I haven't tested (and need to run), but i'm pretty sure that does
>> exactly what you want without these workarounds, i.e. you'll get
>> "trampoline setup" without that setup being included in the perf
>> numbers.
>>
>> Is it pretty? No, but it's a lot less complex than this unique file
>> business & workarounds, and will give you just the numbers you want, and
>> most importantly you can run it N times now for better samples.
>>
>> I.e. "what you want" sans a *tiny* bit of noise that we use to just call
>> a function to do:
>>
>>     test_cleanup=setup_next
>>
>> Which we'll then eval *after* we measure your numbers to setup the next
>> test.
>
> How about I add a new test_perf_setup mechanism to make your idea work
> in a straightforward way?

Sure, that sounds great.

> I still want the test_create_unique_files thing as a way to make
> multiple files easily.  And for the non-perf tests it makes sense to
> have differing contents within a test run.

I think running your perf test on some generated data might still make
sense, but I think given the above that the *method* really doesn't make
any sense.

I.e. pretty much the whole structure of t/perf is to write tests that
can be run on an arbitrary user-provided repo, some of them do make some
content assumptions (or need no repo), but we've tried to have tests
there handle arbitrary repos.

You ended up with that "generated random files" to get around the X-Y
problem of not being able to reset the area without making that part of
the metrics, but as demo'd above we can use test_when_finished for that.

And once that's resolved it would actually be much more handy to be able
to run this on an arbitrary repo; as you can see in my "git hyperfine"
one-liner, I grabbed the "t" directory, but we could just make our test
data all the files in the dir (or specify a glob via an env var).

I think it still sounds interesting to have a way to make arbitrary test
data, but surely that's then better as e.g.:

	cd t/perf
 	./make-random-repo /tmp/random-repo &&
	GIT_PERF_REPO=/tmp/random-repo ./run p<your test>

I.e. once we've resolved the metrics/play area issue, needing to run this
on some very specific data is an artificial limitation vs. just being able
to point it at a given repo.


* [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
                       ` (11 preceding siblings ...)
  2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
@ 2022-03-29  0:42     ` Neeraj K. Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
                         ` (15 more replies)
  12 siblings, 16 replies; 175+ messages in thread
From: Neeraj K. Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh

V4 changes:

 * Make ODB transactions nestable.
 * Add an ODB transaction around writing out the cached tree.
 * Change update-index to use a more straightforward way of managing ODB
   transactions.
 * Fix missing 'local's in lib-unique-files
 * Add a per-iteration setup mechanism to test_perf.
 * Fix camelCasing in warning message.

V3 changes:

 * Rebrand plug/unplug-bulk-checkin to "begin_odb_transaction" and
   "end_odb_transaction"
 * Add a patch to pass filenames to fsync_or_die, rather than the string
   "loose object"
 * Update the commit description for "core.fsyncmethod" to explain why we
   do not directly expose objects until an fsync occurs.
 * Also explain in the commit description why we're using a dummy file for
   the fsync.
 * Create the bulk-fsync tmp-objdir lazily the first time a loose object is
   added. We now do fsync iff that objdir exists.
 * Do batch fsync if core.fsyncMethod=batch and core.fsync contains
   loose-object, regardless of the core.fsyncObjectFiles setting.
 * Mitigate the risk in update-index of an object not being visible due to
   bulk checkin.
 * Add a perf comment to justify the unpack-objects usage of bulk-checkin.
 * Add a new patch to create helpers for parsing OIDs from git commands.
 * Add a comment to the lib-unique-files.sh helper about uniqueness only
   within a repo.
 * Fix style and add '&&' chaining to test helpers.
 * Comment on some magic numbers in tests.
 * Take the object list as an argument in
   ./t5300-pack-object.sh:check_unpack ()
 * Drop accidental change to t/perf/perf-lib.sh

V2 changes:

 * Change doc to indicate that only some repo updates are batched
 * Null and zero out control variables in do_batch_fsync under
   unplug_bulk_checkin
 * Make batch mode default on Windows.
 * Update the description for the initial patch that cleans up the
   bulk-checkin infrastructure.
 * Rebase onto 'seen' at 0cac37f38f9.

--Original definition-- When core.fsync includes loose-object, we issue an
fsync after every written object. For a 'git-add' or similar command that
adds a lot of files to the repo, the costs of these fsyncs add up. One
major factor in this cost is the time it takes for the physical storage
controller to flush its caches to durable media.

This series takes advantage of the writeout-only mode of git_fsync to issue
OS cache writebacks for all of the objects being added to the repository
followed by a single fsync to a dummy file, which should trigger a
filesystem log flush and storage controller cache flush. This mechanism is
known to be safe on common Windows filesystems and expected to be safe on
macOS. Some Linux filesystems, such as XFS, will probably do the right thing
as well. See [1] for previous discussion on the predecessor of this patch
series.

This series is important on Windows, where loose-objects are included in the
fsync set by default in Git-For-Windows. In this series, I'm also setting
the default mode for Windows to turn on loose object fsyncing with batch
mode, so that we can get CI coverage of the actual git-for-windows
configuration upstream. We still don't actually issue fsyncs for the test
suite since GIT_TEST_FSYNC is set to 0, but we exercise all of the
surrounding batch mode code.

This work is based on 'next' at c54b8eb302. It's dependent on
ns/core-fsyncmethod.

[1]
https://lore.kernel.org/git/2c1ddef6057157d85da74a7274e03eacf0374e45.1629856293.git.gitgitgadget@gmail.com/

Neeraj Singh (13):
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  object-file: pass filename to fsync_or_die
  core.fsyncmethod: batched disk flushes for loose-objects
  cache-tree: use ODB transaction around writing a tree
  update-index: use the bulk-checkin infrastructure
  unpack-objects: use the bulk-checkin infrastructure
  core.fsync: use batch mode and sync loose objects by default on
    Windows
  test-lib-functions: add parsing helpers for ls-files and ls-tree
  core.fsyncmethod: tests for batch mode
  t/perf: add iteration setup mechanism to perf-lib
  core.fsyncmethod: performance tests for add and stash
  core.fsyncmethod: correctly camel-case warning message

 Documentation/config/core.txt          |   8 ++
 builtin/add.c                          |   4 +-
 builtin/unpack-objects.c               |   3 +
 builtin/update-index.c                 |  24 ++++++
 bulk-checkin.c                         | 101 ++++++++++++++++++++++---
 bulk-checkin.h                         |  17 ++++-
 cache-tree.c                           |   3 +
 cache.h                                |  12 ++-
 compat/mingw.h                         |   3 +
 config.c                               |   6 +-
 git-compat-util.h                      |   2 +
 object-file.c                          |  15 ++--
 t/lib-unique-files.sh                  |  34 +++++++++
 t/perf/p3700-add.sh                    |  59 +++++++++++++++
 t/perf/p4220-log-grep-engines.sh       |   3 +-
 t/perf/p4221-log-grep-engines-fixed.sh |   3 +-
 t/perf/p5302-pack-index.sh             |  15 ++--
 t/perf/p7519-fsmonitor.sh              |  18 +----
 t/perf/p7820-grep-engines.sh           |   6 +-
 t/perf/perf-lib.sh                     |  62 +++++++++++++--
 t/t3700-add.sh                         |  28 +++++++
 t/t3903-stash.sh                       |  20 +++++
 t/t5300-pack-object.sh                 |  41 ++++++----
 t/t5317-pack-objects-filter-objects.sh |  91 +++++++++++-----------
 t/test-lib-functions.sh                |  10 +++
 25 files changed, 469 insertions(+), 119 deletions(-)
 create mode 100644 t/lib-unique-files.sh
 create mode 100755 t/perf/p3700-add.sh


base-commit: c54b8eb302ffb72f31e73a26044c8a864e2cb307
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1134%2Fneerajsi-msft%2Fns%2Fbatched-fsync-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1134/neerajsi-msft/ns/batched-fsync-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1134

Range-diff vs v3:

  2:  b2d9766a662 !  1:  c7a2a7efe6d bulk-checkin: rename 'state' variable and separate 'plugged' boolean
     @@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
       	return status;
       }
       
     - void begin_odb_transaction(void)
     + void plug_bulk_checkin(void)
       {
      -	state.plugged = 1;
      +	assert(!bulk_checkin_plugged);
      +	bulk_checkin_plugged = 1;
       }
       
     - void end_odb_transaction(void)
     + void unplug_bulk_checkin(void)
       {
      -	state.plugged = 0;
      -	if (state.f)
  1:  53261f0099d !  2:  d045b13795b bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
     @@ Commit message
      
          Make it clearer in the naming and documentation of the plug_bulk_checkin
          and unplug_bulk_checkin APIs that they can be thought of as
     -    a "transaction" to optimize operations on the object database.
     +    a "transaction" to optimize operations on the object database. These
     +    transactions may be nested so that subsystems like the cache-tree
     +    writing code can optimize their operations without caring whether the
     +    top-level code has a transaction active.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ builtin/add.c: int cmd_add(int argc, const char **argv, const char *prefix)
       	if (write_locked_index(&the_index, &lock_file,
      
       ## bulk-checkin.c ##
     +@@
     + #include "packfile.h"
     + #include "object-store.h"
     + 
     +-static int bulk_checkin_plugged;
     ++static int odb_transaction_nesting;
     + 
     + static struct bulk_checkin_state {
     + 	char *pack_tmp_name;
      @@ bulk-checkin.c: int index_bulk_checkin(struct object_id *oid,
     + {
     + 	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
     + 				     path, flags);
     +-	if (!bulk_checkin_plugged)
     ++	if (!odb_transaction_nesting)
     + 		finish_bulk_checkin(&bulk_checkin_state);
       	return status;
       }
       
      -void plug_bulk_checkin(void)
      +void begin_odb_transaction(void)
       {
     - 	state.plugged = 1;
     +-	assert(!bulk_checkin_plugged);
     +-	bulk_checkin_plugged = 1;
     ++	odb_transaction_nesting += 1;
       }
       
      -void unplug_bulk_checkin(void)
      +void end_odb_transaction(void)
       {
     - 	state.plugged = 0;
     - 	if (state.f)
     +-	assert(bulk_checkin_plugged);
     +-	bulk_checkin_plugged = 0;
     ++	odb_transaction_nesting -= 1;
     ++	if (odb_transaction_nesting < 0)
     ++		BUG("Unbalanced ODB transaction nesting");
     ++
     ++	if (odb_transaction_nesting)
     ++		return;
     ++
     + 	if (bulk_checkin_state.f)
     + 		finish_bulk_checkin(&bulk_checkin_state);
     + }
      
       ## bulk-checkin.h ##
      @@ bulk-checkin.h: int index_bulk_checkin(struct object_id *oid,
  3:  26ce5b8fdda =  3:  2d1bc4568ac object-file: pass filename to fsync_or_die
  4:  52638326790 !  4:  9e7ae22fa4a core.fsyncmethod: batched disk flushes for loose-objects
     @@ bulk-checkin.c
       #include "packfile.h"
       #include "object-store.h"
       
     - static int bulk_checkin_plugged;
     + static int odb_transaction_nesting;
       
      +static struct tmp_objdir *bulk_fsync_objdir;
      +
     @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
      +	 * callers may not know whether any objects will be
      +	 * added at the time they call begin_odb_transaction.
      +	 */
     -+	if (!bulk_checkin_plugged || bulk_fsync_objdir)
     ++	if (!odb_transaction_nesting || bulk_fsync_objdir)
      +		return;
      +
      +	bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
     @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
      +void fsync_loose_object_bulk_checkin(int fd, const char *filename)
      +{
      +	/*
     -+	 * If we have a plugged bulk checkin, we issue a call that
     ++	 * If we have an active ODB transaction, we issue a call that
      +	 * cleans the filesystem page cache but avoids a hardware flush
      +	 * command. Later on we will issue a single hardware flush
      +	 * before as part of do_batch_fsync.
     @@ bulk-checkin.c: static int deflate_to_pack(struct bulk_checkin_state *state,
       		       int fd, size_t size, enum object_type type,
       		       const char *path, unsigned flags)
      @@ bulk-checkin.c: void end_odb_transaction(void)
     - 	bulk_checkin_plugged = 0;
     + 
       	if (bulk_checkin_state.f)
       		finish_bulk_checkin(&bulk_checkin_state);
      +
  -:  ----------- >  5:  83fa4a5f3a5 cache-tree: use ODB transaction around writing a tree
  5:  913ce1b3df9 !  6:  f03ebee695a update-index: use the bulk-checkin infrastructure
     @@ Commit message
          There is some risk with this change, since under batch fsync, the object
          files will be in a tmp-objdir until update-index is complete, so callers
          using the --stdin option will not see them until update-index is done.
     -    This risk is mitigated by unplugging the batch when reporting verbose
     -    output, which is the only way a --stdin caller might synchronize with
     -    the addition of an object.
     +    This risk is mitigated by not keeping an ODB transaction open around
     +    --stdin processing if in --verbose mode. Without --verbose mode,
     +    a caller feeding update-index via --stdin wouldn't know when
      +    update-index adds an object, even without an ODB transaction.
      
          Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
      
     @@ builtin/update-index.c
       #include "config.h"
       #include "lockfile.h"
       #include "quote.h"
     -@@ builtin/update-index.c: static int allow_replace;
     - static int info_only;
     - static int force_remove;
     - static int verbose;
     -+static int odb_transaction_active;
     - static int mark_valid_only;
     - static int mark_skip_worktree_only;
     - static int mark_fsmonitor_only;
     -@@ builtin/update-index.c: enum uc_mode {
     - 	UC_FORCE
     - };
     - 
     -+static void end_odb_transaction_if_active(void)
     -+{
     -+	if (!odb_transaction_active)
     -+		return;
     -+
     -+	end_odb_transaction();
     -+	odb_transaction_active = 0;
     -+}
     -+
     - __attribute__((format (printf, 1, 2)))
     - static void report(const char *fmt, ...)
     - {
     -@@ builtin/update-index.c: static void report(const char *fmt, ...)
     - 	if (!verbose)
     - 		return;
     - 
     -+	/*
     -+	 * It is possible, though unlikely, that a caller
     -+	 * could use the verbose output to synchronize with
     -+	 * addition of objects to the object database, so
     -+	 * unplug bulk checkin to make sure that future objects
     -+	 * are immediately visible.
     -+	 */
     -+
     -+	end_odb_transaction_if_active();
     -+
     - 	va_start(vp, fmt);
     - 	vprintf(fmt, vp);
     - 	putchar('\n');
      @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
       	 */
       	parse_options_start(&ctx, argc, argv, prefix,
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const
      +	 * a batch.
      +	 */
      +	begin_odb_transaction();
     -+	odb_transaction_active = 1;
       	while (ctx.argc) {
       		if (parseopt_state != PARSE_OPT_DONE)
       			parseopt_state = parse_options_step(&ctx, options,
     +@@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
     + 		the_index.version = preferred_index_format;
     + 	}
     + 
     ++	/*
     ++	 * It is possible, though unlikely, that a caller could use the verbose
     ++	 * output to synchronize with addition of objects to the object
     ++	 * database. The current implementation of ODB transactions leaves
     ++	 * objects invisible while a transaction is active, so end the
     ++	 * transaction here if verbose output is enabled.
     ++	 */
     ++
     ++	if (verbose)
     ++		end_odb_transaction();
     ++
     + 	if (read_from_stdin) {
     + 		struct strbuf buf = STRBUF_INIT;
     + 		struct strbuf unquoted = STRBUF_INIT;
      @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
       		strbuf_release(&buf);
       	}
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const
      +	/*
      +	 * By now we have added all of the new objects
      +	 */
     -+	end_odb_transaction_if_active();
     ++	if (!verbose)
     ++		end_odb_transaction();
      +
       	if (split_index > 0) {
       		if (git_config_get_split_index() == 0)
  6:  84fd144ef18 =  7:  d85013f7d2c unpack-objects: use the bulk-checkin infrastructure
  7:  447263e8ef1 =  8:  73e54f94c20 core.fsync: use batch mode and sync loose objects by default on Windows
  8:  8f1b01c9ca0 =  9:  124450c86d9 test-lib-functions: add parsing helpers for ls-files and ls-tree
  9:  b5f371e97fe ! 10:  282fbdef792 core.fsyncmethod: tests for batch mode
     @@ t/lib-unique-files.sh (new)
      +	local files="$2" &&
      +	local basedir="$3" &&
      +	local counter=0 &&
     ++	local i &&
     ++	local j &&
      +	test_tick &&
      +	local basedata=$basedir$test_tick &&
      +	rm -rf "$basedir" &&
  -:  ----------- > 11:  ee7ecf4cabe t/perf: add iteration setup mechanism to perf-lib
 10:  b99b32a469c ! 12:  fdf90d45f52 core.fsyncmethod: performance tests for add and stash
     @@ t/perf/p3700-add.sh (new)
      +# core.fsyncMethod=batch mode, which is why we are testing different values
      +# of that setting explicitly and creating a lot of unique objects.
      +
     -+test_description="Tests performance of add"
     ++test_description="Tests performance of adding things to the object database"
      +
      +# Fsync is normally turned off for the test suite.
      +GIT_TEST_FSYNC=1
     @@ t/perf/p3700-add.sh (new)
      +
      +. $TEST_DIRECTORY/lib-unique-files.sh
      +
     -+test_perf_default_repo
     ++test_perf_fresh_repo
      +test_checkout_worktree
      +
      +dir_count=10
      +files_per_dir=50
      +total_files=$((dir_count * files_per_dir))
      +
     -+# We need to create the files each time we run the perf test, but
     -+# we do not want to measure the cost of creating the files, so run
     -+# the test once.
     -+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
     -+then
     -+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
     -+	GIT_PERF_REPEAT_COUNT=1
     -+fi
     -+
     -+for m in false true batch
     ++for mode in false true batch
      +do
     -+	test_expect_success "create the files for object_fsyncing=$m" '
     -+		git reset --hard &&
     -+		# create files across directories
     -+		test_create_unique_files $dir_count $files_per_dir files
     -+	'
     -+
     -+	case $m in
     ++	case $mode in
      +	false)
      +		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
      +		;;
     @@ t/perf/p3700-add.sh (new)
      +		;;
      +	esac
      +
     -+	test_perf "add $total_files files (object_fsyncing=$m)" "
     -+		git $FSYNC_CONFIG add files
     ++	test_perf "add $total_files files (object_fsyncing=$mode)" \
     ++		--setup "
     ++		(rm -rf .git || 1) &&
     ++		git init &&
     ++		test_create_unique_files $dir_count $files_per_dir files_$mode
     ++	" "
     ++		git $FSYNC_CONFIG add files_$mode
      +	"
     -+done
     -+
     -+test_done
     -
     - ## t/perf/p3900-stash.sh (new) ##
     -@@
     -+#!/bin/sh
     -+#
     -+# This test measures the performance of adding new files to the object database
     -+# and index. The test was originally added to measure the effect of the
     -+# core.fsyncMethod=batch mode, which is why we are testing different values
     -+# of that setting explicitly and creating a lot of unique objects.
     -+
     -+test_description="Tests performance of stash"
     -+
     -+# Fsync is normally turned off for the test suite.
     -+GIT_TEST_FSYNC=1
     -+export GIT_TEST_FSYNC
     -+
     -+. ./perf-lib.sh
     -+
     -+. $TEST_DIRECTORY/lib-unique-files.sh
     -+
     -+test_perf_default_repo
     -+test_checkout_worktree
     -+
     -+dir_count=10
     -+files_per_dir=50
     -+total_files=$((dir_count * files_per_dir))
     -+
     -+# We need to create the files each time we run the perf test, but
     -+# we do not want to measure the cost of creating the files, so run
     -+# the test once.
     -+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1
     -+then
     -+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
     -+	GIT_PERF_REPEAT_COUNT=1
     -+fi
     -+
     -+for m in false true batch
     -+do
     -+	test_expect_success "create the files for object_fsyncing=$m" '
     -+		git reset --hard &&
     -+		# create files across directories
     -+		test_create_unique_files $dir_count $files_per_dir files
     -+	'
     -+
     -+	case $m in
     -+	false)
     -+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
     -+		;;
     -+	true)
     -+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
     -+		;;
     -+	batch)
     -+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
     -+		;;
     -+	esac
      +
     -+	# We only stash files in the 'files' subdirectory since
     -+	# the perf test infrastructure creates files in the
     -+	# current working directory that need to be preserved
     -+	test_perf "stash $total_files files (object_fsyncing=$m)" "
     -+		git $FSYNC_CONFIG stash push -u -- files
     ++	test_perf "stash $total_files files (object_fsyncing=$mode)" \
     ++		--setup "
     ++		(rm -rf .git || 1) &&
     ++		git init &&
     ++		test_commit first &&
     ++		test_create_unique_files $dir_count $files_per_dir stash_files_$mode
     ++	" "
     ++		git $FSYNC_CONFIG stash push -u -- stash_files_$mode
      +	"
      +done
      +
 11:  6b832e89bc4 = 13:  fb30bd02c8d core.fsyncmethod: correctly camel-case warning message

-- 
gitgitgadget


* [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
                         ` (14 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

This commit prepares for adding batch-fsync to the bulk-checkin
infrastructure.

The bulk-checkin infrastructure is currently used to batch up addition
of large blobs to a packfile. When a blob is larger than
big_file_threshold, we unconditionally add it to a pack. If bulk
checkins are 'plugged', we allow multiple large blobs to be added to a
single pack until we reach the packfile size limit; otherwise, we simply
make a new packfile for each large blob. The 'unplug' call tells us when
the series of blob additions is done so that we can finish the packfiles
and make their objects available to subsequent operations.

Stated another way, bulk-checkin allows callers to define a transaction
that adds multiple objects to the object database, where the object
database can optimize its internal operations within the transaction
boundary.

Batched fsync will fit into bulk-checkin by taking advantage of the
plug/unplug functionality to determine the appropriate time to fsync
and make newly-added objects available in the primary object database.

* Rename 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_objdir'.  This also makes the variable easier to
  find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 bulk-checkin.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6d6c37171c9..577b135e39c 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,9 +10,9 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static struct bulk_checkin_state {
-	unsigned plugged:1;
+static int bulk_checkin_plugged;
 
+static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
 	off_t offset;
@@ -21,7 +21,7 @@ static struct bulk_checkin_state {
 	struct pack_idx_entry **written;
 	uint32_t alloc_written;
 	uint32_t nr_written;
-} state;
+} bulk_checkin_state;
 
 static void finish_tmp_packfile(struct strbuf *basename,
 				const char *pack_tmp_name,
@@ -278,21 +278,23 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&state, oid, fd, size, type,
+	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!state.plugged)
-		finish_bulk_checkin(&state);
+	if (!bulk_checkin_plugged)
+		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
 void plug_bulk_checkin(void)
 {
-	state.plugged = 1;
+	assert(!bulk_checkin_plugged);
+	bulk_checkin_plugged = 1;
 }
 
 void unplug_bulk_checkin(void)
 {
-	state.plugged = 0;
-	if (state.f)
-		finish_bulk_checkin(&state);
+	assert(bulk_checkin_plugged);
+	bulk_checkin_plugged = 0;
+	if (bulk_checkin_state.f)
+		finish_bulk_checkin(&bulk_checkin_state);
 }
-- 
gitgitgadget



* [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 03/13] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
                         ` (13 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Make it clearer in the naming and documentation of the plug_bulk_checkin
and unplug_bulk_checkin APIs that they can be thought of as
a "transaction" to optimize operations on the object database. These
transactions may be nested so that subsystems like the cache-tree
writing code can optimize their operations without caring whether the
top-level code has a transaction active.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/add.c  |  4 ++--
 bulk-checkin.c | 20 ++++++++++++--------
 bulk-checkin.h | 14 ++++++++++++--
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 3ffb86a4338..9bf37ceae8e 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -670,7 +670,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		string_list_clear(&only_match_skip_worktree, 0);
 	}
 
-	plug_bulk_checkin();
+	begin_odb_transaction();
 
 	if (add_renormalize)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
@@ -682,7 +682,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (chmod_arg && pathspec.nr)
 		exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only);
-	unplug_bulk_checkin();
+	end_odb_transaction();
 
 finish:
 	if (write_locked_index(&the_index, &lock_file,
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 577b135e39c..8b0fd5c7723 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,7 +10,7 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static int bulk_checkin_plugged;
+static int odb_transaction_nesting;
 
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
@@ -280,21 +280,25 @@ int index_bulk_checkin(struct object_id *oid,
 {
 	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!bulk_checkin_plugged)
+	if (!odb_transaction_nesting)
 		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
-void plug_bulk_checkin(void)
+void begin_odb_transaction(void)
 {
-	assert(!bulk_checkin_plugged);
-	bulk_checkin_plugged = 1;
+	odb_transaction_nesting += 1;
 }
 
-void unplug_bulk_checkin(void)
+void end_odb_transaction(void)
 {
-	assert(bulk_checkin_plugged);
-	bulk_checkin_plugged = 0;
+	odb_transaction_nesting -= 1;
+	if (odb_transaction_nesting < 0)
+		BUG("Unbalanced ODB transaction nesting");
+
+	if (odb_transaction_nesting)
+		return;
+
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index b26f3dc3b74..69a94422ac7 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -10,7 +10,17 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
 
-void plug_bulk_checkin(void);
-void unplug_bulk_checkin(void);
+/*
+ * Tell the object database to optimize for adding
+ * multiple objects. end_odb_transaction must be called
+ * to make new objects visible.
+ */
+void begin_odb_transaction(void);
+
+/*
+ * Tell the object database to make any objects from the
+ * current transaction visible.
+ */
+void end_odb_transaction(void);
 
 #endif
-- 
gitgitgadget



* [PATCH v4 03/13] object-file: pass filename to fsync_or_die
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
                         ` (12 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

If we die while trying to fsync a loose object file, pass the actual
filename we're trying to sync. This is likely to be more helpful for a
user trying to diagnose the cause of the failure than the former
'loose object file' string. It also sidesteps any concerns about
translating the die message differently for loose objects versus
something else that has a real path.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 object-file.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/object-file.c b/object-file.c
index b254bc50d70..5ffbf3d4fd4 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1888,16 +1888,16 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
 }
 
 /* Finalize a file on disk, and close it. */
-static void close_loose_object(int fd)
+static void close_loose_object(int fd, const char *filename)
 {
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
 	if (fsync_object_files > 0)
-		fsync_or_die(fd, "loose object file");
+		fsync_or_die(fd, filename);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
-				       "loose object file");
+				       filename);
 
 out:
 	if (close(fd) != 0)
@@ -2011,7 +2011,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 		die(_("confused by unstable object source data for %s"),
 		    oid_to_hex(oid));
 
-	close_loose_object(fd);
+	close_loose_object(fd, tmp_file.buf);
 
 	if (mtime) {
 		struct utimbuf utb;
-- 
gitgitgadget



* [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 03/13] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
                         ` (11 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin odb-transaction functionality, takes
advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1a. Create the object in a tmp-objdir.
2a. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1b. Issue an fsync against a dummy file to flush the log and hardware
   writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.
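For illustration, the two phases above can be sketched roughly as follows. This is a hedged sketch, not Git's implementation: it assumes Linux, where `sync_file_range()` is the writeout-only primitive, and the file names (`bulk_fsync_dummy`, etc.) are made up for the example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* 1a/2a: write the object under a temporary name and request
 * pagecache writeback only; no hardware flush is issued yet. */
static void write_tmp_object(const char *tmp, const char *data)
{
	int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0 || write(fd, data, strlen(data)) < 0)
		exit(1);
	if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE) < 0)
		fsync(fd); /* fall back to a full flush on failure */
	close(fd);
}

/* 1b/2b: one full flush against a dummy file acts as the barrier,
 * then the object becomes visible under its final name. */
static void batch_flush_and_rename(const char *tmp, const char *final_name)
{
	int fd = open("bulk_fsync_dummy", O_CREAT | O_RDWR, 0600);
	if (fd < 0 || fsync(fd) < 0)
		exit(1);
	close(fd);
	unlink("bulk_fsync_dummy");
	if (rename(tmp, final_name) < 0)
		exit(1);
}
```

With many objects, the first function runs once per object while the single dummy-file fsync in the second phase runs only once before all of the renames, which is where the savings come from.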

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc.), such as NTFS, HFS+, or XFS,
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns. This sequence also ensures that no object files
appear in the main object store unless they are fsync-durable.

Batch mode is only enabled if core.fsync includes loose-objects. If
the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
not include loose-objects, we will use file-by-file fsyncing.

In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
work if no loose objects are ever added to the ODB. We use a tmp-objdir
to maintain the invariant that no loose-objects are visible in the main
ODB unless they are properly fsync-durable. This is important since
future ODB operations that try to create an object with specific
contents will silently drop the new data if an object with the target
hash already exists, without checking that the existing loose object's
contents actually match the hash. Only a full git-fsck would restore
the ODB to a functional state where data loss doesn't occur.

In step (1b) of the sequence, we issue an fsync against a dummy file
created specifically for the purpose. This method has a slightly higher
cost than using one of the input object files, but makes adding new
callers of this mechanism easier, since we don't need to figure out
which object file is "last" or risk sharing violations by caching the fd
of the last object file.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 Documentation/config/core.txt |  8 ++++
 bulk-checkin.c                | 71 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  3 ++
 cache.h                       |  8 +++-
 config.c                      |  2 +
 object-file.c                 |  7 +++-
 6 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 9da3e5d88f6..3c90ba0b395 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -596,6 +596,14 @@ core.fsyncMethod::
 * `writeout-only` issues pagecache writeback requests, but depending on the
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
+* `batch` enables a mode that uses writeout-only flushes to stage multiple
+  updates in the disk writeback cache and then does a single full fsync of
+  a dummy file to trigger the disk cache flush at the end of the operation.
++
+  Currently `batch` mode only applies to loose-object files. Other repository
+  data is made durable as if `fsync` was specified. This mode is expected to
+  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
+  and on Windows for repos stored on NTFS or ReFS filesystems.
 
 core.fsyncObjectFiles::
 	This boolean will enable 'fsync()' when writing object files.
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8b0fd5c7723..9799d247cad 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,15 +3,20 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
+#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "string-list.h"
+#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int odb_transaction_nesting;
 
+static struct tmp_objdir *bulk_fsync_objdir;
+
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
@@ -80,6 +85,40 @@ clear_exit:
 	reprepare_packed_git(the_repository);
 }
 
+/*
+ * Cleanup after batch-mode fsync_object_files.
+ */
+static void do_batch_fsync(void)
+{
+	struct strbuf temp_path = STRBUF_INIT;
+	struct tempfile *temp;
+
+	if (!bulk_fsync_objdir)
+		return;
+
+	/*
+	 * Issue a full hardware flush against a temporary file to ensure
+	 * that all objects are durable before any renames occur. The code in
+	 * fsync_loose_object_bulk_checkin has already issued a writeout
+	 * request, but it has not flushed any writeback cache in the storage
+	 * hardware or any filesystem logs. This fsync call acts as a barrier
+	 * to ensure that the data in each new object file is durable before
+	 * the final name is visible.
+	 */
+	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
+	temp = xmks_tempfile(temp_path.buf);
+	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
+	delete_tempfile(&temp);
+	strbuf_release(&temp_path);
+
+	/*
+	 * Make the object files visible in the primary ODB after their data is
+	 * fully durable.
+	 */
+	tmp_objdir_migrate(bulk_fsync_objdir);
+	bulk_fsync_objdir = NULL;
+}
+
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -274,6 +313,36 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
+void prepare_loose_object_bulk_checkin(void)
+{
+	/*
+	 * We lazily create the temporary object directory
+	 * the first time an object might be added, since
+	 * callers may not know whether any objects will be
+	 * added at the time they call begin_odb_transaction.
+	 */
+	if (!odb_transaction_nesting || bulk_fsync_objdir)
+		return;
+
+	bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
+	if (bulk_fsync_objdir)
+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
+}
+
+void fsync_loose_object_bulk_checkin(int fd, const char *filename)
+{
+	/*
+	 * If we have an active ODB transaction, we issue a call that
+	 * cleans the filesystem page cache but avoids a hardware flush
+	 * command. Later on we will issue a single hardware flush
+	 * as part of do_batch_fsync.
+	 */
+	if (!bulk_fsync_objdir ||
+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
+		fsync_or_die(fd, filename);
+	}
+}
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -301,4 +370,6 @@ void end_odb_transaction(void)
 
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
+
+	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index 69a94422ac7..70edf745be8 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,6 +6,9 @@
 
 #include "cache.h"
 
+void prepare_loose_object_bulk_checkin(void);
+void fsync_loose_object_bulk_checkin(int fd, const char *filename);
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index ef7d34b7a09..a5bf15a5131 100644
--- a/cache.h
+++ b/cache.h
@@ -1040,7 +1040,8 @@ extern int use_fsync;
 
 enum fsync_method {
 	FSYNC_METHOD_FSYNC,
-	FSYNC_METHOD_WRITEOUT_ONLY
+	FSYNC_METHOD_WRITEOUT_ONLY,
+	FSYNC_METHOD_BATCH,
 };
 
 extern enum fsync_method fsync_method;
@@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *);
 int fsync_component(enum fsync_component component, int fd);
 void fsync_component_or_die(enum fsync_component component, int fd, const char *msg);
 
+static inline int batch_fsync_enabled(enum fsync_component component)
+{
+	return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH);
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count);
 ssize_t write_in_full(int fd, const void *buf, size_t count);
 ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset);
diff --git a/config.c b/config.c
index 3c9b6b589ab..511f4584eeb 100644
--- a/config.c
+++ b/config.c
@@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 			fsync_method = FSYNC_METHOD_FSYNC;
 		else if (!strcmp(value, "writeout-only"))
 			fsync_method = FSYNC_METHOD_WRITEOUT_ONLY;
+		else if (!strcmp(value, "batch"))
+			fsync_method = FSYNC_METHOD_BATCH;
 		else
 			warning(_("ignoring unknown core.fsyncMethod value '%s'"), value);
 
diff --git a/object-file.c b/object-file.c
index 5ffbf3d4fd4..d2e0c13198f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1893,7 +1893,9 @@ static void close_loose_object(int fd, const char *filename)
 	if (the_repository->objects->odb->will_destroy)
 		goto out;
 
-	if (fsync_object_files > 0)
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		fsync_loose_object_bulk_checkin(fd, filename);
+	else if (fsync_object_files > 0)
 		fsync_or_die(fd, filename);
 	else
 		fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd,
@@ -1961,6 +1963,9 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	static struct strbuf filename = STRBUF_INIT;
 
+	if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT))
+		prepare_loose_object_bulk_checkin();
+
 	loose_object_path(the_repository, &filename, oid);
 
 	fd = create_tmpfile(&tmp_file, filename.buf);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
                         ` (10 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Take advantage of the odb transaction infrastructure around writing the
cached tree to the object database.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 cache-tree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 6752f69d515..8c5e8822716 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -3,6 +3,7 @@
 #include "tree.h"
 #include "tree-walk.h"
 #include "cache-tree.h"
+#include "bulk-checkin.h"
 #include "object-store.h"
 #include "replace-object.h"
 #include "promisor-remote.h"
@@ -474,8 +475,10 @@ int cache_tree_update(struct index_state *istate, int flags)
 
 	trace_performance_enter();
 	trace2_region_enter("cache_tree", "update", the_repository);
+	begin_odb_transaction();
 	i = update_one(istate->cache_tree, istate->cache, istate->cache_nr,
 		       "", 0, &skip, flags);
+	end_odb_transaction();
 	trace2_region_leave("cache_tree", "update", the_repository);
 	trace_performance_leave("cache_tree_update");
 	if (i < 0)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 07/13] unpack-objects: " Neeraj Singh via GitGitGadget
                         ` (9 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.

This change enables odb-transactions for the update-index infrastructure
to speed up adding new objects to the object database by leveraging the
batch fsync functionality.

There is some risk with this change, since under batch fsync, the object
files will be in a tmp-objdir until update-index is complete, so callers
using the --stdin option will not see them until update-index is done.
This risk is mitigated by not keeping an ODB transaction open around
--stdin processing if in --verbose mode. Without --verbose mode,
a caller feeding update-index via --stdin wouldn't know when
update-index adds an object, even without an ODB transaction.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/update-index.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index aafe7eeac2a..50f9063e1c6 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -5,6 +5,7 @@
  */
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "lockfile.h"
 #include "quote.h"
@@ -1116,6 +1117,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	 */
 	parse_options_start(&ctx, argc, argv, prefix,
 			    options, PARSE_OPT_STOP_AT_NON_OPTION);
+
+	/*
+	 * Allow the object layer to optimize adding multiple objects in
+	 * a batch.
+	 */
+	begin_odb_transaction();
 	while (ctx.argc) {
 		if (parseopt_state != PARSE_OPT_DONE)
 			parseopt_state = parse_options_step(&ctx, options,
@@ -1167,6 +1174,17 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		the_index.version = preferred_index_format;
 	}
 
+	/*
+	 * It is possible, though unlikely, that a caller could use the verbose
+	 * output to synchronize with addition of objects to the object
+	 * database. The current implementation of ODB transactions leaves
+	 * objects invisible while a transaction is active, so end the
+	 * transaction here if verbose output is enabled.
+	 */
+
+	if (verbose)
+		end_odb_transaction();
+
 	if (read_from_stdin) {
 		struct strbuf buf = STRBUF_INIT;
 		struct strbuf unquoted = STRBUF_INIT;
@@ -1190,6 +1208,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	/*
+	 * By now we have added all of the new objects
+	 */
+	if (!verbose)
+		end_odb_transaction();
+
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 07/13] unpack-objects: use the bulk-checkin infrastructure
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
                         ` (8 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling an odb-transaction when unpacking objects, we can take advantage
of batched fsyncs.

Here are some performance numbers to justify batch mode for
unpack-objects, collected on a WSL2 Ubuntu VM.

Fsync Mode | Time for 90 objects (ms)
-------------------------------------
       Off | 170
  On,fsync | 760
  On,batch | 230

Note that the default unpackLimit is 100 objects, so there's a 3x
benefit in the worst case. The non-batch mode fsync scales linearly
with the number of objects, so there are significant benefits even with
smaller numbers of objects.
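Plugging the measured numbers into a rough cost model shows why even sub-unpackLimit batches benefit. The constants below come straight from the table above (90 objects, times in milliseconds); the model itself is only an illustration, not part of the patch.

```c
/* Rough cost model from the WSL2 measurements above (times in ms).
 * fsync mode adds roughly (760 - 170) / 90 ~= 6.5 ms per object;
 * batch mode pays a near-constant ~60 ms on top of the baseline. */
static double est_fsync_ms(int nr_objects)
{
	return 170.0 + nr_objects * (760.0 - 170.0) / 90.0;
}

static double est_batch_ms(int nr_objects)
{
	(void)nr_objects; /* per-object writeout cost is negligible here */
	return 230.0;
}
```

Under this model, the crossover in favor of batch mode happens after roughly a dozen objects, and the advantage grows linearly from there.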

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dbeb0680a58..56d05e2725d 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	begin_odb_transaction();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	end_odb_transaction();
 	stop_progress(&progress);
 
 	if (delta_list)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 07/13] unpack-objects: " Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
                         ` (7 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch mode
in upstream Git so that we can get broad coverage of the new code
upstream.

We don't actually do fsyncs in most of the test suite, since
GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
surrounding batch mode code since GIT_TEST_FSYNC merely makes the
maybe_fsync wrapper always appear to succeed.
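A hedged sketch of how such a wrapper can pretend to succeed under the test flag (the names below are illustrative and not Git's actual wrapper code):

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sketch only: when the test harness sets GIT_TEST_FSYNC=0, report
 * success without touching the disk, so all of the surrounding batch
 * mode code paths still execute in the test suite. */
static int maybe_fsync(int fd)
{
	const char *v = getenv("GIT_TEST_FSYNC");
	if (v && !strcmp(v, "0"))
		return 0; /* pretend the fsync succeeded */
	return fsync(fd);
}
```

Because only the final syscall is skipped, the batching, tmp-objdir, and migration logic all run normally under the tests.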

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 cache.h           | 4 ++++
 compat/mingw.h    | 3 +++
 config.c          | 2 +-
 git-compat-util.h | 2 ++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index a5bf15a5131..7f6cbb254b4 100644
--- a/cache.h
+++ b/cache.h
@@ -1031,6 +1031,10 @@ enum fsync_component {
 			      FSYNC_COMPONENT_INDEX | \
 			      FSYNC_COMPONENT_REFERENCE)
 
+#ifndef FSYNC_COMPONENTS_PLATFORM_DEFAULT
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT FSYNC_COMPONENTS_DEFAULT
+#endif
+
 /*
  * A bitmask indicating which components of the repo should be fsynced.
  */
diff --git a/compat/mingw.h b/compat/mingw.h
index 6074a3d3ced..afe30868c04 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -332,6 +332,9 @@ int mingw_getpagesize(void);
 int win32_fsync_no_flush(int fd);
 #define fsync_no_flush win32_fsync_no_flush
 
+#define FSYNC_COMPONENTS_PLATFORM_DEFAULT (FSYNC_COMPONENTS_DEFAULT | FSYNC_COMPONENT_LOOSE_OBJECT)
+#define FSYNC_METHOD_DEFAULT (FSYNC_METHOD_BATCH)
+
 struct rlimit {
 	unsigned int rlim_cur;
 };
diff --git a/config.c b/config.c
index 511f4584eeb..e9cac5f4707 100644
--- a/config.c
+++ b/config.c
@@ -1342,7 +1342,7 @@ static const struct fsync_component_name {
 
 static enum fsync_component parse_fsync_components(const char *var, const char *string)
 {
-	enum fsync_component current = FSYNC_COMPONENTS_DEFAULT;
+	enum fsync_component current = FSYNC_COMPONENTS_PLATFORM_DEFAULT;
 	enum fsync_component positive = 0, negative = 0;
 
 	while (string) {
diff --git a/git-compat-util.h b/git-compat-util.h
index 0892e209a2f..fffe42ce7c1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1257,11 +1257,13 @@ __attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...);
 #define BUG(...) BUG_fl(__FILE__, __LINE__, __VA_ARGS__)
 
+#ifndef FSYNC_METHOD_DEFAULT
 #ifdef __APPLE__
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_WRITEOUT_ONLY
 #else
 #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_FSYNC
 #endif
+#endif
 
 enum fsync_action {
 	FSYNC_WRITEOUT_ONLY,
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (7 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 10/13] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
                         ` (6 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Several tests use awk to parse OIDs from the output of 'git ls-files
--stage' and 'git ls-tree'. Introduce helpers to centralize these uses
of awk.

Update t5317-pack-objects-filter-objects.sh to use the new ls-files
helper so that it has some usages to review. Other updates are left for
the future.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/t5317-pack-objects-filter-objects.sh | 91 +++++++++++++-------------
 t/test-lib-functions.sh                | 10 +++
 2 files changed, 54 insertions(+), 47 deletions(-)

diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index 33b740ce628..bb633c9b099 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -10,9 +10,6 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 # Test blob:none filter.
 
 test_expect_success 'setup r1' '
-	echo "{print \$1}" >print_1.awk &&
-	echo "{print \$2}" >print_2.awk &&
-
 	git init r1 &&
 	for n in 1 2 3 4 5
 	do
@@ -22,10 +19,13 @@ test_expect_success 'setup r1' '
 	done
 '
 
+parse_verify_pack_blob_oid () {
+	awk '{print $1}' -
+}
+
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r1 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -35,7 +35,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -54,12 +54,12 @@ test_expect_success 'verify blob:none packfile has no blobs' '
 test_expect_success 'verify normal and blob:none packfiles have same commits/trees' '
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >expected &&
 
 	git -C r1 verify-pack -v ../filter.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -123,8 +123,8 @@ test_expect_success 'setup r2' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -134,7 +134,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r2 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -161,8 +161,8 @@ test_expect_success 'verify blob:limit=1000' '
 '
 
 test_expect_success 'verify blob:limit=1001' '
-	git -C r2 ls-files -s large.1000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1001 >filter.pack <<-EOF &&
@@ -172,15 +172,15 @@ test_expect_success 'verify blob:limit=1001' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=10001' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=10001 >filter.pack <<-EOF &&
@@ -190,15 +190,15 @@ test_expect_success 'verify blob:limit=10001' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=1k' '
-	git -C r2 ls-files -s large.1000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1k >filter.pack <<-EOF &&
@@ -208,15 +208,15 @@ test_expect_success 'verify blob:limit=1k' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify explicitly specifying oversized blob in input' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	echo HEAD >objects &&
@@ -226,15 +226,15 @@ test_expect_success 'verify explicitly specifying oversized blob in input' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify blob:limit=1m' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1m >filter.pack <<-EOF &&
@@ -244,7 +244,7 @@ test_expect_success 'verify blob:limit=1m' '
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -253,12 +253,12 @@ test_expect_success 'verify blob:limit=1m' '
 test_expect_success 'verify normal and blob:limit packfiles have same commits/trees' '
 	git -C r2 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >expected &&
 
 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -289,9 +289,8 @@ test_expect_success 'setup r3' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r3 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -301,7 +300,7 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r3 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -342,9 +341,8 @@ test_expect_success 'setup r4' '
 '
 
 test_expect_success 'verify blob count in normal packfile' '
-	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -354,19 +352,19 @@ test_expect_success 'verify blob count in normal packfile' '
 
 	git -C r4 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify sparse:oid=OID' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 ls-files -s pattern >staged &&
-	oid=$(awk -f print_2.awk staged) &&
+	oid=$(test_parse_ls_files_stage_oids <staged) &&
 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=$oid >filter.pack <<-EOF &&
 	HEAD
 	EOF
@@ -374,15 +372,15 @@ test_expect_success 'verify sparse:oid=OID' '
 
 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
 '
 
 test_expect_success 'verify sparse:oid=oid-ish' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=main:pattern >filter.pack <<-EOF &&
@@ -392,7 +390,7 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 
 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&
 
 	test_cmp expected observed
@@ -402,9 +400,8 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 # This models previously omitted objects that we did not receive.
 
 test_expect_success 'setup r1 - delete loose blobs' '
-	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&
 
 	for id in `cat expected | sed "s|..|&/|"`
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index a027f0c409e..e6011409e2f 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1782,6 +1782,16 @@ test_oid_to_path () {
 	echo "${1%$basename}/$basename"
 }
 
+# Parse oids from git ls-files --staged output
+test_parse_ls_files_stage_oids () {
+	awk '{print $2}' -
+}
+
+# Parse oids from git ls-tree output
+test_parse_ls_tree_oids () {
+	awk '{print $3}' -
+}
+
 # Choose a port number based on the test script's number and store it in
 # the given variable name, unless that variable already contains a number.
 test_set_port () {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH v4 10/13] core.fsyncmethod: tests for batch mode
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (8 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29  0:42       ` [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
                         ` (5 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added
due to it already being in the repo.
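
As a stand-alone illustration (the names below are made up for this sketch
and are not part of the library), the uniqueness scheme boils down to
embedding a per-run token and a running counter in every file's contents:

```shell
# Sketch of the uniqueness scheme (illustrative names, not the real helper):
# every file's contents contain a per-run token plus a running counter, so
# no two files, even across invocations, can share a loose-object oid.
tmpdir=$(mktemp -d) && cd "$tmpdir"
basedata="base$$"	# stand-in for $basedir$test_tick in lib-unique-files.sh
counter=0
for i in 1 2
do
	mkdir -p "dir$i"
	for j in 1 2 3
	do
		counter=$((counter + 1))
		echo "$basedata.$counter" >"dir$i/file$j.txt"
	done
done
# 6 files, 6 distinct contents: each would become a distinct object
sort -u dir*/file*.txt | wc -l
```

Because every blob is new, 'git add' must actually write each object, so a
test cannot pass by accident merely because an identical blob already
existed in the repo.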

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/lib-unique-files.sh  | 34 ++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
 t/t3903-stash.sh       | 20 ++++++++++++++++++++
 t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
 4 files changed, 109 insertions(+), 14 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..34c01a65256
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,34 @@
+# Helper to create files with unique contents
+
+# Create multiple files with unique contents within this test run. Takes the
+# number of directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with contents
+#					 different from previous invocations
+#					 of this command in this run.
+
+test_create_unique_files () {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs="$1" &&
+	local files="$2" &&
+	local basedir="$3" &&
+	local counter=0 &&
+	local i &&
+	local j &&
+	test_tick &&
+	local basedata=$basedir$test_tick &&
+	rm -rf "$basedir" &&
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i &&
+		mkdir -p "$dir" &&
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1)) &&
+			echo "$basedata.$counter">"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..8979c8a5f03 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,32 @@ test_expect_success \
     'Test that "git add -- -q" works' \
     'touch -- -q && git add -- -q'
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir1 &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION add -- ./files_base_dir1/ &&
+	git ls-files --stage files_base_dir1/ |
+	test_parse_ls_files_stage_oids >added_files_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files_oids >added_files_actual &&
+	test_cmp added_files_oids added_files_actual
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir2 &&
+	find files_base_dir2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	git ls-files --stage files_base_dir2 |
+	test_parse_ls_files_stage_oids >added_files2_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files2_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files2_oids >added_files2_actual &&
+	test_cmp added_files2_oids added_files2_actual
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 4abbc8fccae..20e94881964 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh
 
 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1410,6 +1411,25 @@ test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '
 
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION stash push -u -- ./files_base_dir/ &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./files_base_dir/ |
+	test_parse_ls_tree_oids >stashed_files_oids &&
+
+	# We created 2 dirs with 4 files each (8 files total) above
+	test_line_count = 8 stashed_files_oids &&
+	git cat-file --batch-check='%(objectname)' <stashed_files_oids >stashed_files_actual &&
+	test_cmp stashed_files_oids stashed_files_actual
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..f8a0f309e2d 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -161,22 +161,27 @@ test_expect_success 'pack-objects with bogus arguments' '
 '
 
 check_unpack () {
+	local packname="$1" &&
+	local object_list="$2" &&
+	local git_config="$3" &&
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $git_config init --bare git2 &&
+	(
+		git $git_config -C git2 unpack-objects -n <"$packname".pack &&
+		git $git_config -C git2 unpack-objects <"$packname".pack &&
+		git $git_config -C git2 cat-file --batch-check="%(objectname)"
+	) <"$object_list" >current &&
+	cmp "$object_list" current
 }
 
 test_expect_success 'unpack without delta' '
-	check_unpack test-1-${packname_1}
+	check_unpack test-1-${packname_1} obj-list
+'
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with REF_DELTA' '
@@ -185,7 +190,11 @@ test_expect_success 'pack with REF_DELTA' '
 '
 
 test_expect_success 'unpack with REF_DELTA' '
-	check_unpack test-2-${packname_2}
+	check_unpack test-2-${packname_2} obj-list
+'
+
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-2-${packname_2} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with OFS_DELTA' '
@@ -195,7 +204,11 @@ test_expect_success 'pack with OFS_DELTA' '
 '
 
 test_expect_success 'unpack with OFS_DELTA' '
-	check_unpack test-3-${packname_3}
+	check_unpack test-3-${packname_3} obj-list
+'
+
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-3-${packname_3} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'compare delta flavors' '
-- 
gitgitgadget



* [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (9 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 10/13] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29 17:14         ` Neeraj Singh
  2022-03-29  0:42       ` [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
                         ` (4 subsequent siblings)
  15 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Tests that affect the repo in stateful ways are easier to write if we
can run setup steps outside of the measured portion of perf iteration.

This change adds a "--setup 'setup-script'" parameter to test_perf. To
make invocations easier to understand, I also moved the prerequisites to
a new --prereq parameter.

The setup facility will be used in the upcoming perf tests for batch
mode, but it already helps in some existing tests, like t5302 and t7820.
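
The resulting calling convention can be sketched with a stand-alone parser
(parse_demo is a hypothetical name for this sketch; the real logic lives in
test_wrapper_ in perf-lib.sh):

```shell
# Hypothetical sketch of the option parsing test_wrapper_ adopts: the title
# comes first, --prereq/--setup may follow in any order, and the first
# non-option argument is treated as the test body.
parse_demo () {
	title=$1
	shift
	prereq= setup=
	while test $# != 0
	do
		case $1 in
		--prereq)
			prereq=$2
			shift
			;;
		--setup)
			setup=$2
			shift
			;;
		*)
			break
			;;
		esac
		shift
	done
	echo "title=$title prereq=$prereq setup=$setup body=$1"
}

parse_demo 'index-pack 0 threads' --prereq PERF_EXTRA \
	--setup 'rm -rf repo.git && git init --bare repo.git' \
	'GIT_DIR=repo.git git index-pack --stdin <$PACK'
```

Keeping the options between the title and the body makes long invocations
readable and lets the setup script run before each measured iteration
without being timed.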

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p4220-log-grep-engines.sh       |  3 +-
 t/perf/p4221-log-grep-engines-fixed.sh |  3 +-
 t/perf/p5302-pack-index.sh             | 15 +++----
 t/perf/p7519-fsmonitor.sh              | 18 ++------
 t/perf/p7820-grep-engines.sh           |  6 ++-
 t/perf/perf-lib.sh                     | 62 +++++++++++++++++++++++---
 6 files changed, 73 insertions(+), 34 deletions(-)

diff --git a/t/perf/p4220-log-grep-engines.sh b/t/perf/p4220-log-grep-engines.sh
index 2bc47ded4d1..03fbfbb85d3 100755
--- a/t/perf/p4220-log-grep-engines.sh
+++ b/t/perf/p4220-log-grep-engines.sh
@@ -36,7 +36,8 @@ do
 		else
 			prereq=""
 		fi
-		test_perf $prereq "$engine log$GIT_PERF_4220_LOG_OPTS --grep='$pattern'" "
+		test_perf "$engine log$GIT_PERF_4220_LOG_OPTS --grep='$pattern'" \
+			--prereq "$prereq" "
 			git -c grep.patternType=$engine log --pretty=format:%h$GIT_PERF_4220_LOG_OPTS --grep='$pattern' >'out.$engine' || :
 		"
 	done
diff --git a/t/perf/p4221-log-grep-engines-fixed.sh b/t/perf/p4221-log-grep-engines-fixed.sh
index 060971265a9..0a6d6dfc219 100755
--- a/t/perf/p4221-log-grep-engines-fixed.sh
+++ b/t/perf/p4221-log-grep-engines-fixed.sh
@@ -26,7 +26,8 @@ do
 		else
 			prereq=""
 		fi
-		test_perf $prereq "$engine log$GIT_PERF_4221_LOG_OPTS --grep='$pattern'" "
+		test_perf "$engine log$GIT_PERF_4221_LOG_OPTS --grep='$pattern'" \
+			--prereq "$prereq" "
 			git -c grep.patternType=$engine log --pretty=format:%h$GIT_PERF_4221_LOG_OPTS --grep='$pattern' >'out.$engine' || :
 		"
 	done
diff --git a/t/perf/p5302-pack-index.sh b/t/perf/p5302-pack-index.sh
index c16f6a3ff69..14c601bbf86 100755
--- a/t/perf/p5302-pack-index.sh
+++ b/t/perf/p5302-pack-index.sh
@@ -26,9 +26,8 @@ test_expect_success 'set up thread-counting tests' '
 	done
 '
 
-test_perf PERF_EXTRA 'index-pack 0 threads' '
-	rm -rf repo.git &&
-	git init --bare repo.git &&
+test_perf 'index-pack 0 threads' --prereq PERF_EXTRA \
+	--setup 'rm -rf repo.git && git init --bare repo.git' '
 	GIT_DIR=repo.git git index-pack --threads=1 --stdin < $PACK
 '
 
@@ -36,17 +35,15 @@ for t in $threads
 do
 	THREADS=$t
 	export THREADS
-	test_perf PERF_EXTRA "index-pack $t threads" '
-		rm -rf repo.git &&
-		git init --bare repo.git &&
+	test_perf "index-pack $t threads" --prereq PERF_EXTRA \
+		--setup 'rm -rf repo.git && git init --bare repo.git' '
 		GIT_DIR=repo.git GIT_FORCE_THREADS=1 \
 		git index-pack --threads=$THREADS --stdin <$PACK
 	'
 done
 
-test_perf 'index-pack default number of threads' '
-	rm -rf repo.git &&
-	git init --bare repo.git &&
+test_perf 'index-pack default number of threads' \
+	--setup 'rm -rf repo.git && git init --bare repo.git' '
 	GIT_DIR=repo.git git index-pack --stdin < $PACK
 '
 
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index c8be58f3c76..5b489c968b8 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -60,18 +60,6 @@ then
 	esac
 fi
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"
-then
-	# When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to
-	# generate valid results. Otherwise the caching that happens for the nth
-	# run will negate the validity of the comparisons.
-	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
-	then
-		echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
-		GIT_PERF_REPEAT_COUNT=1
-	fi
-fi
-
 trace_start() {
 	if test -n "$GIT_PERF_7519_TRACE"
 	then
@@ -167,10 +155,10 @@ setup_for_fsmonitor() {
 
 test_perf_w_drop_caches () {
 	if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-		test-tool drop-caches
+		test_perf "$1" --setup "test-tool drop-caches" "$2"
+	else
+		test_perf "$@"
 	fi
-
-	test_perf "$@"
 }
 
 test_fsmonitor_suite() {
diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
index 8b09c5bf328..9bfb86842a9 100755
--- a/t/perf/p7820-grep-engines.sh
+++ b/t/perf/p7820-grep-engines.sh
@@ -49,13 +49,15 @@ do
 		fi
 		if ! test_have_prereq PERF_GREP_ENGINES_THREADS
 		then
-			test_perf $prereq "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'" "
+			test_perf "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'" \
+				--prereq "$prereq" "
 				git -c grep.patternType=$engine grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine' || :
 			"
 		else
 			for threads in $GIT_PERF_GREP_THREADS
 			do
-				test_perf PTHREADS,$prereq "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern' with $threads threads" "
+				test_perf "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern' with $threads threads" \
+					--prereq PTHREADS,$prereq "
 					git -c grep.patternType=$engine -c grep.threads=$threads grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine.$threads' || :
 				"
 			done
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 407252bac70..a935ad622d3 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -189,19 +189,38 @@ exit $ret' >&3 2>&4
 }
 
 test_wrapper_ () {
-	test_wrapper_func_=$1; shift
+	local test_wrapper_func_=$1; shift
+	local test_title_=$1; shift
 	test_start_
-	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
-	test "$#" = 2 ||
-	BUG "not 2 or 3 parameters to test-expect-success"
+	test_prereq=
+	test_perf_setup_=
+	while test $# != 0
+	do
+		case $1 in
+		--prereq)
+			test_prereq=$2
+			shift
+			;;
+		--setup)
+			test_perf_setup_=$2
+			shift
+			;;
+		*)
+			break
+			;;
+		esac
+		shift
+	done
+	test "$#" = 1 || BUG "test_wrapper_ needs 2 positional parameters"
 	export test_prereq
-	if ! test_skip "$@"
+	export test_perf_setup_
+	if ! test_skip "$test_title_" "$@"
 	then
 		base=$(basename "$0" .sh)
 		echo "$test_count" >>"$perf_results_dir"/$base.subtests
 		echo "$1" >"$perf_results_dir"/$base.$test_count.descr
 		base="$perf_results_dir"/"$PERF_RESULTS_PREFIX$(basename "$0" .sh)"."$test_count"
-		"$test_wrapper_func_" "$@"
+		"$test_wrapper_func_" "$test_title_" "$@"
 	fi
 
 	test_finish_
@@ -214,6 +233,16 @@ test_perf_ () {
 		echo "perf $test_count - $1:"
 	fi
 	for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
+		if test -n "$test_perf_setup_"
+		then
+			say >&3 "setup: $test_perf_setup_"
+			if ! test_eval_ $test_perf_setup_
+			then
+				test_failure_ "$test_perf_setup_"
+				break
+			fi
+
+		fi
 		say >&3 "running: $2"
 		if test_run_perf_ "$2"
 		then
@@ -237,11 +266,24 @@ test_perf_ () {
 	rm test_time.*
 }
 
+# Usage: test_perf 'title' [options] 'perf-test'
+#	Run the performance test script specified in perf-test with
+#	optional prerequisite and setup steps.
+# Options:
+#	--prereq prerequisites: Skip the test if prerequisites aren't met
+#	--setup "setup-steps": Run setup steps prior to each measured iteration
+#
 test_perf () {
 	test_wrapper_ test_perf_ "$@"
 }
 
 test_size_ () {
+	if test -n "$test_perf_setup_"
+	then
+		say >&3 "setup: $test_perf_setup_"
+		test_eval_ $test_perf_setup_
+	fi
+
 	say >&3 "running: $2"
 	if test_eval_ "$2" 3>"$base".result; then
 		test_ok_ "$1"
@@ -250,6 +292,14 @@ test_size_ () {
 	fi
 }
 
+# Usage: test_size 'title' [options] 'size-test'
+#	Run the size test script specified in size-test with optional
+#	prerequisites and setup steps. Returns the numeric value
+#	returned by size-test.
+# Options:
+#	--prereq prerequisites: Skip the test if prerequisites aren't met
+#	--setup "setup-steps": Run setup steps prior to the size measurement
+
 test_size () {
 	test_wrapper_ test_size_ "$@"
 }
-- 
gitgitgadget



* [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (10 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29 17:38         ` Neeraj Singh
  2022-03-29  0:42       ` [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
                         ` (3 subsequent siblings)
  15 siblings, 1 reply; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

Add basic performance tests for "git add" and "git stash" of a lot of
new objects with various fsync settings. This shows the benefit of batch
mode relative to full fsync.
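
One shell detail worth noting in the test below: $FSYNC_CONFIG holds
several words and is deliberately expanded unquoted in
"git $FSYNC_CONFIG add ...", relying on word splitting to produce separate
-c arguments. A minimal sketch (reusing the variable name from the test
purely for illustration):

```shell
# $FSYNC_CONFIG is one string containing four words; unquoted expansion
# splits it into four separate arguments for git: -c <key=val> -c <key=val>.
FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
set -- $FSYNC_CONFIG	# intentional word splitting, as in "git $FSYNC_CONFIG add"
echo "argc=$#"		# argc=4
echo "arg2=$2"		# arg2=core.fsync=loose-object
```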

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p3700-add.sh | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100755 t/perf/p3700-add.sh

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
new file mode 100755
index 00000000000..ef6024f9897
--- /dev/null
+++ b/t/perf/p3700-add.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of adding things to the object database"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_fresh_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+for mode in false true batch
+do
+	case $mode in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	test_perf "add $total_files files (object_fsyncing=$mode)" \
+		--setup "
+		(rm -rf .git || 1) &&
+		git init &&
+		test_create_unique_files $dir_count $files_per_dir files_$mode
+	" "
+		git $FSYNC_CONFIG add files_$mode
+	"
+
+	test_perf "stash $total_files files (object_fsyncing=$mode)" \
+		--setup "
+		(rm -rf .git || 1) &&
+		git init &&
+		test_commit first &&
+		test_create_unique_files $dir_count $files_per_dir stash_files_$mode
+	" "
+		git $FSYNC_CONFIG stash push -u -- stash_files_$mode
+	"
+done
+
+test_done
-- 
gitgitgadget



* [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (11 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
@ 2022-03-29  0:42       ` Neeraj Singh via GitGitGadget
  2022-03-29 10:47       ` [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  15 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh via GitGitGadget @ 2022-03-29  0:42 UTC (permalink / raw)
  To: git
  Cc: Johannes.Schindelin, avarab, nksingh85, ps, jeffhost,
	Bagas Sanjaya, worldhello.net, Neeraj K. Singh, Neeraj Singh

From: Neeraj Singh <neerajsi@microsoft.com>

The deprecation warning for core.fsyncObjectFiles did not spell the
configuration variable in camelCase.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index e9cac5f4707..ae819dee20b 100644
--- a/config.c
+++ b/config.c
@@ -1697,7 +1697,7 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "core.fsyncobjectfiles")) {
 		if (fsync_object_files < 0)
-			warning(_("core.fsyncobjectfiles is deprecated; use core.fsync instead"));
+			warning(_("core.fsyncObjectFiles is deprecated; use core.fsync instead"));
 		fsync_object_files = git_config_bool(var, value);
 		return 0;
 	}
-- 
gitgitgadget


* Re: [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (12 preceding siblings ...)
  2022-03-29  0:42       ` [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
@ 2022-03-29 10:47       ` Ævar Arnfjörð Bjarmason
  2022-03-29 17:09         ` Neeraj Singh
  2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
  2022-03-30  5:05       ` [PATCH v5 00/14] " Neeraj K. Singh via GitGitGadget
  15 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-29 10:47 UTC (permalink / raw)
  To: Neeraj K. Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, jeffhost, Bagas Sanjaya,
	worldhello.net, Neeraj K. Singh


On Tue, Mar 29 2022, Neeraj K. Singh via GitGitGadget wrote:

> V4 changes:
>
>  * Make ODB transactions nestable.
>  * Add an ODB transaction around writing out the cached tree.
>  * Change update-index to use a more straightforward way of managing ODB
>    transactions.
>  * Fix missing 'local's in lib-unique-files
>  * Add a per-iteration setup mechanism to test_perf.
>  * Fix camelCasing in warning message.

I haven't looked at the bulk of this in any detail, but:

>  10:  b99b32a469c ! 12:  fdf90d45f52 core.fsyncmethod: performance tests for add and stash
>      @@ t/perf/p3700-add.sh (new)
>       +# core.fsyncMethod=batch mode, which is why we are testing different values
>       +# of that setting explicitly and creating a lot of unique objects.
>       +
>      -+test_description="Tests performance of add"
>      ++test_description="Tests performance of adding things to the object database"

Now, having both the "add" and "stash" tests in a script named
p3700-add.sh isn't better; the rest of the perf tests are split up by
command. Perhaps just add a helper library and have both scripts use it?

And re the unaddressed feedback I had of "why the random data"
in https://lore.kernel.org/git/220326.86o81sk9ao.gmgdl@evledraar.gmail.com/
I tried patching it on top to do what I suggested there, allowing us to
run these against any arbitrary repository and came up with this:

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
index ef6024f9897..60abd5ee076 100755
--- a/t/perf/p3700-add.sh
+++ b/t/perf/p3700-add.sh
@@ -13,47 +13,26 @@ export GIT_TEST_FSYNC
 
 . ./perf-lib.sh
 
-. $TEST_DIRECTORY/lib-unique-files.sh
-
-test_perf_fresh_repo
+test_perf_default_repo
 test_checkout_worktree
 
-dir_count=10
-files_per_dir=50
-total_files=$((dir_count * files_per_dir))
-
-for mode in false true batch
+for cfg in \
+	'-c core.fsync=-loose-object -c core.fsyncmethod=fsync' \
+	'-c core.fsync=loose-object -c core.fsyncmethod=fsync' \
+	'-c core.fsync=loose-object -c core.fsyncmethod=batch' \
+	'-c core.fsyncmethod=batch'
 do
-	case $mode in
-	false)
-		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
-		;;
-	true)
-		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
-		;;
-	batch)
-		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
-		;;
-	esac
-
-	test_perf "add $total_files files (object_fsyncing=$mode)" \
-		--setup "
-		(rm -rf .git || 1) &&
-		git init &&
-		test_create_unique_files $dir_count $files_per_dir files_$mode
-	" "
-		git $FSYNC_CONFIG add files_$mode
-	"
-
-	test_perf "stash $total_files files (object_fsyncing=$mode)" \
-		--setup "
-		(rm -rf .git || 1) &&
-		git init &&
-		test_commit first &&
-		test_create_unique_files $dir_count $files_per_dir stash_files_$mode
-	" "
-		git $FSYNC_CONFIG stash push -u -- stash_files_$mode
-	"
+	test_perf "'git add' with '$cfg'" \
+		--setup '
+			mv -v .git .git.old &&
+			git init .
+		' \
+		--cleanup '
+			rm -rf .git &&
+			mv .git.old .git
+		' '
+		git $cfg add -f -- ":!.git.old/"
+	'
 done
 
 test_done
diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh
new file mode 100755
index 00000000000..12c489069ba
--- /dev/null
+++ b/t/perf/p3900-stash.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='performance of "git stash" with different fsync settings'
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+test_checkout_worktree
+
+for cfg in \
+	'-c core.fsync=-loose-object -c core.fsyncmethod=fsync' \
+	'-c core.fsync=loose-object -c core.fsyncmethod=fsync' \
+	'-c core.fsync=loose-object -c core.fsyncmethod=batch' \
+	'-c core.fsyncmethod=batch'
+do
+	test_perf "'stash push -u' with '$cfg'" \
+		--setup '
+			mv -v .git .git.old &&
+			git init . &&
+			test_commit dummy
+		' \
+		--cleanup '
+			rm -rf .git &&
+			mv .git.old .git
+		' '
+		git $cfg stash push -a -u ":!.git.old/" ":!test*" "."
+	'
+done
+
+test_done
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index a935ad622d3..24a5108f234 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -194,6 +194,7 @@ test_wrapper_ () {
 	test_start_
 	test_prereq=
 	test_perf_setup_=
+	test_perf_cleanup_=
 	while test $# != 0
 	do
 		case $1 in
@@ -205,6 +206,10 @@ test_wrapper_ () {
 			test_perf_setup_=$2
 			shift
 			;;
+		--cleanup)
+			test_perf_cleanup_=$2
+			shift
+			;;
 		*)
 			break
 			;;
@@ -214,6 +219,7 @@ test_wrapper_ () {
 	test "$#" = 1 || BUG "test_wrapper_ needs 2 positional parameters"
 	export test_prereq
 	export test_perf_setup_
+	export test_perf_cleanup_
 	if ! test_skip "$test_title_" "$@"
 	then
 		base=$(basename "$0" .sh)
@@ -256,6 +262,16 @@ test_perf_ () {
 			test_failure_ "$@"
 			break
 		fi
+		if test -n "$test_perf_cleanup_"
+		then
+			say >&3 "cleanup: $test_perf_cleanup_"
+			if ! test_eval_ $test_perf_cleanup_
+			then
+				test_failure_ "$test_perf_cleanup_"
+				break
+			fi
+
+		fi
 	done
 	if test -z "$verbose"; then
 		echo " ok"


Here it is against Cor.git (a random small-ish repo I had lying around):
	
	$ GIT_SKIP_TESTS='p3[79]00.[12]' GIT_PERF_MAKE_OPTS='CFLAGS=-O3' GIT_PERF_REPO=~/g/Cor/ ./run origin/master HEAD -- p3900-stash.sh
	=== Building abf474a5dd901f28013c52155411a48fd4c09922 (origin/master) ===
	    GEN git-add--interactive
	    GEN git-archimport
	    GEN git-cvsexportcommit
	    GEN git-cvsimport
	    GEN git-cvsserver
	    GEN git-send-email
	    GEN git-svn
	    GEN git-p4
	    SUBDIR templates
	=== Running 1 tests in /home/avar/g/git/t/perf/build/abf474a5dd901f28013c52155411a48fd4c09922/bin-wrappers ===
	ok 1 # skip 'stash push -u' with '-c core.fsync=-loose-object -c core.fsyncmethod=fsync' (GIT_SKIP_TESTS)
	ok 2 # skip 'stash push -u' with '-c core.fsync=loose-object -c core.fsyncmethod=fsync' (GIT_SKIP_TESTS)
	perf 3 - 'stash push -u' with '-c core.fsync=loose-object -c core.fsyncmethod=batch': 1 2 3 ok
	perf 4 - 'stash push -u' with '-c core.fsyncmethod=batch': 1 2 3 ok
	# passed all 4 test(s)
	1..4
	=== Building ecda9c2b029e35d239e369b875b245f45fd2a097 (HEAD) ===
	    GEN git-add--interactive
	    GEN git-archimport
	    GEN git-cvsexportcommit
	    GEN git-cvsimport
	    GEN git-cvsserver
	    GEN git-send-email
	    GEN git-svn
	    GEN git-p4
	    SUBDIR templates
	=== Running 1 tests in /home/avar/g/git/t/perf/build/ecda9c2b029e35d239e369b875b245f45fd2a097/bin-wrappers ===
	ok 1 # skip 'stash push -u' with '-c core.fsync=-loose-object -c core.fsyncmethod=fsync' (GIT_SKIP_TESTS)
	ok 2 # skip 'stash push -u' with '-c core.fsync=loose-object -c core.fsyncmethod=fsync' (GIT_SKIP_TESTS)
	perf 3 - 'stash push -u' with '-c core.fsync=loose-object -c core.fsyncmethod=batch': 1 2 3 ok
	perf 4 - 'stash push -u' with '-c core.fsyncmethod=batch': 1 2 3 ok
	# passed all 4 test(s)
	1..4
	Test       origin/master     HEAD
	---------------------------------------------------
	3900.3:    0.03(0.00+0.00)   0.02(0.00+0.00) -33.3%
	3900.4:    0.02(0.00+0.00)   0.03(0.00+0.00) +50.0%
	


* Re: [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
                         ` (13 preceding siblings ...)
  2022-03-29 10:47       ` [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Ævar Arnfjörð Bjarmason
@ 2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
  2022-03-29 16:51         ` Neeraj Singh
  2022-03-30  5:05       ` [PATCH v5 00/14] " Neeraj K. Singh via GitGitGadget
  15 siblings, 1 reply; 175+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-29 11:45 UTC (permalink / raw)
  To: Neeraj K. Singh via GitGitGadget
  Cc: git, Johannes.Schindelin, nksingh85, ps, jeffhost, Bagas Sanjaya,
	worldhello.net, Neeraj K. Singh


On Tue, Mar 29 2022, Neeraj K. Singh via GitGitGadget wrote:

> V4 changes:
>
>  * Make ODB transactions nestable.
>  * Add an ODB transaction around writing out the cached tree.
>  * Change update-index to use a more straightforward way of managing ODB
>    transactions.
>  * Fix missing 'local's in lib-unique-files
>  * Add a per-iteration setup mechanism to test_perf.
>  * Fix camelCasing in warning message.

Despite my
https://lore.kernel.org/git/220329.86czi52ekn.gmgdl@evledraar.gmail.com/
I eventually gave up on trying to extract meaningful numbers from
t/perf; I can never quite tell whether the differences come from its
shellscript shenanigans or from the actual code.

(And also; I realize I didn't follow-up on
https://lore.kernel.org/git/CANQDOdcFN5GgOPZ3hqCsjHDTiRfRpqoAKxjF1n9D6S8oD9--_A@mail.gmail.com/,
sorry):

But I came up with this (uses my thin
https://gitlab.com/avar/git-hyperfine/ wrapper, and you should be able
to apt get hyperfine):
	
	#!/bin/sh
	set -xe
	
	if ! test -d /tmp/scalar.git
	then
		git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git
		mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack
	fi
	git hyperfine \
	        --warmup 1 -r 3 \
		-L rev neeraj-v4,avar-RFC \
		-s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' \
		-p 'rm -rf repo/.git/objects/* repo/.git/index' \
		$@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt'
	
	git hyperfine \
	        --warmup 1 -r 3 \
		-L rev neeraj-v4,avar-RFC \
		-s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/' \
		-p 'rm -rf repo/.git/objects/* repo/.git/index' \
		$@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .'
	
	git hyperfine \
	        --warmup 1 -r 3 \
		-L rev neeraj-v4,avar-RFC \
	        -s 'make CFLAGS=-O3' \
	        -p 'git init --bare dest.git' \
	        -c 'rm -rf dest.git' \
	        $@'./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack'

Those tags are your v4 here & the v2 of the RFC I sent at
https://lore.kernel.org/git/RFC-cover-v2-0.7-00000000000-20220323T140753Z-avarab@gmail.com/

Which shows my RFC v2 is ~20% faster with:

    $ PFX='strace' ~/g/git.meta/benchmark.sh "strace "

    1.22 ± 0.02 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
    1.22 ± 0.01 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
    1.00 ± 0.01 times faster than 'strace ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'

But that only holds for add/update-index; is unpack-objects not using
the tmp-objdir? (Presumably yes.)

As noted before I've found "strace" to be a handy way to "simulate"
slower FS ops on a ramdisk (I sometimes get about the same numbers on
the actual non-SSD disk, but due to load on the system, which I'm not in
full control of[1], I can't get hyperfine to be happy with the
non-fuzziness):

    1.06 ± 0.02 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
    1.06 ± 0.03 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
    1.01 ± 0.01 times faster than './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'

FWIW these are my actual non-fuzzy-with-strace numbers on the
not-ramdisk, as you can see the intervals overlap, but for the first two
the "min" time is never close to the RFC v2:
	
	$ XDG_RUNTIME_DIR=/tmp/ghf ~/g/git.meta/benchmark.sh
	+ test -d /tmp/scalar.git
	+ git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt -p rm -rf repo/.git/objects/* repo/.git/index ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt
	Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4
	  Time (mean ± σ):      1.043 s ±  0.143 s    [User: 0.184 s, System: 0.193 s]
	  Range (min … max):    0.943 s …  1.207 s    3 runs
	
	Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'avar-RFC
	  Time (mean ± σ):     877.6 ms ± 183.4 ms    [User: 197.9 ms, System: 149.4 ms]
	  Range (min … max):   697.8 ms … 1064.4 ms    3 runs
	
	Summary
	  './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'avar-RFC' ran
	    1.19 ± 0.30 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
	+ git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ -p rm -rf repo/.git/objects/* repo/.git/index ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .
	Benchmark 1: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4
	  Time (mean ± σ):      1.019 s ±  0.057 s    [User: 0.213 s, System: 0.194 s]
	  Range (min … max):    0.963 s …  1.076 s    3 runs
	
	Benchmark 2: ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'avar-RFC
	  Time (mean ± σ):     918.6 ms ±  34.4 ms    [User: 207.8 ms, System: 164.1 ms]
	  Range (min … max):   880.6 ms … 947.5 ms    3 runs
	
	Summary
	  './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'avar-RFC' ran
	    1.11 ± 0.07 times faster than './git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
	+ git hyperfine --warmup 1 -r 3 -L rev neeraj-v4,avar-RFC -s make CFLAGS=-O3 -p git init --bare dest.git -c rm -rf dest.git ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack
	Benchmark 1: ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4
	  Time (mean ± σ):      1.362 s ±  0.285 s    [User: 1.021 s, System: 0.186 s]
	  Range (min … max):    1.192 s …  1.691 s    3 runs
	
	  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
	
	Benchmark 2: ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'avar-RFC
	  Time (mean ± σ):      1.188 s ±  0.009 s    [User: 1.025 s, System: 0.161 s]
	  Range (min … max):    1.180 s …  1.199 s    3 runs
	 
	Summary
	  './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'avar-RFC' ran
	    1.15 ± 0.24 times faster than './git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'
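
(As a sanity check, the "N.NN times faster" figure in hyperfine's Summary is just the ratio of the two mean wall-clock times; e.g. for the first update-index benchmark above:)

```python
# Means from Benchmark 1 (neeraj-v4, 1.043 s) and Benchmark 2
# (avar-RFC, 877.6 ms) in the first update-index run above:
neeraj_v4 = 1.043
avar_rfc = 0.8776
print(f"{neeraj_v4 / avar_rfc:.2f}")  # 1.19, matching the first Summary line
```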

1. I do my git hacking on a bare metal box I rent with some friends, and
   one of them is running one of those persistent video game daemons
   written in Java. So I think all my non-RAM I/O numbers are
   continually fuzzed by what players are doing in Minecraft or whatever
   that thing is...

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects
  2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
@ 2022-03-29 16:51         ` Neeraj Singh
  0 siblings, 0 replies; 175+ messages in thread
From: Neeraj Singh @ 2022-03-29 16:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Neeraj K. Singh via GitGitGadget, Git List, Johannes Schindelin,
	Patrick Steinhardt, Jeff Hostetler, Bagas Sanjaya, Jiang Xin,
	Neeraj K. Singh

On Tue, Mar 29, 2022 at 5:04 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Mar 29 2022, Neeraj K. Singh via GitGitGadget wrote:
>
> > V4 changes:
> >
> >  * Make ODB transactions nestable.
> >  * Add an ODB transaction around writing out the cached tree.
> >  * Change update-index to use a more straightforward way of managing ODB
> >    transactions.
> >  * Fix missing 'local's in lib-unique-files
> >  * Add a per-iteration setup mechanism to test_perf.
> >  * Fix camelCasing in warning message.
>
> Despite my
> https://lore.kernel.org/git/220329.86czi52ekn.gmgdl@evledraar.gmail.com/
> I eventually gave up on trying to extract meaningful numbers from
> t/perf; I can never quite tell whether the differences come from its
> shellscript shenanigans or from the actual code.
>
> (And also; I realize I didn't follow-up on
> https://lore.kernel.org/git/CANQDOdcFN5GgOPZ3hqCsjHDTiRfRpqoAKxjF1n9D6S8oD9--_A@mail.gmail.com/,
> sorry):
>

Looks like we aren't actually hitting fsync in the numbers you
reported there, if they're down in the 20ms range.  Or we simply
aren't adding enough files.  Or, if that's against a ramdisk, the
ramdisk doesn't have enough cost to represent real disk hardware.
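
For reference, the batch mode under discussion amounts to something like the following sketch (the function name and blob layout are illustrative, not git's actual code; on Linux the writeout-only step would be something like sync_file_range(), which plain Python can't portably express):

```python
import os
import tempfile

def add_objects_batched(objdir, blobs):
    """Write many loose objects, then make them durable with one fsync.

    Sketch of core.fsyncMethod=batch: individual object writes only go
    into the OS page cache; a single fsync on a dummy file at the end
    triggers one filesystem-log / storage-cache flush for the batch.
    """
    for name, data in blobs.items():
        with open(os.path.join(objdir, name), "wb") as f:
            f.write(data)
            # Real git issues a writeout-only flush here; close() just
            # leaves the data queued in the page cache.
    # One durable flush for the whole batch, via a dummy file.
    fd, dummy = tempfile.mkstemp(dir=objdir)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
        os.unlink(dummy)
```

The point of the cost model: with per-object fsync you pay one storage-controller cache flush per file, whereas here the batch pays it once, so the saving only shows up once fsync is actually hitting durable media.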

> But I came up with this (uses my thin
> https://gitlab.com/avar/git-hyperfine/ wrapper, and you should be able
> to apt get hyperfine):
>
>         #!/bin/sh
>         set -xe
>
>         if ! test -d /tmp/scalar.git
>         then
>                 git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git
>                 mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack
>         fi
>         git hyperfine \
>                 --warmup 1 -r 3 \
>                 -L rev neeraj-v4,avar-RFC \
>                 -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/ && git ls-files -- t >repo/.git/to-add.txt' \
>                 -p 'rm -rf repo/.git/objects/* repo/.git/index' \
>                 $@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt'
>
>         git hyperfine \
>                 --warmup 1 -r 3 \
>                 -L rev neeraj-v4,avar-RFC \
>                 -s 'make CFLAGS=-O3 && rm -rf repo && git init repo && cp -R t repo/' \
>                 -p 'rm -rf repo/.git/objects/* repo/.git/index' \
>                 $@'./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .'
>
>         git hyperfine \
>                 --warmup 1 -r 3 \
>                 -L rev neeraj-v4,avar-RFC \
>                 -s 'make CFLAGS=-O3' \
>                 -p 'git init --bare dest.git' \
>                 -c 'rm -rf dest.git' \
>                 $@'./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack'
>
> Those tags are your v4 here & the v2 of the RFC I sent at
> https://lore.kernel.org/git/RFC-cover-v2-0.7-00000000000-20220323T140753Z-avarab@gmail.com/
>
> Which shows my RFC v2 is ~20% faster with:
>
>     $ PFX='strace' ~/g/git.meta/benchmark.sh "strace "
>
>     1.22 ± 0.02 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin <repo/.git/to-add.txt' in 'neeraj-v4'
>     1.22 ± 0.01 times faster than 'strace ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo add .' in 'neeraj-v4'
>     1.00 ± 0.01 times faster than 'strace ./git -C dest.git -c core.fsyncMethod=batch unpack-objects </tmp/scalar.git/my.pack' in 'neeraj-v4'
>
> But that only holds for add/update-index; is unpack-objects not using
> the tmp-objdir? (Presumably yes.)
>
> As noted before I've found "strace" to be a handy way to "simulate"
> slower FS ops on a ramdisk (I sometimes get about the same numbers on
> the actual non-SSD disk, but due to load on the system, which I'm not in
> full control of[1], I can't get hyperfine to be happy with the
> non-fuzziness):
>