git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Eric Wong" <e@80x24.org>,
	"Neeraj K. Singh" <neerajsi@microsoft.com>
Subject: [PATCH v2 3/3] refs: add configuration to enable flushing of refs
Date: Wed, 10 Nov 2021 12:41:03 +0100	[thread overview]
Message-ID: <d9aa96913b1730f1d0c238d7d52e27c20bc55390.1636544377.git.ps@pks.im> (raw)
In-Reply-To: <cover.1636544377.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 9268 bytes --]

When writing loose refs, we first create a lockfile, write the new ref
into that lockfile, close it and then rename the lockfile into place
such that the actual update is atomic for that single ref. While this
works as intended under normal circumstences, at GitLab we infrequently
encounter corrupt loose refs in repositories after a machine encountered
a hard reset. The corruption is always of the same type: the ref has
been committed into place, but it is completely empty.

The root cause of this is likely that we don't sync contents of the
lockfile to disk before renaming it into place. As a result, it's not
guaranteed that the contents are properly persisted and one may observe
weird in-between states on hard resets. Quoting ext4 documentation [1]:

    Many broken applications don't use fsync() when replacing existing
    files via patterns such as fd =
    open("foo.new")/write(fd,..)/close(fd)/ rename("foo.new", "foo"), or
    worse yet, fd = open("foo", O_TRUNC)/write(fd,..)/close(fd). If
    auto_da_alloc is enabled, ext4 will detect the replace-via-rename
    and replace-via-truncate patterns and force that any delayed
    allocation blocks are allocated such that at the next journal
    commit, in the default data=ordered mode, the data blocks of the new
    file are forced to disk before the rename() operation is committed.
    This provides roughly the same level of guarantees as ext3, and
    avoids the "zero-length" problem that can happen when a system
    crashes before the delayed allocation blocks are forced to disk.

This explicitly points out that one must call fsync(3P) before doing the
rename(3P) call, or otherwise data may not be correctly persisted to
disk.

Fix this by introducing a new configuration "core.fsyncRefFiles". This
config matches behaviour of "core.fsyncObjectFiles" in that it provides
three different modes:

    - "off" disables calling fsync on ref files. This is the default
      behaviour previous to this change and remains the default after
      this change.

    - "on" enables calling fsync on ref files, where each reference is
      flushed to disk before it is being committed. This is the safest
      setting, but may incur significant performance overhead.

    - "batch" will flush the page cache of each file as it is written to
      ensure its data is persisted. After all refs have been written,
      the directories which host refs are flushed.

With this change in place and when "core.fsyncRefFiles" is set to either
"on" or "batch", this kind of corruption shouldn't happen anymore.

[1]: https://www.kernel.org/doc/Documentation/filesystems/ext4.txt

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/core.txt |  7 ++++
 cache.h                       |  7 ++++
 config.c                      | 10 ++++++
 environment.c                 |  1 +
 refs/files-backend.c          | 62 ++++++++++++++++++++++++++++++++++-
 5 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 200b4d9f06..e2fd0d8c90 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -572,6 +572,13 @@ core.fsyncObjectFiles::
   stored on HFS+ or APFS filesystems and on Windows for repos stored on NTFS or
   ReFS.
 
+core.fsyncRefFiles::
+	A value indicating the level of effort Git will expend in trying to make
+	refs added to the repo durable in the event of an unclean system
+	shutdown. This setting currently only controls loose refs in the object
+	store, so updates to packed refs may not be equally durable. Takes the
+	same parameters as `core.fsyncObjectFiles`.
+
 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'
 +
diff --git a/cache.h b/cache.h
index 6d6e6770ec..14c8fab6b4 100644
--- a/cache.h
+++ b/cache.h
@@ -991,7 +991,14 @@ enum fsync_object_files_mode {
     FSYNC_OBJECT_FILES_BATCH
 };
 
+enum fsync_ref_files_mode {
+    FSYNC_REF_FILES_OFF,
+    FSYNC_REF_FILES_ON,
+    FSYNC_REF_FILES_BATCH
+};
+
 extern enum fsync_object_files_mode fsync_object_files;
+extern enum fsync_ref_files_mode fsync_ref_files;
 extern int core_preload_index;
 extern int precomposed_unicode;
 extern int protect_hfs;
diff --git a/config.c b/config.c
index 5eb36ecd77..4cbad5a29d 100644
--- a/config.c
+++ b/config.c
@@ -1500,6 +1500,16 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.fsyncreffiles")) {
+		if (value && !strcmp(value, "batch"))
+			fsync_ref_files = FSYNC_REF_FILES_BATCH;
+		else if (git_config_bool(var, value))
+			fsync_ref_files = FSYNC_REF_FILES_ON;
+		else
+			fsync_ref_files = FSYNC_REF_FILES_OFF;
+		return 0;
+	}
+
 	if (!strcmp(var, "core.preloadindex")) {
 		core_preload_index = git_config_bool(var, value);
 		return 0;
diff --git a/environment.c b/environment.c
index aeafe80235..1514ac9384 100644
--- a/environment.c
+++ b/environment.c
@@ -43,6 +43,7 @@ const char *git_hooks_path;
 int zlib_compression_level = Z_BEST_SPEED;
 int pack_compression_level = Z_DEFAULT_COMPRESSION;
 enum fsync_object_files_mode fsync_object_files;
+enum fsync_ref_files_mode fsync_ref_files;
 size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4b14f30d48..31d7456266 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1360,6 +1360,57 @@ static int commit_ref_update(struct files_ref_store *refs,
 			     const struct object_id *oid, const char *logmsg,
 			     struct strbuf *err);
 
+static int sync_loose_ref(int fd)
+{
+	switch (fsync_ref_files) {
+	case FSYNC_REF_FILES_OFF:
+		return 0;
+	case FSYNC_REF_FILES_ON:
+		return git_fsync(fd, FSYNC_HARDWARE_FLUSH);
+	case FSYNC_REF_FILES_BATCH:
+		return git_fsync(fd, FSYNC_WRITEOUT_ONLY);
+	default:
+		BUG("invalid fsync mode %d", fsync_ref_files);
+	}
+}
+
+#define SYNC_LOOSE_REF_GITDIR    (1 << 0)
+#define SYNC_LOOSE_REF_COMMONDIR (1 << 1)
+
+static int sync_loose_refs_flags(const char *refname)
+{
+	switch (ref_type(refname)) {
+	case REF_TYPE_PER_WORKTREE:
+	case REF_TYPE_PSEUDOREF:
+		return SYNC_LOOSE_REF_GITDIR;
+	case REF_TYPE_MAIN_PSEUDOREF:
+	case REF_TYPE_OTHER_PSEUDOREF:
+	case REF_TYPE_NORMAL:
+		return SYNC_LOOSE_REF_COMMONDIR;
+	default:
+		BUG("unknown ref type %d of ref %s",
+		    ref_type(refname), refname);
+	}
+}
+
+static int sync_loose_refs(struct files_ref_store *refs,
+			   int flags,
+			   struct strbuf *err)
+{
+	if (fsync_ref_files != FSYNC_REF_FILES_BATCH)
+		return 0;
+
+	if ((flags & SYNC_LOOSE_REF_GITDIR) &&
+	    git_fsync_dir(refs->base.gitdir) < 0)
+		return -1;
+
+	if ((flags & SYNC_LOOSE_REF_COMMONDIR) &&
+	    git_fsync_dir(refs->gitcommondir) < 0)
+		return -1;
+
+	return 0;
+}
+
 /*
  * Emit a better error message than lockfile.c's
  * unable_to_lock_message() would in case there is a D/F conflict with
@@ -1502,6 +1553,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 	oidcpy(&lock->old_oid, &orig_oid);
 
 	if (write_ref_to_lockfile(lock, &orig_oid, &err) ||
+	    sync_loose_refs(refs, sync_loose_refs_flags(newrefname), &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, logmsg, &err)) {
 		error("unable to write current sha1 into %s: %s", newrefname, err.buf);
 		strbuf_release(&err);
@@ -1522,6 +1574,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 	flag = log_all_ref_updates;
 	log_all_ref_updates = LOG_REFS_NONE;
 	if (write_ref_to_lockfile(lock, &orig_oid, &err) ||
+	    sync_loose_refs(refs, sync_loose_refs_flags(newrefname), &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, NULL, &err)) {
 		error("unable to write current sha1 into %s: %s", oldrefname, err.buf);
 		strbuf_release(&err);
@@ -1781,6 +1834,7 @@ static int write_ref_to_lockfile(struct ref_lock *lock,
 	fd = get_lock_file_fd(&lock->lk);
 	if (write_in_full(fd, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
 	    write_in_full(fd, &term, 1) < 0 ||
+	    sync_loose_ref(fd) < 0 ||
 	    close_ref_gently(lock) < 0) {
 		strbuf_addf(err,
 			    "couldn't write '%s'", get_lock_file_path(&lock->lk));
@@ -2665,7 +2719,7 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 		files_downcast(ref_store, REF_STORE_WRITE,
 			       "ref_transaction_prepare");
 	size_t i;
-	int ret = 0;
+	int ret = 0, sync_flags = 0;
 	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
 	char *head_ref = NULL;
 	int head_type;
@@ -2777,8 +2831,14 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					&update->new_oid, NULL,
 					NULL);
 		}
+
+		sync_flags |= sync_loose_refs_flags(update->refname);
 	}
 
+	ret = sync_loose_refs(refs, sync_flags, err);
+	if (ret)
+		goto cleanup;
+
 	if (packed_transaction) {
 		if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
 			ret = TRANSACTION_GENERIC_ERROR;
-- 
2.33.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2021-11-10 11:41 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-04 12:38 [PATCH] refs: sync loose refs to disk before committing them Patrick Steinhardt
2021-11-04 13:14 ` Ævar Arnfjörð Bjarmason
2021-11-04 14:51   ` Patrick Steinhardt
2021-11-04 21:24   ` Junio C Hamano
2021-11-04 22:36     ` Neeraj Singh
2021-11-05  1:40       ` Junio C Hamano
2021-11-05  6:36         ` Jeff King
2021-11-05  8:35       ` Ævar Arnfjörð Bjarmason
2021-11-05  9:04         ` Jeff King
2021-11-05  7:07 ` Jeff King
2021-11-05  7:17   ` Jeff King
2021-11-05  9:12     ` Johannes Schindelin
2021-11-05  9:22       ` Patrick Steinhardt
2021-11-05  9:34       ` Jeff King
2021-11-09 11:25         ` Patrick Steinhardt
2021-11-10  8:36           ` Jeff King
2021-11-10  9:16             ` Patrick Steinhardt
2021-11-10 11:40 ` [PATCH v2 0/3] " Patrick Steinhardt
2021-11-10 11:40   ` [PATCH v2 1/3] wrapper: handle EINTR in `git_fsync()` Patrick Steinhardt
2021-11-10 14:33     ` Johannes Schindelin
2021-11-10 14:39     ` Ævar Arnfjörð Bjarmason
2021-11-10 11:40   ` [PATCH v2 2/3] wrapper: provide function to sync directories Patrick Steinhardt
2021-11-10 14:40     ` Ævar Arnfjörð Bjarmason
2021-11-10 11:41   ` Patrick Steinhardt [this message]
2021-11-10 14:49     ` [PATCH v2 3/3] refs: add configuration to enable flushing of refs Ævar Arnfjörð Bjarmason
2021-11-10 19:15       ` Neeraj Singh
2021-11-10 20:23         ` Ævar Arnfjörð Bjarmason
2021-11-11  0:03           ` Neeraj Singh
2021-11-11 12:14           ` Patrick Steinhardt
2021-11-11 12:06       ` Patrick Steinhardt
2021-11-11  0:18     ` Neeraj Singh
2021-11-10 14:44   ` [PATCH v2 0/3] refs: sync loose refs to disk before committing them Johannes Schindelin
2021-11-10 20:45   ` Jeff King
2021-11-11 11:47     ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d9aa96913b1730f1d0c238d7d52e27c20bc55390.1636544377.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=neerajsi@microsoft.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).