From: "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys <hanwen@google.com>,
Han-Wen Nienhuys <hanwenn@gmail.com>,
Han-Wen Nienhuys <hanwen@google.com>
Subject: [PATCH v2] doc/reftable: document how to handle windows
Date: Tue, 23 Feb 2021 16:57:23 +0000
Message-ID: <pull.951.v2.git.git.1614099444126.gitgitgadget@gmail.com>
In-Reply-To: <pull.951.git.git.1611589125365.gitgitgadget@gmail.com>
From: Han-Wen Nienhuys <hanwen@google.com>
On Windows we can't delete or overwrite files opened by other processes. Here we
sketch how to handle this situation.
We propose to use a random element in the filename. It's possible to design an
alternate solution based on counters, but that would assign semantics to the
filenames, which would complicate the implementation.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
doc/reftable: document how to handle windows
On Windows we can't delete or overwrite files opened by other processes.
Here we sketch how to handle this situation.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-951%2Fhanwen%2Fwindows-doc-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-951/hanwen/windows-doc-v2
Pull-Request: https://github.com/git/git/pull/951
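For illustration only, the proposed naming convention could be sketched as below. This is not the reftable implementation: the helper name `table_name`, the eight-digit zero-padded decimal indices (chosen to match the example listing), and the eight-hex-digit random suffix are all assumptions.

```python
import secrets


def table_name(min_update_index: int, max_update_index: int, ext: str = "ref") -> str:
    """Hypothetical helper: build '<min>-<max>-<random>.<ext>'.

    The index width, the decimal base and the suffix length are assumptions;
    the patch only asks for a random element that makes each name unique.
    """
    if ext not in ("ref", "log"):
        raise ValueError("extension must be 'ref' or 'log'")
    if min_update_index > max_update_index:
        raise ValueError("min_update_index must not exceed max_update_index")
    # 4 random bytes rendered as 8 hex digits (assumed width).
    suffix = secrets.token_hex(4)
    return f"{min_update_index:08d}-{max_update_index:08d}-{suffix}.{ext}"


if __name__ == "__main__":
    # A single-update table and a compacted log-only table.
    print(table_name(2, 2))
    print(table_name(1, 3, "log"))
```

Because the suffix is random, a writer never needs to reuse the name of a table that another process may still hold open.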
Range-diff vs v1:
1: a952bc478f86 ! 1: e3854f2cc106 doc/reftable: document how to handle windows
@@ Commit message
On Windows we can't delete or overwrite files opened by other processes. Here we
sketch how to handle this situation.
+ We propose to use a random element in the filename. It's possible to design an
+ alternate solution based on counters, but that would assign semantics to the
+ filenames, which would complicate the implementation.
+
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
## Documentation/technical/reftable.txt ##
-@@ Documentation/technical/reftable.txt: A collection of reftable files are stored in the `$GIT_DIR/reftable/`
- directory:
+@@ Documentation/technical/reftable.txt: A repository must set its `$GIT_DIR/config` to configure reftable:
+ Layout
+ ^^^^^^
- ....
+-A collection of reftable files are stored in the `$GIT_DIR/reftable/`
+-directory:
+-
+-....
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
-+00000001-00000001-RANDOM1.log
-+00000002-00000002-RANDOM2.ref
-+00000003-00000003-RANDOM3.ref
- ....
-
- where reftable files are named by a unique name such as produced by the
+-....
+-
+-where reftable files are named by a unique name such as produced by the
-function `${min_update_index}-${max_update_index}.ref`.
-+function `${min_update_index}-${max_update_index}-${random}.ref`.
++A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
++Their names should have a random element, such that each filename is globally
++unique; this helps avoid spurious failures on Windows, where open files cannot
++be removed or overwritten. It is suggested to use
++`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.
Log-only files use the `.log` extension, while ref-only and mixed ref
and log files use `.ref`. extension.
@@ Documentation/technical/reftable.txt: current files, one per line, in order, fro
....
Readers must read `$GIT_DIR/reftable/tables.list` to determine which
-@@ Documentation/technical/reftable.txt: Reftable files not listed in `tables.list` may be new (and about to be
- added to the stack by the active writer), or ancient and ready to be
- pruned.
-
-+The random suffix added to table filenames ensures that we never attempt to
-+overwrite an existing table, which is necessary for this scheme to work on
-+Windows
-+
- Backward compatibility
- ^^^^^^^^^^^^^^^^^^^^^^
-
@@ Documentation/technical/reftable.txt: new reftable and atomically appending it to the stack:
3. Select `update_index` to be most recent file's
`max_update_index + 1`.
Documentation/technical/reftable.txt | 42 +++++++++++++++++-----------
1 file changed, 26 insertions(+), 16 deletions(-)
diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index 8095ab2590c8..3ef169af27d8 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -872,17 +872,11 @@ A repository must set its `$GIT_DIR/config` to configure reftable:
Layout
^^^^^^
-A collection of reftable files are stored in the `$GIT_DIR/reftable/`
-directory:
-
-....
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
-....
-
-where reftable files are named by a unique name such as produced by the
-function `${min_update_index}-${max_update_index}.ref`.
+A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
+Their names should have a random element, such that each filename is globally
+unique; this helps avoid spurious failures on Windows, where open files cannot
+be removed or overwritten. It is suggested to use
+`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.
Log-only files use the `.log` extension, while ref-only and mixed ref
and log files use `.ref`. extension.
@@ -893,9 +887,9 @@ current files, one per line, in order, from oldest (base) to newest
....
$ cat .git/reftable/tables.list
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref
....
Readers must read `$GIT_DIR/reftable/tables.list` to determine which
@@ -940,7 +934,7 @@ new reftable and atomically appending it to the stack:
3. Select `update_index` to be most recent file's
`max_update_index + 1`.
4. Prepare temp reftable `tmp_XXXXXX`, including log entries.
-5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
+5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
6. Copy `tables.list` to `tables.list.lock`, appending file from (5).
7. Rename `tables.list.lock` to `tables.list`.
@@ -993,7 +987,7 @@ prevents other processes from trying to compact these files.
should always be the case, assuming that other processes are adhering to
the locking protocol.
7. Rename `${min_update_index}-${max_update_index}_XXXXXX` to
-`${min_update_index}-${max_update_index}.ref`.
+`${min_update_index}-${max_update_index}-${random}.ref`.
8. Write the new stack to `tables.list.lock`, replacing `B` and `C`
with the file from (4).
9. Rename `tables.list.lock` to `tables.list`.
@@ -1005,6 +999,22 @@ This strategy permits compactions to proceed independently of updates.
Each reftable (compacted or not) is uniquely identified by its name, so
open reftables can be cached by their name.
+Windows
+^^^^^^^
+
+On Windows, and other systems that do not allow deleting open files or
+renaming over them, compaction may succeed, but other readers may prevent
+obsolete tables from being deleted.
+
+On these platforms, the following strategy can be followed: on closing a
+reftable stack, reload `tables.list`, and delete any tables no longer mentioned
+in `tables.list`.
+
+Irregular program exit may still leave unused files behind. In this case, a
+cleanup operation can read `tables.list`, note its modification timestamp, and
+delete any unreferenced `*.ref` files that are older.
+
+
Alternatives considered
~~~~~~~~~~~~~~~~~~~~~~~
base-commit: 66e871b6647ffea61a77a0f82c7ef3415f1ee79c
--
gitgitgadget
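
As a rough sketch of the two cleanup strategies described in the new "Windows" section (not part of the patch, and not how any reftable library necessarily implements them), the logic might look like the following. The function names `read_tables_list`, `prune_on_close` and `prune_stale` are hypothetical, and locking of `tables.list` is deliberately left out.

```python
import os
from pathlib import Path
from typing import Iterable, List, Set


def read_tables_list(reftable_dir: Path) -> Set[str]:
    """Return the table names currently listed in tables.list."""
    text = (reftable_dir / "tables.list").read_text()
    return {line for line in text.splitlines() if line}


def prune_on_close(reftable_dir: Path, previously_open: Iterable[str]) -> List[str]:
    """On closing a stack: reload tables.list and delete the tables this
    stack had open that are no longer mentioned in it."""
    listed = read_tables_list(reftable_dir)
    removed = []
    for name in previously_open:
        if name in listed:
            continue
        try:
            (reftable_dir / name).unlink()
            removed.append(name)
        except OSError:
            # Still open in another process (Windows) or already gone;
            # a later close or cleanup pass can retry.
            pass
    return sorted(removed)


def prune_stale(reftable_dir: Path) -> List[str]:
    """After an irregular exit: delete unreferenced *.ref files that are
    older than tables.list, judged by modification time."""
    cutoff = (reftable_dir / "tables.list").stat().st_mtime
    listed = read_tables_list(reftable_dir)
    removed = []
    for entry in reftable_dir.glob("*.ref"):
        if entry.name in listed:
            continue
        try:
            if entry.stat().st_mtime < cutoff:
                entry.unlink()
                removed.append(entry.name)
        except OSError:
            continue
    return sorted(removed)
```

Restricting `prune_on_close` to tables the closing stack had opened is a design choice in this sketch: it avoids touching files that an active writer has renamed into place but not yet added to `tables.list`.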