git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] doc/reftable: document how to handle windows
@ 2021-01-25 15:38 Han-Wen Nienhuys via GitGitGadget
  2021-01-26  5:49 ` Junio C Hamano
  2021-02-23 16:57 ` [PATCH v2] " Han-Wen Nienhuys via GitGitGadget
  0 siblings, 2 replies; 7+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-01-25 15:38 UTC
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

On Windows we can't delete or overwrite files opened by other processes. Here we
sketch how to handle this situation.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
    doc/reftable: document how to handle windows
    
    On Windows we can't delete or overwrite files opened by other processes.
    Here we sketch how to handle this situation.
    
    Signed-off-by: Han-Wen Nienhuys hanwen@google.com

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-951%2Fhanwen%2Fwindows-doc-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-951/hanwen/windows-doc-v1
Pull-Request: https://github.com/git/git/pull/951

 Documentation/technical/reftable.txt | 38 +++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index 8095ab2590c..d8be27d88c1 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -876,13 +876,13 @@ A collection of reftable files are stored in the `$GIT_DIR/reftable/`
 directory:
 
 ....
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref
 ....
 
 where reftable files are named by a unique name such as produced by the
-function `${min_update_index}-${max_update_index}.ref`.
+function `${min_update_index}-${max_update_index}-${random}.ref`.
 
 Log-only files use the `.log` extension, while ref-only and mixed ref
 and log files use `.ref`. extension.
@@ -893,9 +893,9 @@ current files, one per line, in order, from oldest (base) to newest
 
 ....
 $ cat .git/reftable/tables.list
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref
 ....
 
 Readers must read `$GIT_DIR/reftable/tables.list` to determine which
@@ -906,6 +906,10 @@ Reftable files not listed in `tables.list` may be new (and about to be
 added to the stack by the active writer), or ancient and ready to be
 pruned.
 
+The random suffix added to table filenames ensures that we never attempt to
+overwrite an existing table, which is necessary for this scheme to work on
+Windows
+
 Backward compatibility
 ^^^^^^^^^^^^^^^^^^^^^^
 
@@ -940,7 +944,7 @@ new reftable and atomically appending it to the stack:
 3.  Select `update_index` to be most recent file's
 `max_update_index + 1`.
 4.  Prepare temp reftable `tmp_XXXXXX`, including log entries.
-5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
+5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
 6.  Copy `tables.list` to `tables.list.lock`, appending file from (5).
 7.  Rename `tables.list.lock` to `tables.list`.
 
@@ -993,7 +997,7 @@ prevents other processes from trying to compact these files.
 should always be the case, assuming that other processes are adhering to
 the locking protocol.
 7.  Rename `${min_update_index}-${max_update_index}_XXXXXX` to
-`${min_update_index}-${max_update_index}.ref`.
+`${min_update_index}-${max_update_index}-${random}.ref`.
 8.  Write the new stack to `tables.list.lock`, replacing `B` and `C`
 with the file from (4).
 9.  Rename `tables.list.lock` to `tables.list`.
@@ -1005,6 +1009,22 @@ This strategy permits compactions to proceed independently of updates.
 Each reftable (compacted or not) is uniquely identified by its name, so
 open reftables can be cached by their name.
 
+Windows
+^^^^^^^
+
+On Windows, and other systems that do not allow deleting or renaming open
+files, compaction may succeed, but other readers may prevent obsolete tables
+from being deleted.
+
+On these platforms, the following strategy can be followed: on closing a
+reftable stack, reload `tables.list`, and delete any tables no longer mentioned
+in `tables.list`.
+
+Irregular program exit may still leave unused files behind. In this case, a
+cleanup operation can read `tables.list`, note its modification timestamp, and
+delete any unreferenced `*.ref` files that are older.
+
+
 Alternatives considered
 ~~~~~~~~~~~~~~~~~~~~~~~
 

base-commit: 66e871b6647ffea61a77a0f82c7ef3415f1ee79c
-- 
gitgitgadget


* Re: [PATCH] doc/reftable: document how to handle windows
  2021-01-25 15:38 [PATCH] doc/reftable: document how to handle windows Han-Wen Nienhuys via GitGitGadget
@ 2021-01-26  5:49 ` Junio C Hamano
  2021-01-26 11:38   ` Han-Wen Nienhuys
  2021-02-23 16:57 ` [PATCH v2] " Han-Wen Nienhuys via GitGitGadget
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2021-01-26  5:49 UTC
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  $ cat .git/reftable/tables.list
> -00000001-00000001.log
> -00000002-00000002.ref
> -00000003-00000003.ref
> +00000001-00000001-RANDOM1.log
> +00000002-00000002-RANDOM2.ref
> +00000003-00000003-RANDOM3.ref
>  ....
> @@ -940,7 +944,7 @@ new reftable and atomically appending it to the stack:
>  3.  Select `update_index` to be most recent file's
>  `max_update_index + 1`.
>  4.  Prepare temp reftable `tmp_XXXXXX`, including log entries.
> -5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
> +5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
>  6.  Copy `tables.list` to `tables.list.lock`, appending file from (5).
>  7.  Rename `tables.list.lock` to `tables.list`.

Is this because we have been assuming that in step 5. we can
"overwrite" (i.e. take over the name, implicitly unlinking the
existing one) the existing 0000001-00000001.ref with the newly
prepared one, which is not doable on Windows?

We must prepare for two "randoms" colliding and retrying the
renaming step anyway, so would it make more sense to instead
use a non-random suffix (i.e. try "-0.ref" first, and when it
fails, readdir for 0000001-00000001-*.ref to find the latest
suffix and increment it)?

> @@ -993,7 +997,7 @@ prevents other processes from trying to compact these files.
>  should always be the case, assuming that other processes are adhering to
>  the locking protocol.
>  7.  Rename `${min_update_index}-${max_update_index}_XXXXXX` to
> -`${min_update_index}-${max_update_index}.ref`.
> +`${min_update_index}-${max_update_index}-${random}.ref`.
>  8.  Write the new stack to `tables.list.lock`, replacing `B` and `C`
>  with the file from (4).

Likewise.


* Re: [PATCH] doc/reftable: document how to handle windows
  2021-01-26  5:49 ` Junio C Hamano
@ 2021-01-26 11:38   ` Han-Wen Nienhuys
  2021-01-26 17:40     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Han-Wen Nienhuys @ 2021-01-26 11:38 UTC
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Tue, Jan 26, 2021 at 6:49 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >  $ cat .git/reftable/tables.list
> > -00000001-00000001.log
> > -00000002-00000002.ref
> > -00000003-00000003.ref
> > +00000001-00000001-RANDOM1.log
> > +00000002-00000002-RANDOM2.ref
> > +00000003-00000003-RANDOM3.ref
> >  ....
> > @@ -940,7 +944,7 @@ new reftable and atomically appending it to the stack:
> >  3.  Select `update_index` to be most recent file's
> >  `max_update_index + 1`.
> >  4.  Prepare temp reftable `tmp_XXXXXX`, including log entries.
> > -5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
> > +5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
> >  6.  Copy `tables.list` to `tables.list.lock`, appending file from (5).
> >  7.  Rename `tables.list.lock` to `tables.list`.
>
> Is this because we have been assuming that in step 5. we can
> "overwrite" (i.e. take over the name, implicitly unlinking the
> existing one) the existing 0000001-00000001.ref with the newly
> prepared one, which is not doable on Windows?

No, the protocol for adding a table to the end of the stack is
impervious to problems on Windows, as everything happens under lock,
so there is no possibility of collisions.
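
As a rough illustration of why the append path cannot collide, here is a minimal Python sketch of the quoted steps. It is not the reftable implementation: the helper names `append_table` and `read_max_update_index`, the text-based stand-in for a table file, the 8-hex-digit suffix, and taking the lock with `O_EXCL` are all assumptions made for this example.

```python
import os
import secrets


def read_max_update_index(path: str) -> int:
    # Stand-in for reading a real table header: this sketch stores the
    # update index as the first line of the file.
    with open(path) as f:
        return int(f.readline())


def append_table(reftable_dir: str, body: str) -> str:
    """Append one table to the stack while holding tables.list.lock."""
    list_path = os.path.join(reftable_dir, "tables.list")
    lock_path = list_path + ".lock"

    # Take the lock: O_EXCL fails if another writer already holds it.
    lock_fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        names = []
        if os.path.exists(list_path):
            with open(list_path) as f:
                names = f.read().split()

        # Step 3: newest table's max_update_index + 1.
        update_index = 1
        if names:
            newest = os.path.join(reftable_dir, names[-1])
            update_index = read_max_update_index(newest) + 1

        # Step 4: write the temporary table.
        tmp_path = os.path.join(reftable_dir, "tmp_" + secrets.token_hex(3))
        with open(tmp_path, "w") as f:
            f.write(f"{update_index}\n{body}")

        # Step 5: rename it to its final, unique name.
        name = f"{update_index:08d}-{update_index:08d}-{secrets.token_hex(4)}.ref"
        os.rename(tmp_path, os.path.join(reftable_dir, name))

        # Step 6: write the extended list into the lock file.
        os.write(lock_fd, "".join(n + "\n" for n in names + [name]).encode())
    except BaseException:
        os.close(lock_fd)
        os.unlink(lock_path)
        raise
    os.close(lock_fd)

    # Step 7: publish the new list atomically.
    os.replace(lock_path, list_path)
    return name
```

Because the update index is chosen only after the lock is held, two writers can never pick the same `${update_index}` for an appended table; the random suffix matters for compaction, not for this path.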

> We must prepare for two "randoms" colliding and retrying the
> renaming step anyway, so would it make more sense to instead
> use a non-random suffix (i.e. try "-0.ref" first, and when it
> fails, readdir for 0000001-00000001-*.ref to find the latest
> suffix and increment it)?

This is a lot of complexity, and both transactions and compactions can
always fail because they fail to get the lock, or because the data to
be written is out of date. So callers need to be prepared for a retry
anyway.

> > @@ -993,7 +997,7 @@ prevents other processes from trying to compact these files.
> >  should always be the case, assuming that other processes are adhering to
> >  the locking protocol.
> >  7.  Rename `${min_update_index}-${max_update_index}_XXXXXX` to
> > -`${min_update_index}-${max_update_index}.ref`.
> > +`${min_update_index}-${max_update_index}-${random}.ref`.
> >  8.  Write the new stack to `tables.list.lock`, replacing `B` and `C`
> >  with the file from (4).
>
> Likewise.

This case is different. Consider the following situation:

1-1.ref:
  main=abc123 @ timestamp 1
  master=abc123 @ timestamp 1
2-2.ref:
  bla=456def @ timestamp 2
3-3.ref:
  bla delete @ timestamp 3
  master delete @ timestamp 3

The result of compacting this together would be a table containing

  main = abc123 @ timestamp 1

but in the previous naming convention, we'd name the resulting table
"1-1.ref", which conflicts with the table in our starting situation.

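The collision can be reproduced with a small Python sketch. The in-memory table model (a dict from ref name to `(value, update_index)`, with `None` marking a deletion) and the helpers `compact`, `old_name`, and `new_name` are assumptions for illustration only; real reftables are binary files, and the 8-hex-digit suffix is just one way to get a random element.

```python
import secrets

# Each table maps ref name -> (value, update_index); None marks a deletion.
stack = [
    {"main": ("abc123", 1), "master": ("abc123", 1)},  # 1-1.ref
    {"bla": ("456def", 2)},                            # 2-2.ref
    {"bla": (None, 3), "master": (None, 3)},           # 3-3.ref
]


def compact(tables):
    """Merge tables oldest to newest, then drop deleted refs."""
    merged = {}
    for table in tables:
        merged.update(table)  # newer entries shadow older ones
    return {ref: e for ref, e in merged.items() if e[0] is not None}


def old_name(table):
    # Previous convention: ${min_update_index}-${max_update_index}.ref
    indices = [idx for _, idx in table.values()]
    return f"{min(indices)}-{max(indices)}.ref"


def new_name(table):
    # Proposed convention: the same prefix plus a random suffix.
    indices = [idx for _, idx in table.values()]
    return f"{min(indices)}-{max(indices)}-{secrets.token_hex(4)}.ref"


compacted = compact(stack)
print(compacted)            # only `main` survives
print(old_name(compacted))  # same name as the first input table
print(new_name(compacted))  # same prefix, but with a random suffix
```

Under the old convention the compacted table would have to take over the name of a table that readers may still hold open, which is exactly the operation that fails on Windows.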

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


* Re: [PATCH] doc/reftable: document how to handle windows
  2021-01-26 11:38   ` Han-Wen Nienhuys
@ 2021-01-26 17:40     ` Junio C Hamano
  2021-01-26 18:11       ` Han-Wen Nienhuys
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2021-01-26 17:40 UTC
  To: Han-Wen Nienhuys; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

>> Is this because we have been assuming that in step 5. we can
>> "overwrite" (i.e. take over the name, implicitly unlinking the
>> existing one) the existing 0000001-00000001.ref with the newly
>> prepared one, which is not doable on Windows?
>
> No, the protocol for adding a table to the end of the stack is
> impervious to problems on Windows, as everything happens under lock,
> so there is no possibility of collisions.
>
>> We must prepare for two "randoms" colliding and retrying the
>> renaming step anyway, so would it make more sense to instead
>> use a non-random suffix (i.e. try "-0.ref" first, and when it
>> fails, readdir for 0000001-00000001-*.ref to find the latest
>> suffix and increment it)?
>
> This is a lot of complexity, and both transactions and compactions can
> always fail because they fail to get the lock, or because the data to
> be written is out of date. So callers need to be prepared for a retry
> anyway.

Sorry, are we saying the same thing and reaching different
conclusions?  

My question was, under the assumption that the callers need to be
prepared for a retry anyway,

 (1) would it be possible to use "seq" (or "take max from existing
     and add one") as the random number generator for the ${random}
     part of your document, and

 (2) if the answer to the previous question is yes, would it result
     in a system that is easier for Git developers, who observe what
     happens inside the .git directory, to understand the behaviour
     of the system, as they can immediately see that 1-1-47 is newer
     than 1-1-22 instead of 1-1-$random1 and 1-1-$random2 that
     cannot be compared?


* Re: [PATCH] doc/reftable: document how to handle windows
  2021-01-26 17:40     ` Junio C Hamano
@ 2021-01-26 18:11       ` Han-Wen Nienhuys
  2021-01-26 20:12         ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Han-Wen Nienhuys @ 2021-01-26 18:11 UTC
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Tue, Jan 26, 2021 at 6:40 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Han-Wen Nienhuys <hanwen@google.com> writes:
>
> >> Is this because we have been assuming that in step 5. we can
> >> "overwrite" (i.e. take over the name, implicitly unlinking the
> >> existing one) the existing 0000001-00000001.ref with the newly
> >> prepared one, which is not doable on Windows?
> >
> > No, the protocol for adding a table to the end of the stack is
> > impervious to problems on Windows, as everything happens under lock,
> > so there is no possibility of collisions.
> >
> >> We must prepare for two "randoms" colliding and retrying the
> >> renaming step anyway, so would it make more sense to instead
> > >> use a non-random suffix (i.e. try "-0.ref" first, and when it
> >> fails, readdir for 0000001-00000001-*.ref to find the latest
> >> suffix and increment it)?
> >
> > This is a lot of complexity, and both transactions and compactions can
> > always fail because they fail to get the lock, or because the data to
> > be written is out of date. So callers need to be prepared for a retry
> > anyway.
>
> Sorry, are we saying the same thing and reaching different
> conclusions?
>
> My question was, under the assumption that the callers need to be
> prepared for a retry anyway,
>
>  (1) would it be possible to use "seq" (or "take max from existing
>      and add one") as the random number generator for the ${random}
>      part of your document, and
>
>  (2) if the answer to the previous question is yes, would it result
>      in a system that is easier for Git developers, who observe what
>      happens inside the .git directory, to understand the behaviour
>      of the system, as they can immediately see that 1-1-47 is newer
>      than 1-1-22 instead of 1-1-$random1 and 1-1-$random2 that
>      cannot be compared?

The first two parts of the file name (${min}-${max}) already provide
visibility into what is going on, and the file system timestamp
already indicates which file is newer. I picked a random name as
suffix, as it gets the job done and is simple.

We could do what you suggest, but it adds semantics to the filenames
that aren't really there: currently, tables.list is a list of
filenames, and no part of the code parses the file names back. If we
did what you suggest, the system would have more ways to break
subtly, and it would need to handle error conditions when the names
are malformed. This is the complexity I was alluding to in my previous
message.

We could stipulate that a compaction must always increase the logical
timestamp, i.e. in the scenario I sketched, the compacted table should
be written with a max-timestamp of 3, even though it contains no
entries at timestamp 3.  This avoids the error condition, but it's
also surprising because it is actually inconsistent with how the
format is described. But maybe we could update the description of the
format.

Or, we could rename to ${min}-${max}-0 and if that fails try
${min}-${max}-1, and if that fails ${min}-${max}-2 etc. I think that
is somewhat nicer than parsing back a counter from the existing
filenames, but it could have the effect that 1-1-0 could be newer than
1-1-2.
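
A minimal Python sketch of this probing scheme shows how a gap makes the counter misleading. The helper `claim_name` is hypothetical, and claiming a name with `O_CREAT | O_EXCL` stands in for a rename that refuses to overwrite.

```python
import os
import tempfile


def claim_name(directory: str, prefix: str) -> str:
    """Claim the first free `${prefix}-${n}.ref`, probing n = 0, 1, 2, ..."""
    n = 0
    while True:
        name = f"{prefix}-{n}.ref"
        try:
            # O_EXCL makes creation fail if the name is already taken.
            fd = os.open(os.path.join(directory, name),
                         os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        except FileExistsError:
            n += 1
            continue
        os.close(fd)
        return name


with tempfile.TemporaryDirectory() as d:
    first = [claim_name(d, "1-1") for _ in range(3)]
    os.unlink(os.path.join(d, "1-1-0.ref"))  # an obsolete table is pruned
    latest = claim_name(d, "1-1")
    print(first)   # ['1-1-0.ref', '1-1-1.ref', '1-1-2.ref']
    print(latest)  # 1-1-0.ref, although it is newer than 1-1-2.ref
```

Once any earlier file has been pruned, the suffix no longer reflects creation order, so it carries no more information than a random one.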

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.


* Re: [PATCH] doc/reftable: document how to handle windows
  2021-01-26 18:11       ` Han-Wen Nienhuys
@ 2021-01-26 20:12         ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2021-01-26 20:12 UTC
  To: Han-Wen Nienhuys; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

> The first two parts of the file name (${min}-${max}) already provide
> visibility into what is going on, and the file system timestamp
> already indicates which file is newer. I picked a random name as
> suffix, as it gets the job done and is simple.

OK, as long as two paths of the same ${min}-${max} part would not
confuse people, I am perfectly fine.

> Or, we could rename to ${min}-${max}-0 and if that fails try
> ${min}-${max}-1, and if that fails ${min}-${max}-2 etc. I think that
> is somewhat nicer than parsing back a counter from the existing
> filenames, but it could have the effect that 1-1-0 could be newer than
> 1-1-2.

I agree that such an approach that can get fooled by an existing gap
would not achieve anything over the ${random} approach.

Thanks.



* [PATCH v2] doc/reftable: document how to handle windows
  2021-01-25 15:38 [PATCH] doc/reftable: document how to handle windows Han-Wen Nienhuys via GitGitGadget
  2021-01-26  5:49 ` Junio C Hamano
@ 2021-02-23 16:57 ` Han-Wen Nienhuys via GitGitGadget
  1 sibling, 0 replies; 7+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2021-02-23 16:57 UTC
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

On Windows we can't delete or overwrite files opened by other processes. Here we
sketch how to handle this situation.

We propose to use a random element in the filename. It's possible to design an
alternate solution based on counters, but that would assign semantics to the
filenames that complicates implementation.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
    doc/reftable: document how to handle windows
    
    On Windows we can't delete or overwrite files opened by other processes.
    Here we sketch how to handle this situation.
    
    Signed-off-by: Han-Wen Nienhuys hanwen@google.com

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-951%2Fhanwen%2Fwindows-doc-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-951/hanwen/windows-doc-v2
Pull-Request: https://github.com/git/git/pull/951

Range-diff vs v1:

 1:  a952bc478f86 ! 1:  e3854f2cc106 doc/reftable: document how to handle windows
     @@ Commit message
          On Windows we can't delete or overwrite files opened by other processes. Here we
          sketch how to handle this situation.
      
     +    We propose to use a random element in the filename. It's possible to design an
     +    alternate solution based on counters, but that would assign semantics to the
     +    filenames that complicates implementation.
     +
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
       ## Documentation/technical/reftable.txt ##
     -@@ Documentation/technical/reftable.txt: A collection of reftable files are stored in the `$GIT_DIR/reftable/`
     - directory:
     +@@ Documentation/technical/reftable.txt: A repository must set its `$GIT_DIR/config` to configure reftable:
     + Layout
     + ^^^^^^
       
     - ....
     +-A collection of reftable files are stored in the `$GIT_DIR/reftable/`
     +-directory:
     +-
     +-....
      -00000001-00000001.log
      -00000002-00000002.ref
      -00000003-00000003.ref
     -+00000001-00000001-RANDOM1.log
     -+00000002-00000002-RANDOM2.ref
     -+00000003-00000003-RANDOM3.ref
     - ....
     - 
     - where reftable files are named by a unique name such as produced by the
     +-....
     +-
     +-where reftable files are named by a unique name such as produced by the
      -function `${min_update_index}-${max_update_index}.ref`.
     -+function `${min_update_index}-${max_update_index}-${random}.ref`.
     ++A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
     ++Their names should have a random element, such that each filename is globally
     ++unique; this helps avoid spurious failures on Windows, where open files cannot
     ++be removed or overwritten. It is suggested to use
     ++`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.
       
       Log-only files use the `.log` extension, while ref-only and mixed ref
       and log files use `.ref`. extension.
     @@ Documentation/technical/reftable.txt: current files, one per line, in order, fro
       ....
       
       Readers must read `$GIT_DIR/reftable/tables.list` to determine which
     -@@ Documentation/technical/reftable.txt: Reftable files not listed in `tables.list` may be new (and about to be
     - added to the stack by the active writer), or ancient and ready to be
     - pruned.
     - 
     -+The random suffix added to table filenames ensures that we never attempt to
     -+overwrite an existing table, which is necessary for this scheme to work on
     -+Windows
     -+
     - Backward compatibility
     - ^^^^^^^^^^^^^^^^^^^^^^
     - 
      @@ Documentation/technical/reftable.txt: new reftable and atomically appending it to the stack:
       3.  Select `update_index` to be most recent file's
       `max_update_index + 1`.


 Documentation/technical/reftable.txt | 42 +++++++++++++++++-----------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index 8095ab2590c8..3ef169af27d8 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -872,17 +872,11 @@ A repository must set its `$GIT_DIR/config` to configure reftable:
 Layout
 ^^^^^^
 
-A collection of reftable files are stored in the `$GIT_DIR/reftable/`
-directory:
-
-....
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
-....
-
-where reftable files are named by a unique name such as produced by the
-function `${min_update_index}-${max_update_index}.ref`.
+A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
+Their names should have a random element, such that each filename is globally
+unique; this helps avoid spurious failures on Windows, where open files cannot
+be removed or overwritten. It is suggested to use
+`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.
 
 Log-only files use the `.log` extension, while ref-only and mixed ref
 and log files use `.ref`. extension.
@@ -893,9 +887,9 @@ current files, one per line, in order, from oldest (base) to newest
 
 ....
 $ cat .git/reftable/tables.list
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref
 ....
 
 Readers must read `$GIT_DIR/reftable/tables.list` to determine which
@@ -940,7 +934,7 @@ new reftable and atomically appending it to the stack:
 3.  Select `update_index` to be most recent file's
 `max_update_index + 1`.
 4.  Prepare temp reftable `tmp_XXXXXX`, including log entries.
-5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
+5.  Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
 6.  Copy `tables.list` to `tables.list.lock`, appending file from (5).
 7.  Rename `tables.list.lock` to `tables.list`.
 
@@ -993,7 +987,7 @@ prevents other processes from trying to compact these files.
 should always be the case, assuming that other processes are adhering to
 the locking protocol.
 7.  Rename `${min_update_index}-${max_update_index}_XXXXXX` to
-`${min_update_index}-${max_update_index}.ref`.
+`${min_update_index}-${max_update_index}-${random}.ref`.
 8.  Write the new stack to `tables.list.lock`, replacing `B` and `C`
 with the file from (4).
 9.  Rename `tables.list.lock` to `tables.list`.
@@ -1005,6 +999,22 @@ This strategy permits compactions to proceed independently of updates.
 Each reftable (compacted or not) is uniquely identified by its name, so
 open reftables can be cached by their name.
 
+Windows
+^^^^^^^
+
+On Windows, and other systems that do not allow deleting or renaming open
+files, compaction may succeed, but other readers may prevent obsolete tables
+from being deleted.
+
+On these platforms, the following strategy can be followed: on closing a
+reftable stack, reload `tables.list`, and delete any tables no longer mentioned
+in `tables.list`.
+
+Irregular program exit may still leave unused files behind. In this case, a
+cleanup operation can read `tables.list`, note its modification timestamp, and
+delete any unreferenced `*.ref` files that are older.
+
+
 Alternatives considered
 ~~~~~~~~~~~~~~~~~~~~~~~
 

base-commit: 66e871b6647ffea61a77a0f82c7ef3415f1ee79c
-- 
gitgitgadget
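
The cleanup operation described in the new "Windows" section of the patch above could look roughly like the following Python sketch. The function name `prune_unreferenced` is hypothetical, and treating `PermissionError` as "still open elsewhere, try again later" is an assumption about how a caller might handle Windows sharing violations, not something the patch specifies.

```python
import os


def prune_unreferenced(reftable_dir: str) -> list[str]:
    """Delete *.ref files that tables.list does not mention and that are
    older than tables.list itself. Returns the names that were removed."""
    list_path = os.path.join(reftable_dir, "tables.list")
    list_mtime = os.stat(list_path).st_mtime
    with open(list_path) as f:
        live = set(f.read().split())

    removed = []
    for name in sorted(os.listdir(reftable_dir)):
        if not name.endswith(".ref") or name in live:
            continue
        path = os.path.join(reftable_dir, name)
        # A file newer than tables.list may belong to an active writer
        # that is about to add it to the stack, so leave it alone.
        if os.stat(path).st_mtime >= list_mtime:
            continue
        try:
            os.unlink(path)
        except PermissionError:
            continue  # assumed: still open in another process; retry later
        removed.append(name)
    return removed
```

The same function could also serve the "on closing a reftable stack" case, since both strategies come down to comparing the directory contents against a freshly read `tables.list`.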

