git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Han-Wen Nienhuys <hanwen@google.com>
To: Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>
Cc: git <git@vger.kernel.org>, Han-Wen Nienhuys <hanwenn@gmail.com>,
	Jeff King <peff@peff.net>,
	Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: Re: [PATCH 1/6] fixup! reftable: rest of library
Date: Tue, 1 Dec 2020 15:32:06 +0100	[thread overview]
Message-ID: <CAFQ2z_N=sZuYorZxNLn-OoWjb6y=1zrc7sCPuKzZHQODi08s+Q@mail.gmail.com> (raw)
In-Reply-To: <878bffcdfe5ca7657f839de8f7993d9098726636.1606545878.git.gitgitgadget@gmail.com>

On Sat, Nov 28, 2020 at 7:44 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Close the file descriptors to obsolete files before trying to delete or
> rename them. This is actually required on Windows.
>
> Note: this patch is just a band-aid to get the tests pass on Windows.
> The fact that it is needed raises concerns about the overall resource
> handling: are file descriptors closed properly whenever appropriate, or
> are they closed much later (which can lead to rename() problems on
> Windows, and risks running into ulimits)?
>
> Also, a `reftable_stack_destroy()` call had to be moved in
> `test_reftable_stack_uptodate()` to avoid the prompt complaining that a
> `.ref` file could not be deleted on Windows. This raises the question
> whether the code does the right thing when two concurrent processes want
> to access the reftable, and one wants to compact it. At the moment, it
> does not appear to fail gracefully.

Thanks for the report; I have to look more closely at these fixes; I
fear they might be incorrect.

The reftable spec doesn't treat this case in depth, and I think it was
rather written for Unix-like semantics. In the Unix flavor, a process
that wants to read can keep file descriptors open to keep reading from
the ref DB at a consistent snapshot.

What is the approach that the rest of Git on Windows takes in these
circumstances?

Consider processes P1 and P2, and the following sequence of actions

P1 opens ref DB (ie. opens a set of *.ref files for read)
P2 opens ref DB, executes a transaction. Post-transaction, it compacts
the reftable stack.
P2 exits
P1 exits

Currently, the compaction in P2 tries to delete the files obviated by
the compaction. On Windows this currently fails, because you can't
delete open files.

There are several options:

1) P2 should fail the compaction. This is bad because it will lead to
degraded performance over time, and it's not obvious if you can
anticipate that the deletion doesn't work.
2) P2 should retry deleting until it succeeds. This is bad, because a
reader can starve writers.
3) on exit, P1 should check if its *.ref files are still in use, and
delete them. This smells bad, because P1 is a read-only process, yet
it executes writes. Also, do we have good on-exit hooks in Git?
4) On exit, P1 does nothing. Stale *.ref files are left behind. Some
sort of GC process cleans things up asynchronously.
5) The ref DB should not keep files open, and should rather open and
close files as needed; this means P1 doesn't keep files open for long,
and P2 can retry safely.

I think 3) seems the cleanest to me (even though deleting in read
process feels weird), but perhaps we could fallback to 5) on windows
as well.

Thoughts?

What errno code does deleting an in-use file on Windows produce?


> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  reftable/stack.c      | 37 ++++++++++++++++++++++++++++---------
>  reftable/stack_test.c |  2 +-
>  2 files changed, 29 insertions(+), 10 deletions(-)
>
> diff --git a/reftable/stack.c b/reftable/stack.c
> index 1d632937d7..02c6a370ba 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -212,7 +212,6 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
>                 goto done;
>
>         new_tables = NULL;
> -       st->readers_len = new_readers_len;
>         if (st->merged != NULL) {
>                 merged_table_release(st->merged);
>                 reftable_merged_table_free(st->merged);
> @@ -220,6 +219,7 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names,
>         if (st->readers != NULL) {
>                 reftable_free(st->readers);
>         }
> +       st->readers_len = new_readers_len;
>         st->readers = new_readers;
>         new_readers = NULL;
>         new_readers_len = 0;
> @@ -939,14 +939,6 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last,
>         strbuf_addstr(&new_table_path, "/");
>         strbuf_addbuf(&new_table_path, &new_table_name);
>
> -       if (!is_empty_table) {
> -               err = rename(temp_tab_file_name.buf, new_table_path.buf);
> -               if (err < 0) {
> -                       err = REFTABLE_IO_ERROR;
> -                       goto done;
> -               }
> -       }
> -
>         for (i = 0; i < first; i++) {
>                 strbuf_addstr(&ref_list_contents, st->readers[i]->name);
>                 strbuf_addstr(&ref_list_contents, "\n");
> @@ -960,6 +952,32 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last,
>                 strbuf_addstr(&ref_list_contents, "\n");
>         }
>
> +       /*
> +        * Now release the merged tables and readers
> +        */
> +       if (st->merged != NULL) {
> +               reftable_merged_table_free(st->merged);
> +               st->merged = NULL;
> +       }
> +
> +       if (st->readers != NULL) {
> +               int i = 0;
> +               for (i = 0; i < st->readers_len; i++) {
> +                       reader_close(st->readers[i]);
> +                       reftable_reader_free(st->readers[i]);
> +               }
> +               st->readers_len = 0;
> +               FREE_AND_NULL(st->readers);
> +       }
> +
> +       if (!is_empty_table) {
> +               err = rename(temp_tab_file_name.buf, new_table_path.buf);
> +               if (err < 0) {
> +                       err = REFTABLE_IO_ERROR;
> +                       goto done;
> +               }
> +       }
> +
>         err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
>         if (err < 0) {
>                 err = REFTABLE_IO_ERROR;
> @@ -1242,6 +1260,7 @@ static int stack_check_addition(struct reftable_stack *st,
>
>         free(refs);
>         reftable_iterator_destroy(&it);
> +       reader_close(rd);
>         reftable_reader_free(rd);
>         return err;
>  }
> diff --git a/reftable/stack_test.c b/reftable/stack_test.c
> index 11d3d30799..c35abd7301 100644
> --- a/reftable/stack_test.c
> +++ b/reftable/stack_test.c
> @@ -159,12 +159,12 @@ static void test_reftable_stack_uptodate(void)
>         err = reftable_stack_add(st2, &write_test_ref, &ref2);
>         EXPECT(err == REFTABLE_LOCK_ERROR);
>
> +       reftable_stack_destroy(st1);
>         err = reftable_stack_reload(st2);
>         EXPECT_ERR(err);
>
>         err = reftable_stack_add(st2, &write_test_ref, &ref2);
>         EXPECT_ERR(err);
> -       reftable_stack_destroy(st1);
>         reftable_stack_destroy(st2);
>         clear_dir(dir);
>  }
> --
> gitgitgadget
>


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

  reply	other threads:[~2020-12-01 14:34 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28  6:44 [PATCH 0/6] Minimal patches to let reftable pass the CI builds Johannes Schindelin via GitGitGadget
2020-11-28  6:44 ` [PATCH 1/6] fixup! reftable: rest of library Johannes Schindelin via GitGitGadget
2020-12-01 14:32   ` Han-Wen Nienhuys [this message]
2020-12-02 10:57     ` Johannes Schindelin
2020-12-02 18:31       ` Reftable locking on Windows (Re: [PATCH 1/6] fixup! reftable: rest of library) Han-Wen Nienhuys
2020-12-03 12:24         ` Ævar Arnfjörð Bjarmason
2020-12-03 13:56           ` Han-Wen Nienhuys
2020-11-28  6:44 ` [PATCH 2/6] fixup! reftable: utility functions Johannes Schindelin via GitGitGadget
2020-11-28  6:44 ` [PATCH 3/6] fixup! reftable: rest of library Johannes Schindelin via GitGitGadget
2020-12-01 10:26   ` Jeff King
2020-12-01 11:10     ` Han-Wen Nienhuys
2020-12-01 11:57       ` Jeff King
2020-11-28  6:44 ` [PATCH 4/6] " Johannes Schindelin via GitGitGadget
2020-11-28  6:44 ` [PATCH 5/6] " Johannes Schindelin via GitGitGadget
2020-11-28  6:44 ` [PATCH 6/6] " Johannes Schindelin via GitGitGadget
2020-12-01 10:28   ` Jeff King
2020-12-01 14:24     ` Johannes Schindelin
2020-12-02  1:50       ` Jeff King
2020-12-02 11:01         ` Han-Wen Nienhuys
2020-12-02 12:43           ` Jeff King
2020-11-30 14:26 ` [PATCH 0/6] Minimal patches to let reftable pass the CI builds Han-Wen Nienhuys
2020-12-01 14:18   ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFQ2z_N=sZuYorZxNLn-OoWjb6y=1zrc7sCPuKzZHQODi08s+Q@mail.gmail.com' \
    --to=hanwen@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=hanwenn@gmail.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).