git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Jeff King" <peff@peff.net>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"David Turner" <novalis@novalis.org>,
	"Brandon Williams" <bmwill@google.com>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
Date: Thu, 27 Jul 2017 23:07:22 -0700	[thread overview]
Message-ID: <CAMy9T_GP1SRigZHonC8PTPn4RoKGwa31ZTCT04C+d0ks6k8jxA@mail.gmail.com> (raw)
In-Reply-To: <xmqqlgn9mzww.fsf@gitster.mtv.corp.google.com>

On Thu, Jul 27, 2017 at 12:40 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Thu, Jul 27, 2017 at 10:19:48AM -0700, Junio C Hamano wrote:
>>
>>> Makes sense.  Makes me wonder why we need a separate .new file
>>> (instead of writing into the .lock instead), but that is a different
>>> issue.
>>
>> It comes from 42dfa7ece (commit_packed_refs(): use a staging file
>> separate from the lockfile, 2017-06-23). That commit explains that we
>> want to be able to put the new contents into service before we release
>> the lock. But it doesn't say why that's useful.
>
> By being able to hold the lock on packed-refs longer, I guess
> something like this becomes possible:
>
>  * hold the lock on packed-refs
>  * hold the lock on loose ref A, B, C, ...
>  * update packed-refs to include the freshest values of these refs
>  * start serving packed-refs without releasing the lock
>  * for X in A, B, C...: delete the loose ref X and unlock X
>  * unlock the packed-refs

Hmmm, I thought that I explained somewhere why this is useful, but I
can't find it now.

The code even after this patch series is

Prepare:
1. lock loose refs
Finish:
2. perform creates and updates of loose refs, releasing their locks as we go
3. perform deletes of loose refs
4. lock packed-refs file
5. repack_without_refs() (delete packed versions of refs to be deleted)
6. commit new packed-refs file (which also unlocks it)
7. release locks on loose refs that were deleted

This is problematic because after a loose reference is deleted, it
briefly (between steps 3 and 5) reveals any packed version of the
reference, which might even point at a commit that has been GCed. It
is even worse if the attempt to acquire the packed refs lock in step 4
fails, because then the repo could be left in that broken state.

Clearly we can avoid the second problem by acquiring the packed-refs
lock during the prepare phase. You might also attempt to fix the first
problem by deleting the references from the packed-refs file before we
delete the corresponding loose references:

Prepare:
1. lock loose refs
2. lock packed-refs file
Finish:
3. perform creates and updates of loose refs, releasing their locks as we go
4. repack_without_refs() (delete packed versions of refs to be deleted)
5. commit new packed-refs file (which also unlocks it)
6. perform deletes of loose refs
7. release locks on loose refs that were deleted

But now we have a different problem. pack-refs does the following:

A. lock packed-refs file
B. read loose refs
C. commit new packed-refs file (which also unlocks it)
Then, for each loose reference that should be pruned, delete it in a
transaction with REF_ISPRUNING, which means:
D. lock the loose reference
E. lock packed-refs file
F. check that the reference's value is the same as the value just
written to packed-refs
G. delete the loose ref file if it still exists
H. unlock the loose reference

If a reference deletion is running at the same time as a pack-refs,
you could have a race where steps A and B of a pack-refs run between
steps 5 and 6 of a reference deletion. The pack-refs would see the
loose reference and would pack it into the packed-refs file, leaving
the reference at its old value even though the user was told that the
reference was deleted.

If a new packed-refs file can be activated while retaining the lock on
the file, then a reference update could keep the packed-refs file
locked during step 6 of the second algorithm, which would block step A
of pack-refs from beginning.

I have the feeling that there might be other, more improbably races
lurking here, too.

The eternal problems with races in the files backend is a main reason
that I'm enthusiastic about the reftable proposal.

Michael

  reply	other threads:[~2017-07-28  6:07 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-01 18:30 [PATCH v3 00/30] Create a reference backend for packed refs Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 01/30] t1408: add a test of stale packed refs covered by loose refs Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 02/30] add_packed_ref(): teach function to overwrite existing refs Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 03/30] packed_ref_store: new struct Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 04/30] packed_ref_store: move `packed_refs_path` here Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 05/30] packed_ref_store: move `packed_refs_lock` member here Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 06/30] clear_packed_ref_cache(): take a `packed_ref_store *` parameter Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 07/30] validate_packed_ref_cache(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 08/30] get_packed_ref_cache(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 09/30] get_packed_refs(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 10/30] add_packed_ref(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 11/30] lock_packed_refs(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 12/30] commit_packed_refs(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 13/30] rollback_packed_refs(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 14/30] get_packed_ref(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 15/30] repack_without_refs(): " Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 16/30] packed_peel_ref(): new function, extracted from `files_peel_ref()` Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 17/30] packed_ref_store: support iteration Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 18/30] packed_read_raw_ref(): new function, replacing `resolve_packed_ref()` Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 19/30] packed-backend: new module for handling packed references Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 20/30] packed_ref_store: make class into a subclass of `ref_store` Michael Haggerty
2017-07-01 18:30 ` [PATCH v3 21/30] commit_packed_refs(): report errors rather than dying Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 22/30] commit_packed_refs(): use a staging file separate from the lockfile Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 23/30] packed_refs_lock(): function renamed from lock_packed_refs() Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 24/30] packed_refs_lock(): report errors via a `struct strbuf *err` Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 25/30] packed_refs_unlock(), packed_refs_is_locked(): new functions Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 26/30] clear_packed_ref_cache(): don't protest if the lock is held Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 27/30] commit_packed_refs(): remove call to `packed_refs_unlock()` Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 28/30] repack_without_refs(): don't lock or unlock the packed refs Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 29/30] t3210: add some tests of bogus packed-refs file contents Michael Haggerty
2017-07-01 18:31 ` [PATCH v3 30/30] read_packed_refs(): die if `packed-refs` contains bogus data Michael Haggerty
2017-07-05  9:12 ` [PATCH v3 00/30] Create a reference backend for packed refs Jeff King
2017-07-20 23:05   ` Stefan Beller
2017-07-20 23:20     ` Jonathan Nieder
2017-07-26 23:39       ` [PATCH] packed_ref_store: handle a packed-refs file that is a symlink Michael Haggerty
2017-07-27  0:15         ` Stefan Beller
2017-07-27  0:18         ` Jonathan Nieder
2017-07-27 11:12           ` Michael Haggerty
2017-07-27 17:19         ` Junio C Hamano
2017-07-27 18:28           ` Jeff King
2017-07-27 19:40             ` Junio C Hamano
2017-07-28  6:07               ` Michael Haggerty [this message]
2021-05-31 14:18         ` Ævar Arnfjörð Bjarmason
2021-06-03 19:39           ` Jeff King
2021-06-03 19:58             ` [PATCH] t: use portable wrapper for readlink(1) Jeff King
2021-06-04 21:09               ` Ævar Arnfjörð Bjarmason
2021-06-03 20:23             ` [PATCH] packed_ref_store: handle a packed-refs file that is a symlink Felipe Contreras
2021-06-03 21:08               ` Jeff King
2021-06-03 22:25                 ` Felipe Contreras
2021-06-04 21:37                 ` Ævar Arnfjörð Bjarmason
2021-06-05  1:07                   ` Felipe Contreras
2021-06-04 21:12             ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMy9T_GP1SRigZHonC8PTPn4RoKGwa31ZTCT04C+d0ks6k8jxA@mail.gmail.com \
    --to=mhagger@alum.mit.edu \
    --cc=avarab@gmail.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=novalis@novalis.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).