git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Brandon Williams" <bmwill@google.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH 07/10] t1404: demonstrate two problems with reference transactions
Date: Wed, 30 Aug 2017 10:21:14 -0700	[thread overview]
Message-ID: <CAGZ79ka+uaBD_1xk6sx9J83KuP4A_mUWVqtyniNusbDeQiUZyQ@mail.gmail.com> (raw)
In-Reply-To: <caaa44126f18869158872e5473e53478db780ba9.1503993268.git.mhagger@alum.mit.edu>

On Tue, Aug 29, 2017 at 1:20 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Currently, a loose reference is deleted even before locking the
> `packed-refs` file, let alone deleting any packed version of the
> reference. This leads to two problems, demonstrated by two new tests:
>
> * While a reference is being deleted, other processes might see the
>   old, packed value of the reference for a moment before the packed
>   version is deleted. Normally this would be hard to observe, but we
>   can prolong the window by locking the `packed-refs` file externally
>   before running `update-ref`, then unlocking it before `update-ref`'s
>   attempt to acquire the lock times out.
>
> * If the `packed-refs` file is locked so long that `update-ref` fails
>   to lock it, then the reference can be left permanently in the
>   incorrect state described in the previous point.
>
> In a moment, both problems will be fixed.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
> The first of the new tests is rather involved; it uses two background
> processes plus a foreground process that polls the value of a
> reference. But despite sensitivity to timing, I think it should be
> robust even in this broken state. Once the functionality being tested
> is fixed, this test should never produce false positives, though
> really bad timing (e.g., if it takes more than a second for
> `update-ref` to get going) could still lead to false negatives.
>
> Each of the new tests takes about a second to run because they
> simulate lock contention.
>
> If anybody has suggestions for better ways to test these things,
> please speak up :-)
>
>  t/t1404-update-ref-errors.sh | 71 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 71 insertions(+)
>
> diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
> index c34ece48f5..752f83c377 100755
> --- a/t/t1404-update-ref-errors.sh
> +++ b/t/t1404-update-ref-errors.sh
> @@ -404,4 +404,75 @@ test_expect_success 'broken reference blocks indirect create' '
>         test_cmp expected output.err
>  '
>
> +test_expect_failure 'no bogus intermediate values during delete' '
> +       prefix=refs/slow-transaction &&
> +       # Set up a reference with differing loose and packed versions:
> +       git update-ref $prefix/foo $C &&
> +       git pack-refs --all &&
> +       git update-ref $prefix/foo $D &&
> +       git for-each-ref $prefix >unchanged &&
> +       # Now try to update the reference, but hold the `packed-refs` lock
> +       # for a while to see what happens while the process is blocked:
> +       : >.git/packed-refs.lock &&
> +       test_when_finished "rm -f .git/packed-refs.lock" &&
> +       {
> +               sleep 1 &&
> +               rm -f .git/packed-refs.lock &
> +       } &&
> +       pid1=$! &&
> +       {
> +               # Note: the following command is intentionally run in the
> +               # background. We extend the timeout so that `update-ref`
> +               # tries to acquire the `packed-refs` lock longer than it
> +               # takes the background process above to delete it:
> +               git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
> +       } &&
> +       pid2=$! &&
> +       ok=true &&
> +       while kill -0 $pid2 2>/dev/null

    If sig is 0, then no signal is sent, but error checking is still
    performed; this can be used to check for the existence of a
    process ID or process group ID.

So the kill -0 is the idiomatic form of "while $pid2 is still alive"?
ignoring errors due to the dev/null redirection?

And due to the nature of this test we have to have a busy
loop, we cannot rate limit the cpu usage inside the loop
via some shorter sleeps, as ideally we want to observe
the ref at any time.

    In an ideal world this test would instruct the kernel to interrupt
    the executing program (update-ref) at certain events such as
    touching/writing/deleting files and in each interrupt we could
    inspect the file system in a read only fashion.


> +       do
> +               sha1=$(git rev-parse --verify --quiet $prefix/foo || echo undefined) &&
> +               case "$sha1" in
> +               $D)
> +                       # This is OK; it just means that nothing has happened yet.
> +                       : ;;
> +               undefined)
> +                       # This is OK; it means the deletion was successful.
> +                       : ;;
> +               $C)
> +                       # This value should never be seen. Probably the loose
> +                       # reference has been deleted but the packed reference
> +                       # is still there:
> +                       echo "$prefix/foo incorrectly observed to be C" &&
> +                       break
> +                       ;;
> +               *)
> +                       # WTF?
> +                       echo "$prefix/foo unexpected value observed: $sha1" &&
> +                       break
> +                       ;;
> +               esac
> +       done >out &&
> +       wait $pid1 &&
> +       wait $pid2 &&

oh, you use explicit pids here to check each exit code.

> If anybody has suggestions for better ways to test these things,
> please speak up :-)

I don't think I'd have a satisfactory answer to that, as the timing is inherent
to the things we test. In other software projects that are less low level, I
would have suggested to use a time/clock mock, which can be stopped
and then inspection can be performed at defined states.

  reply	other threads:[~2017-08-30 17:21 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-29  8:20 [PATCH 00/10] Implement transactions for the packed ref store Michael Haggerty
2017-08-29  8:20 ` [PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock Michael Haggerty
2017-09-08  6:52   ` Jeff King
2017-09-08 10:02     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 02/10] struct ref_transaction: add a place for backends to store data Michael Haggerty
2017-09-08  7:02   ` Jeff King
2017-09-08  8:19     ` Michael Haggerty
2017-09-08  8:33       ` Jeff King
2017-08-29  8:20 ` [PATCH 03/10] packed_ref_store: implement reference transactions Michael Haggerty
2017-08-29  8:20 ` [PATCH 04/10] packed_delete_refs(): implement method Michael Haggerty
2017-08-29 18:07   ` Brandon Williams
2017-08-30  3:00     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs Michael Haggerty
2017-08-29  8:20 ` [PATCH 06/10] files_initial_transaction_commit(): use a transaction for " Michael Haggerty
2017-09-08  7:27   ` Jeff King
2017-09-08 10:04     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 07/10] t1404: demonstrate two problems with reference transactions Michael Haggerty
2017-08-30 17:21   ` Stefan Beller [this message]
2017-08-31  3:42     ` Michael Haggerty
2017-09-08  4:44   ` Junio C Hamano
2017-09-08  7:45     ` Jeff King
2017-09-08 10:06     ` Michael Haggerty
2017-08-29  8:20 ` [PATCH 08/10] files_ref_store: use a transaction to update packed refs Michael Haggerty
2017-09-08  7:38   ` Jeff King
2017-09-08 12:44     ` Michael Haggerty
2017-09-08 12:57       ` Jeff King
2017-08-29  8:20 ` [PATCH 09/10] packed-backend: rip out some now-unused code Michael Haggerty
2017-08-29 18:24   ` Brandon Williams
2017-08-29  8:20 ` [PATCH 10/10] files_transaction_finish(): delete reflogs before references Michael Haggerty
2017-08-29 18:30 ` [PATCH 00/10] Implement transactions for the packed ref store Brandon Williams
2017-09-08  7:42 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79ka+uaBD_1xk6sx9J83KuP4A_mUWVqtyniNusbDeQiUZyQ@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=avarab@gmail.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).