From: Eric Rannaud <e@nanocritical.com>
To: Elijah Newren <newren@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Jeremy Serror <jeremy.serror@gmail.com>
Subject: Re: [PATCH v4] fast-import: checkpoint: only write out refs and tags if we changed them
Date: Mon, 27 May 2019 15:45:49 -0700 [thread overview]
Message-ID: <CAH_=xoZ5Lt2nn50cfDbTA-ZVeshi8HZ91Od4btuUXpBGCanGng@mail.gmail.com> (raw)
In-Reply-To: <20190527211245.12292-1-newren@gmail.com>
On Mon, May 27, 2019 at 2:12 PM Elijah Newren <newren@gmail.com> wrote:
> On Fri, May 24, 2019 at 4:10 PM Eric Rannaud <e@nanocritical.com> wrote:
> >
> > With this patch, we keep track of whether a particular branch or tag has
> > been changed by this fast-import process since our last checkpoint (or
> > startup). At the next checkpoint, only refs and tags that new commands
> > have changed are written to disk. To be clear, fast-import still does
> > not update its internal view of branches in response to external
> > changes, but now it avoids interfering with external changes unless
> > there was an explicit command to do so since the last checkpoint.
>
> This patch looks reasonable to me; you can view it one of two basic ways:
> 1) an optimization (only update what needs to be updated)
> 2) a way to have early access to a completed branch
>
> In particular, if the fast-import stream is dealing with unrelated (or
> very divergent) branches and it completes one branch before others,
> then your change allows people to have early access to both read &
> update the completed branches while the import continues working on
> other branches.
That's a good way to describe how people with a more standard use case
than mine can benefit from my patch.
> If you changed your description to sell it this way, it'd be fine
> as-is. But you use --force and make multiple mentions of concurrent
> updates to branches in external processes during the fast-import
> process. That kind of description makes it really clear we need to
> tighten up what happens with the checkpoint command when it hits a
> failure (as mentioned in the commentary on V3). Below is a patch that
> does this.
>
> I think we should either:
>
> 1) update your commit message to sell it without mentioning the
> concurrent rewrites, and then I'll update my patch to not
> conflict (we both add new tests at the same location to the same
> file causing a simple conflict) by building on yours, OR
>
> 2) update your patch to come after mine and add a comment to your
> commit message about how checkpoint will abort if it hits an
> error, suggesting that people should only update branches
> fast-import will not be updating further or they should use
> --force and deal with their changes being overwritten by
> fast-import. Then you can submit our two patches as a series.
>
> Thoughts?
(2) is probably the most logical order to do things. I have a few
comments on your patch, below.
> -- 8< --
> Subject: [PATCH] fast-import: remember to check failure flag when
> checkpointing
>
> fast-import, when finished, will flush out the current packfile and
> update branches, tags, and marks, returning whether it was successful.
> The point of the 'checkpoint' command was to do all the same things, but
> continue processing the stream of input commands if it was successful.
> Unfortunately, the checkpoint code forgot to check the 'failure' flag to
> see if there was an error in e.g. updating the branches, which meant it
> would also continue if there was a failure. Fix this oversight.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> fast-import.c | 2 ++
> t/t9300-fast-import.sh | 42 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/fast-import.c b/fast-import.c
> index f38d04fa58..d0e12b03a0 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -3033,6 +3033,8 @@ static void checkpoint(void)
> dump_branches();
> dump_tags();
> dump_marks();
> + if (failure)
> + exit(1);
> }
>
> static void parse_checkpoint(void)
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 3668263c40..788a543b82 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -3262,6 +3262,48 @@ test_expect_success PIPE 'V: checkpoint updates tags after tag' '
> background_import_still_running
> '
>
> +background_fast_import_race () {
> + options=$1
> + input_file=$2
> + exit_status=$3
> + extra_cmd="$4"
> +
> + mkfifo V.input
> + exec 8<>V.input
> + rm V.input
> +
> + git fast-import $options <&8 &
> + echo $! >V.pid &&
> + $extra_cmd &&
> +
> + cat $input_file >&8
> + wait $(cat V.pid)
> + test $? -eq $exit_status
> +}
I think the test will not fail without the patch (and therefore won't
catch a regression): while checkpoint doesn't currently check the
failure flag, it will be checked when cmd_main() returns so
exit_status will be 1.
You either want to:
- (a) add more independent commands after the checkpoint and check
that they were not run,
- or (b) run with --done, do not include a done command, and check
that fast-import does exit (but it's racy),
- or (c) you can reuse background_import to have a more explicit
sequence of events (in which case improvements to background_import
from my patch would have to be committed first).
Something like: (I did not try the following code)
cat >input <<-INPUT_END &&
commit refs/heads/V4
mark :1
committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
data 0
from refs/heads/U
commit refs/heads/V4_child
mark :1
committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
data 0
from refs/heads/V4^0
checkpoint
reset refs/heads/V5
from refs/heads/U
progress do: shell
progress git branch V5 V4
checkpoint
reset refs/heads/V5
from refs/heads/V4_child
progress do: checkpoint and stop
INPUT_END
background_import "" input &&
# V5 remains equal to V4 as set by the external "git branch" command:
#
# V5 is not updated by the second checkpoint (as resetting V5 to U isn't a
# fast-forward with respect to V4), and V5 is not updated by the reset to
# V4_child after the second checkpoint (which would be a fast-forward with
# respect to V4) as fast-import immediately exits on a failed checkpoint.
test "$(git rev-parse --verify V5)" = "$(git rev-parse --verify V4)" &&
# fast-import is not running anymore (but do kill it if it is):
! kill -9 $(cat V.pid) &> /dev/null &&
# fast-import exited with an error:
test $(wait $(cat V.pid)) -eq 1
> +test_expect_success PIPE 'V: checkpoint fails if refs updated beforehand' '
> + git checkout --orphan V3 &&
> + git commit --allow-empty -m initial &&
> + INITIAL=$(git rev-parse HEAD) &&
> +
> + cat >input <<-INPUT_END &&
> + feature done
> + commit refs/heads/V3
> + mark :3
> + committer Me My <self@and.eye> 1234567890 +0123
You likely want to use:
committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> + data 10
> + generated
> + from $INITIAL
> +
> + checkpoint
> + done
> + INPUT_END
> +
> + background_fast_import_race "" input 1 "git commit --allow-empty -m conflicting" &&
> + background_fast_import_race "--force" input 0 "git commit --allow-empty -m conflicting"
> +'
> +
> +
> ###
> ### series W (get-mark and empty orphan commits)
> ###
> --
> 2.22.0.rc1.dirty
>
next prev parent reply other threads:[~2019-05-27 22:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-24 23:09 [PATCH v4] fast-import: checkpoint: only write out refs and tags if we changed them Eric Rannaud
2019-05-25 2:18 ` Eric Rannaud
2019-05-27 21:12 ` Elijah Newren
2019-05-27 22:45 ` Eric Rannaud [this message]
2019-05-27 23:27 ` Elijah Newren
2019-05-28 0:11 ` Eric Rannaud
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAH_=xoZ5Lt2nn50cfDbTA-ZVeshi8HZ91Od4btuUXpBGCanGng@mail.gmail.com' \
--to=e@nanocritical.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jeremy.serror@gmail.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).