git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Alban Gruin <alban.gruin@gmail.com>
Cc: Elijah Newren <newren@gmail.com>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Comparing rebase --am with --interactive via p3400
Date: Thu, 2 Jan 2020 21:17:36 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2001022108280.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <16add63e-a631-5ebf-bbbe-17823d942ee9@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6575 bytes --]

Hi Alban & Elijah,

On Sun, 29 Dec 2019, Alban Gruin wrote:

> Hi Elijah,
>
> Le 27/12/2019 à 23:45, Elijah Newren a écrit :
> > Hi Alban,
> >
> > On Fri, Dec 27, 2019 at 1:11 PM Alban Gruin <alban.gruin@gmail.com> wrote:
> >>
> >> Hi Johannes & Elijah,
> >>
> >> Le 01/02/2019 à 07:04, Johannes Schindelin a écrit :
> >>> Hi Elijah,
> >>>
> >>> as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the
> >>> --am backend) and then with --keep-empty to force the interactive backend
> >>> to be used. Here are the best of 10, on my relatively powerful Windows 10
> >>> laptop, with current `master`.
> >>>
> >>> With regular rebase --am:
> >>>
> >>> 3400.2: rebase on top of a lot of unrelated changes             5.32(0.06+0.15)
> >>> 3400.4: rebase a lot of unrelated changes without split-index   33.08(0.04+0.18)
> >>> 3400.6: rebase a lot of unrelated changes with split-index      30.29(0.03+0.18)
> >>>
> >>> with --keep-empty to force the interactive backend:
> >>>
> >>> 3400.2: rebase on top of a lot of unrelated changes             3.92(0.03+0.18)
> >>> 3400.4: rebase a lot of unrelated changes without split-index   33.92(0.03+0.22)
> >>> 3400.6: rebase a lot of unrelated changes with split-index      38.82(0.03+0.16)
> >>>
> >>> I then changed it to -m to test the current scripted version, trying to
> >>> let it run overnight, but my laptop eventually went to sleep and the tests
> >>> were not even done. I'll let them continue and report back.
> >>>
> >>> My conclusion after seeing these numbers is: the interactive rebase is
> >>> really close to the performance of the --am backend. So to me, it makes a
> >>> total lot of sense to switch --merge over to it, and to make --merge the
> >>> default. We still should investigate why the split-index performance is so
> >>> significantly worse, though.
> >>>
> >>> Ciao,
> >>> Dscho
> >>>
> >>
> >> I investigated a bit on this.  From a quick glance at a callgrind trace,
> >> I can see that ce_write_entry() is called 20 601[1] times with `git am',
> >> but 739 802 times with the sequencer when the split-index is enabled.
> >
> > Sweet, thanks for digging in and analyzing this.
> >
> >> For reference, here are the timings, measured on my Linux machine, on a
> >> tmpfs, with git.git as the repo:
> >>
> >> `rebase --am':
> >>> 3400.2: rebase on top of a lot of unrelated changes             0.29(0.24+0.03)
> >>> 3400.4: rebase a lot of unrelated changes without split-index   6.77(6.51+0.22)
> >>> 3400.6: rebase a lot of unrelated changes with split-index      4.43(4.29+0.13)
> >> `rebase --quiet':
> >
> > --quiet?  Isn't that flag supposed to work with both backends and not
> > imply either one?  We previously used --keep-empty, though there's a
> > chance that flag means we're not doing a fair comparison (since 'am'
> > will drop empty commits and thus have less work to do).  Is there any
> > chance you actually ran a different command, but when you went to
> > summarize just typed the wrong flag name?  Anyway, the best would
> > probably be to use --merge here (at the time Johannes and I were
> > testing, that wouldn't have triggered the sequencer, but it does now),
> > after first applying the en/rebase-backend series just to make sure
> > we're doing an apples to apples comparison.  However, I suspect that
> > empty commits probably weren't much of a factor and you did find some
> > interesting things...
> >
>
> Yes, I did use `--keep-empty' but misremembered it when writing this email…
>
> >>> 3400.2: rebase on top of a lot of unrelated changes             0.24(0.21+0.02)
> >>> 3400.4: rebase a lot of unrelated changes without split-index   5.60(5.32+0.27)
> >>> 3400.6: rebase a lot of unrelated changes with split-index      5.67(5.40+0.26)
> >>
> >> This comes from two things:
> >>
> >> 1. There is not enough shared entries in the index with the sequencer.
> >>
> >> do_write_index() is called only by do_write_locked_index() with `--am',
> >> but is also called by write_shared_index() with the sequencer once for
> >> every other commit.  As the latter is only called by
> >> write_locked_index(), which means that too_many_not_shared_entries()
> >> returns true for the sequencer, but never for `--am'.
> >>
> >> Removing the call to discard_index() in do_pick_commit() (as in the
> >> first attached patch) solve this particular issue, but this would
> >> require a more thorough analysis to see if it is actually safe to do.
> >
> > I'm actually surprised the sequencer would call discard_index(); I
> > would have thought it would have relied on merge_recursive() to do the
> > necessary index changes and updates other than writing the new index
> > out.  But I'm not quite as familar with the sequencer so perhaps
> > there's some reason I'm unaware of.  (Any chance this is a left-over
> > from when sequencer invoked external scripts to do the work, and thus
> > the index was updated in another processes' memory and on disk, and it
> > had to discard and re-read to get its own process updated?)
> >
>
> The sequencer re-reads the index after invoking an external command
> (either `git checkout', `git merge' or an `exec' command from the todo
> list), which makes sense.  But this one seems to come from 6eb1b437933
> ("cherry-pick/revert: make direct internal call to merge_tree()",
> 2008-09-02).  So, yes, quite old, and perhaps no longer justified.

Right. This commit also moved the `discard_cache()` call outside from the
`else` clause of the `if (no_commit)`.

That `else` clause goes all the way back to 9509af686bf (Make git-revert &
git-cherry-pick a builtin, 2007-03-01), and I admit freely that my memory
is no longer fresh on the specifics of this patch.

Looking at that patch, I think I simply discarded the index because a
subsequent code path would spawn the `git merge-recursive` process, which
would have changed the index externally.

> I know I had to add another discard_cache() in rebase--interactive.c
> because it broke something with the submodules, but it does not seems
> all that useful now that rebase.c no longer has to fork to use the
> sequencer.

FWIW I agree. The code is still quite complex at this point, but
infinitely more readable (thank you Elijah for taking point on simplifying
merge-recursive.c!). So I think that it might be the right point in time
to make sure that the index is not re-read and re-discarded over and over
again.

Thanks,
Dscho

  reply	other threads:[~2020-01-02 20:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-01  6:04 Comparing rebase --am with --interactive via p3400 Johannes Schindelin
2019-02-01  7:22 ` Johannes Schindelin
2019-02-01  9:26 ` Elijah Newren
2019-12-27 21:11 ` Alban Gruin
2019-12-27 22:45   ` Elijah Newren
2019-12-29 17:25     ` Alban Gruin
2020-01-02 20:17       ` Johannes Schindelin [this message]
2020-01-31 21:23   ` Johannes Schindelin
2020-04-01 11:33     ` Alban Gruin
2020-04-01 14:00       ` Phillip Wood
2020-04-04 20:33         ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2001022108280.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=alban.gruin@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).