git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Alban Gruin <alban.gruin@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Comparing rebase --am with --interactive via p3400
Date: Sun, 29 Dec 2019 18:25:14 +0100	[thread overview]
Message-ID: <16add63e-a631-5ebf-bbbe-17823d942ee9@gmail.com> (raw)
In-Reply-To: <CABPp-BGCQR+MLVTDoaXDmPrE4SCu+dQ794X6Kvx1PpiQ=6D7KQ@mail.gmail.com>

Hi Elijah,

Le 27/12/2019 à 23:45, Elijah Newren a écrit :
> Hi Alban,
> 
> On Fri, Dec 27, 2019 at 1:11 PM Alban Gruin <alban.gruin@gmail.com> wrote:
>>
>> Hi Johannes & Elijah,
>>
>> Le 01/02/2019 à 07:04, Johannes Schindelin a écrit :
>>> Hi Elijah,
>>>
>>> as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the
>>> --am backend) and then with --keep-empty to force the interactive backend
>>> to be used. Here are the best of 10, on my relatively powerful Windows 10
>>> laptop, with current `master`.
>>>
>>> With regular rebase --am:
>>>
>>> 3400.2: rebase on top of a lot of unrelated changes             5.32(0.06+0.15)
>>> 3400.4: rebase a lot of unrelated changes without split-index   33.08(0.04+0.18)
>>> 3400.6: rebase a lot of unrelated changes with split-index      30.29(0.03+0.18)
>>>
>>> with --keep-empty to force the interactive backend:
>>>
>>> 3400.2: rebase on top of a lot of unrelated changes             3.92(0.03+0.18)
>>> 3400.4: rebase a lot of unrelated changes without split-index   33.92(0.03+0.22)
>>> 3400.6: rebase a lot of unrelated changes with split-index      38.82(0.03+0.16)
>>>
>>> I then changed it to -m to test the current scripted version, trying to
>>> let it run overnight, but my laptop eventually went to sleep and the tests
>>> were not even done. I'll let them continue and report back.
>>>
>>> My conclusion after seeing these numbers is: the interactive rebase is
>>> really close to the performance of the --am backend. So to me, it makes a
>>> total lot of sense to switch --merge over to it, and to make --merge the
>>> default. We still should investigate why the split-index performance is so
>>> significantly worse, though.
>>>
>>> Ciao,
>>> Dscho
>>>
>>
>> I investigated a bit on this.  From a quick glance at a callgrind trace,
>> I can see that ce_write_entry() is called 20 601[1] times with `git am',
>> but 739 802 times with the sequencer when the split-index is enabled.
> 
> Sweet, thanks for digging in and analyzing this.
> 
>> For reference, here are the timings, measured on my Linux machine, on a
>> tmpfs, with git.git as the repo:
>>
>> `rebase --am':
>>> 3400.2: rebase on top of a lot of unrelated changes             0.29(0.24+0.03)
>>> 3400.4: rebase a lot of unrelated changes without split-index   6.77(6.51+0.22)
>>> 3400.6: rebase a lot of unrelated changes with split-index      4.43(4.29+0.13)
>> `rebase --quiet':
> 
> --quiet?  Isn't that flag supposed to work with both backends and not
> imply either one?  We previously used --keep-empty, though there's a
> chance that flag means we're not doing a fair comparison (since 'am'
> will drop empty commits and thus have less work to do).  Is there any
> chance you actually ran a different command, but when you went to
> summarize just typed the wrong flag name?  Anyway, the best would
> probably be to use --merge here (at the time Johannes and I were
> testing, that wouldn't have triggered the sequencer, but it does now),
> after first applying the en/rebase-backend series just to make sure
> we're doing an apples to apples comparison.  However, I suspect that
> empty commits probably weren't much of a factor and you did find some
> interesting things...
> 

Yes, I did use `--keep-empty' but misremembered it when writing this email…

>>> 3400.2: rebase on top of a lot of unrelated changes             0.24(0.21+0.02)
>>> 3400.4: rebase a lot of unrelated changes without split-index   5.60(5.32+0.27)
>>> 3400.6: rebase a lot of unrelated changes with split-index      5.67(5.40+0.26)
>>
>> This comes from two things:
>>
>> 1. There is not enough shared entries in the index with the sequencer.
>>
>> do_write_index() is called only by do_write_locked_index() with `--am',
>> but is also called by write_shared_index() with the sequencer once for
>> every other commit.  As the latter is only called by
>> write_locked_index(), which means that too_many_not_shared_entries()
>> returns true for the sequencer, but never for `--am'.
>>
>> Removing the call to discard_index() in do_pick_commit() (as in the
>> first attached patch) solve this particular issue, but this would
>> require a more thorough analysis to see if it is actually safe to do.
> 
> I'm actually surprised the sequencer would call discard_index(); I
> would have thought it would have relied on merge_recursive() to do the
> necessary index changes and updates other than writing the new index
> out.  But I'm not quite as familar with the sequencer so perhaps
> there's some reason I'm unaware of.  (Any chance this is a left-over
> from when sequencer invoked external scripts to do the work, and thus
> the index was updated in another processes' memory and on disk, and it
> had to discard and re-read to get its own process updated?)
> 

The sequencer re-reads the index after invoking an external command
(either `git checkout', `git merge' or an `exec' command from the todo
list), which makes sense.  But this one seems to come from 6eb1b437933
("cherry-pick/revert: make direct internal call to merge_tree()",
2008-09-02).  So, yes, quite old, and perhaps no longer justified.

I know I had to add another discard_cache() in rebase--interactive.c
because it broke something with the submodules, but it does not seems
all that useful now that rebase.c no longer has to fork to use the
sequencer.

Cheers,
Alban


  reply	other threads:[~2019-12-29 18:15 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-01  6:04 Comparing rebase --am with --interactive via p3400 Johannes Schindelin
2019-02-01  7:22 ` Johannes Schindelin
2019-02-01  9:26 ` Elijah Newren
2019-12-27 21:11 ` Alban Gruin
2019-12-27 22:45   ` Elijah Newren
2019-12-29 17:25     ` Alban Gruin [this message]
2020-01-02 20:17       ` Johannes Schindelin
2020-01-31 21:23   ` Johannes Schindelin
2020-04-01 11:33     ` Alban Gruin
2020-04-01 14:00       ` Phillip Wood
2020-04-04 20:33         ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=16add63e-a631-5ebf-bbbe-17823d942ee9@gmail.com \
    --to=alban.gruin@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).