Hi Johannes & Elijah, Le 01/02/2019 à 07:04, Johannes Schindelin a écrit : > Hi Elijah, > > as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the > --am backend) and then with --keep-empty to force the interactive backend > to be used. Here are the best of 10, on my relatively powerful Windows 10 > laptop, with current `master`. > > With regular rebase --am: > > 3400.2: rebase on top of a lot of unrelated changes 5.32(0.06+0.15) > 3400.4: rebase a lot of unrelated changes without split-index 33.08(0.04+0.18) > 3400.6: rebase a lot of unrelated changes with split-index 30.29(0.03+0.18) > > with --keep-empty to force the interactive backend: > > 3400.2: rebase on top of a lot of unrelated changes 3.92(0.03+0.18) > 3400.4: rebase a lot of unrelated changes without split-index 33.92(0.03+0.22) > 3400.6: rebase a lot of unrelated changes with split-index 38.82(0.03+0.16) > > I then changed it to -m to test the current scripted version, trying to > let it run overnight, but my laptop eventually went to sleep and the tests > were not even done. I'll let them continue and report back. > > My conclusion after seeing these numbers is: the interactive rebase is > really close to the performance of the --am backend. So to me, it makes a > total lot of sense to switch --merge over to it, and to make --merge the > default. We still should investigate why the split-index performance is so > significantly worse, though. > > Ciao, > Dscho > I investigated a bit on this. From a quick glance at a callgrind trace, I can see that ce_write_entry() is called 20 601[1] times with `git am', but 739 802 times with the sequencer when the split-index is enabled. For reference, here are the timings, measured on my Linux machine, on a tmpfs, with git.git as the repo: `rebase --am': > 3400.2: rebase on top of a lot of unrelated changes 0.29(0.24+0.03) > 3400.4: rebase a lot of unrelated changes without split-index 6.77(6.51+0.22) > 3400.6: rebase a lot of unrelated changes with split-index 4.43(4.29+0.13) `rebase --quiet': > 3400.2: rebase on top of a lot of unrelated changes 0.24(0.21+0.02) > 3400.4: rebase a lot of unrelated changes without split-index 5.60(5.32+0.27) > 3400.6: rebase a lot of unrelated changes with split-index 5.67(5.40+0.26) This comes from two things: 1. There is not enough shared entries in the index with the sequencer. do_write_index() is called only by do_write_locked_index() with `--am', but is also called by write_shared_index() with the sequencer once for every other commit. As the latter is only called by write_locked_index(), which means that too_many_not_shared_entries() returns true for the sequencer, but never for `--am'. Removing the call to discard_index() in do_pick_commit() (as in the first attached patch) solve this particular issue, but this would require a more thorough analysis to see if it is actually safe to do. After this, ce_write() is still called much more by the sequencer. Here are the results of `rebase --quiet' without discarding the index: > 3400.2: rebase on top of a lot of unrelated changes 0.23(0.19+0.04) > 3400.4: rebase a lot of unrelated changes without split-index 5.14(4.95+0.18) > 3400.6: rebase a lot of unrelated changes with split-index 5.02(4.87+0.15) The performance of the rebase is better in the two cases. 2. The base index is dropped by unpack_trees_start() and unpack_trees(). Now, write_shared_index() is no longer called and write_locked_index() is less expensive than before according to callgrind. But ce_write_entry() is still called 749 302 times (which is even more than before.) The only place where ce_write_entry() is called is in a loop in do_write_index(). The number of iterations is dictated by the size of the cache, and there is a trace2 probe dumping this value. For `--am', the value goes like this: 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, … up until 101. For the sequencer, it goes like this: 1, 1, 3697, 3697, 3698, 3698, 3699, 3699, … up until 3796. The size of the cache is set in prepare_to_write_split_index(). It grows if a cache entry has no index (most of them should have one by now), or if the split index has no base index (with `--am', the split index always has a base.) This comes from unpack_trees_start() -- it creates a new index, and unpack_trees() does not carry the base index, hence the size of the cache. The second attached patch (which is broken for the non-interactive rebase case) demonstrates what we could expect for the split-index case if we fix this: > 3400.2: rebase on top of a lot of unrelated changes 0.24(0.21+0.03) > 3400.4: rebase a lot of unrelated changes without split-index 5.81(5.62+0.17) > 3400.6: rebase a lot of unrelated changes with split-index 4.76(4.54+0.20) So, for everything related to the index, I think that’s it. [1] Numbers may vary, but they should remain in the same order of magnitude. Cheers, Alban