git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] Final optimization batch (#15): use memory pools
@ 2021-07-23 12:54 Elijah Newren via GitGitGadget
  2021-07-23 12:54 ` [PATCH 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
                   ` (8 more replies)
  0 siblings, 9 replies; 65+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-07-23 12:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Elijah Newren

This series textually depends on en/ort-perf-batch-14, but the ideas are
orthogonal to it and orthogonal to previous series. It can be reviewed
independently.

This series is more about strmaps & memory pools than merge logic. CC'ing
Peff since he reviewed the strmap work[1], and that work included a number
of decisions that specifically had this series in mind.

[1]
https://lore.kernel.org/git/20201111200701.GB39046@coredump.intra.peff.net/

=== Basic Optimization idea ===

In this series, I make use of memory pools to get faster allocations and
deallocations for many data structures that tend to all be deallocated at
the same time anyway.

=== Results ===

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28), the
changes in just this series improves the performance as follows:

                     Before Series           After Series
no-renames:      204.2  ms ±  3.0  ms    198.3 ms ±  2.9 ms
mega-renames:      1.076 s ±  0.015 s    661.8 ms ±  5.9 ms
just-one-mega:   364.1  ms ±  7.0  ms    264.6 ms ±  2.5 ms


As a reminder, before any merge-ort/diffcore-rename performance work, the
performance results we started with were:

no-renames-am:      6.940 s ±  0.485 s
no-renames:        18.912 s ±  0.174 s
mega-renames:    5964.031 s ± 10.459 s
just-one-mega:    149.583 s ±  0.751 s


=== Overall Results across all optimization work ===

This is my final prepared optimization series. It might be worth reviewing
how my optimizations fared overall, comparing the original merge-recursive
timings with three things: how much merge-recursive improved (as a
side-effect of optimizing merge-ort), how much improvement we would have
gotten from a hypothetical infinite parallelization of rename detection, and
what I achieved at the end with merge-ort:

                               Timings

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename    merge-ort
                 v2.30.0      current     detection     current
                ----------   ---------   -----------   ---------
no-renames:       18.912 s    18.030 s     11.699 s     198.3 ms
mega-renames:   5964.031 s   361.281 s    203.886 s     661.8 ms
just-one-mega:   149.583 s    11.009 s      7.553 s     264.6 ms

                           Speedup factors

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
no-renames:         1           1.05         1.6           95
mega-renames:       1          16.5         29           9012
just-one-mega:      1          13.6         20            565


And, for partial clone users:

             Factor reduction in number of objects needed

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
mega-renames:       1            1            1          181.3


=== Caveat ===

It may be worth noting, though, that my optimization numbers above for
merge-ort use test-tool fast-rebase. git rebase -s ort on the three
testcases above is 5-20 times slower (taking 3.835s, 6.798s, and 1.235s,
respectively). At this point, any further optimization work should go into
making a faster full-featured rebase by copying the ideas from fast-rebase:
avoid unnecessary process forking, avoid updating the index and working copy
until either the rebase is finished or you hit a conflict (and don't write
rebase metadata to disk until that point either), get rid of the glacially
slow revision walking of the upstream side of history (nuke
can_fast_forward(), make --reapply-cherry-picks the default) or at least
don't revision walk so many times (multiple calls to get_merge_bases in
can_fast_forward() plus a is_linear_history() walk, checking for upstream
cherry-picks, probably more), turn off per-commit hooks that probably should
have never been on anyway, etc.

Elijah Newren (7):
  diffcore-rename: use a mem_pool for exact rename detection's hashmap
  merge-ort: set up a memory pool
  merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
  merge-ort: switch our strmaps over to using memory pools
  diffcore-rename, merge-ort: add wrapper functions for filepair
    alloc/dealloc
  merge-ort: store filepairs and filespecs in our mem_pool
  merge-ort: reuse path strings in pool_alloc_filespec

 diffcore-rename.c |  66 +++++++++++--
 diffcore.h        |   3 +
 merge-ort.c       | 247 +++++++++++++++++++++++++++++++++++-----------
 3 files changed, 251 insertions(+), 65 deletions(-)


base-commit: c9ada8369e6575be488028aae0f654422a9b1410
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-990%2Fnewren%2Fort-perf-batch-15-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-990/newren/ort-perf-batch-15-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/990
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2021-08-03 15:45 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-23 12:54 [PATCH 0/7] Final optimization batch (#15): use memory pools Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-23 21:59   ` Eric Sunshine
2021-07-23 22:03     ` Elijah Newren
2021-07-23 12:54 ` [PATCH 2/7] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 3/7] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-23 22:07   ` Eric Sunshine
2021-07-26 14:36   ` Derrick Stolee
2021-07-28 22:49     ` Elijah Newren
2021-07-29 15:26       ` Jeff King
2021-07-30  2:27         ` Elijah Newren
2021-07-30 16:12           ` Jeff King
2021-07-23 12:54 ` [PATCH 4/7] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 5/7] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 6/7] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-23 12:54 ` [PATCH 7/7] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-26 14:44 ` [PATCH 0/7] Final optimization batch (#15): use memory pools Derrick Stolee
2021-07-28 22:52   ` Elijah Newren
2021-07-29  3:58 ` [PATCH v2 " Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 2/7] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 3/7] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 4/7] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-29 15:28     ` Jeff King
2021-07-29 18:37       ` Elijah Newren
2021-07-29 20:09         ` Jeff King
2021-07-30  2:30           ` Elijah Newren
2021-07-30 16:12             ` Jeff King
2021-07-30 13:30           ` Ævar Arnfjörð Bjarmason
2021-07-30 14:36             ` Elijah Newren
2021-07-30 16:23               ` Ævar Arnfjörð Bjarmason
2021-07-29  3:58   ` [PATCH v2 5/7] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 6/7] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-29  3:58   ` [PATCH v2 7/7] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-29 14:58   ` [PATCH v2 0/7] Final optimization batch (#15): use memory pools Derrick Stolee
2021-07-29 16:20   ` Jeff King
2021-07-29 16:23     ` Jeff King
2021-07-29 19:46       ` Junio C Hamano
2021-07-29 20:48         ` Junio C Hamano
2021-07-29 21:05           ` Elijah Newren
2021-07-29 20:46     ` Elijah Newren
2021-07-29 21:14       ` Jeff King
2021-07-30 11:47   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 1/9] merge-ort: rename str{map,intmap,set}_func() Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 2/9] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 3/9] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 4/9] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 5/9] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 6/9] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 7/9] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 8/9] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-30 11:47     ` [PATCH v3 9/9] merge-ort: remove compile-time ability to turn off usage of memory pools Elijah Newren via GitGitGadget
2021-07-30 16:24       ` Jeff King
2021-07-31 17:27     ` [PATCH v4 0/9] Final optimization batch (#15): use " Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 1/9] merge-ort: rename str{map,intmap,set}_func() Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 2/9] diffcore-rename: use a mem_pool for exact rename detection's hashmap Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 3/9] merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 4/9] merge-ort: set up a memory pool Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 5/9] merge-ort: switch our strmaps over to using memory pools Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 6/9] diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 7/9] merge-ort: store filepairs and filespecs in our mem_pool Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 8/9] merge-ort: reuse path strings in pool_alloc_filespec Elijah Newren via GitGitGadget
2021-07-31 17:27       ` [PATCH v4 9/9] merge-ort: remove compile-time ability to turn off usage of memory pools Elijah Newren via GitGitGadget
2021-08-02 15:27       ` [PATCH v4 0/9] Final optimization batch (#15): use " Derrick Stolee
2021-08-03 15:45       ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).