From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
Eric Sunshine <sunshine@sunshineco.com>,
Elijah Newren <newren@gmail.com>,
Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v2 1/7] diffcore-rename: use a mem_pool for exact rename detection's hashmap
Date: Thu, 29 Jul 2021 03:58:35 +0000
Message-ID: <ea08b34d29b999c1df54bce8dc668f1a23915471.1627531121.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.990.v2.git.1627531121.gitgitgadget@gmail.com>
From: Elijah Newren <newren@gmail.com>
Exact rename detection, via insert_file_table(), uses a hashmap to store
files by oid. Use a mem_pool for the hashmap entries so these can all be
allocated and deallocated together.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
                 Before                 After
no-renames:      204.2 ms ± 3.0 ms      202.5 ms ± 3.2 ms
mega-renames:    1.076 s ± 0.015 s      1.072 s ± 0.012 s
just-one-mega:   364.1 ms ± 7.0 ms      357.3 ms ± 3.9 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
---
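Note for readers unfamiliar with pool allocation: the sketch below
illustrates the general "allocate many small entries, release them all
together" pattern that the commit message describes. It is a toy bump
allocator written for illustration only and is not git's mem_pool
implementation; every name in it (toy_pool, toy_block, toy_entry,
toy_pool_init, toy_pool_alloc, toy_pool_discard, toy_demo) is
hypothetical. The 32 KiB block size mirrors the value passed to
mem_pool_init() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool for illustration; NOT git's struct mem_pool. */
struct toy_block {
	struct toy_block *next;
	size_t used, cap;
	max_align_t data[];	/* payload, aligned for any object type */
};

struct toy_pool {
	struct toy_block *head;
	size_t block_size;
	size_t nr_blocks;	/* number of malloc() calls made so far */
};

static void toy_pool_init(struct toy_pool *p, size_t block_size)
{
	p->head = NULL;
	p->block_size = block_size;
	p->nr_blocks = 0;
}

static void *toy_pool_alloc(struct toy_pool *p, size_t len)
{
	const size_t align = _Alignof(max_align_t);
	struct toy_block *b = p->head;
	void *ret;

	/* Round up so every returned pointer stays suitably aligned. */
	len = (len + align - 1) / align * align;

	/* Grab a fresh block only when the current one cannot fit len. */
	if (!b || b->cap - b->used < len) {
		size_t cap = len > p->block_size ? len : p->block_size;

		b = malloc(sizeof(*b) + cap);
		if (!b)
			abort();
		b->used = 0;
		b->cap = cap;
		b->next = p->head;
		p->head = b;
		p->nr_blocks++;
	}
	ret = (char *)b->data + b->used;
	b->used += len;
	return ret;
}

/* Release every entry at once by freeing the underlying blocks. */
static void toy_pool_discard(struct toy_pool *p)
{
	while (p->head) {
		struct toy_block *next = p->head->next;

		free(p->head);
		p->head = next;
	}
	p->nr_blocks = 0;
}

/* Stand-in for a small per-file hashmap entry. */
struct toy_entry {
	int index;
	const char *name;
};

/*
 * Allocate n entries from one pool, then drop them all together.
 * Returns how many blocks (malloc() calls) were needed.
 */
static size_t toy_demo(size_t n)
{
	struct toy_pool pool;
	size_t i, nr_blocks;

	toy_pool_init(&pool, 32 * 1024);
	for (i = 0; i < n; i++) {
		struct toy_entry *e = toy_pool_alloc(&pool, sizeof(*e));

		e->index = (int)i;
		e->name = NULL;
	}
	nr_blocks = pool.nr_blocks;
	toy_pool_discard(&pool);
	assert(pool.head == NULL);
	return nr_blocks;
}
```

With a per-entry xmalloc(), n entries cost n allocations and n frees;
in the sketch, toy_demo(1000) is served from a single block and
released by a single walk over the block list. The flip side, as the
in-code comment in this patch notes, is that a pool holds on to whole
blocks until it is discarded, which is why discarding it early can
lower peak memory usage.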
diffcore-rename.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 4ef0459cfb5..73d884099eb 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -317,10 +317,11 @@ static int find_identical_files(struct hashmap *srcs,
}
static void insert_file_table(struct repository *r,
+ struct mem_pool *pool,
struct hashmap *table, int index,
struct diff_filespec *filespec)
{
- struct file_similarity *entry = xmalloc(sizeof(*entry));
+ struct file_similarity *entry = mem_pool_alloc(pool, sizeof(*entry));
entry->index = index;
entry->filespec = filespec;
@@ -336,7 +337,8 @@ static void insert_file_table(struct repository *r,
* and then during the second round we try to match
* cache-dirty entries as well.
*/
-static int find_exact_renames(struct diff_options *options)
+static int find_exact_renames(struct diff_options *options,
+ struct mem_pool *pool)
{
int i, renames = 0;
struct hashmap file_table;
@@ -346,7 +348,7 @@ static int find_exact_renames(struct diff_options *options)
*/
hashmap_init(&file_table, NULL, NULL, rename_src_nr);
for (i = rename_src_nr-1; i >= 0; i--)
- insert_file_table(options->repo,
+ insert_file_table(options->repo, pool,
&file_table, i,
rename_src[i].p->one);
@@ -354,8 +356,8 @@ static int find_exact_renames(struct diff_options *options)
for (i = 0; i < rename_dst_nr; i++)
renames += find_identical_files(&file_table, i, options);
- /* Free the hash data structure and entries */
- hashmap_clear_and_free(&file_table, struct file_similarity, entry);
+ /* Free the hash data structure (entries will be freed with the pool) */
+ hashmap_clear(&file_table);
return renames;
}
@@ -1341,6 +1343,7 @@ void diffcore_rename_extended(struct diff_options *options,
int num_destinations, dst_cnt;
int num_sources, want_copies;
struct progress *progress = NULL;
+ struct mem_pool local_pool;
struct dir_rename_info info;
struct diff_populate_filespec_options dpf_options = {
.check_binary = 0,
@@ -1409,11 +1412,18 @@ void diffcore_rename_extended(struct diff_options *options,
goto cleanup; /* nothing to do */
trace2_region_enter("diff", "exact renames", options->repo);
+ mem_pool_init(&local_pool, 32*1024);
/*
* We really want to cull the candidates list early
* with cheap tests in order to avoid doing deltas.
*/
- rename_count = find_exact_renames(options);
+ rename_count = find_exact_renames(options, &local_pool);
+ /*
+ * Discard local_pool immediately instead of at "cleanup:" in order
+ * to reduce maximum memory usage; inexact rename detection uses up
+ * a fair amount of memory, and mem_pools can too.
+ */
+ mem_pool_discard(&local_pool, 0);
trace2_region_leave("diff", "exact renames", options->repo);
/* Did we only want exact renames? */
--
gitgitgadget