From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>
Subject: [RFC PATCH v2 9/9] diffcore-rename: filter rename_src list when possible
Date: Mon, 20 Nov 2017 14:19:44 -0800
Message-ID: <20171120221944.15431-10-newren@gmail.com>
In-Reply-To: <20171120221944.15431-1-newren@gmail.com>
We have to look at each entry in rename_src a total of rename_dst_nr
times.  When we're not detecting copies, any exact renames or ignorable
rename paths will just be skipped over.  While checking that these can
be skipped over is a relatively cheap check, it's still a waste of time
to do that check more than once, let alone rename_dst_nr times.  When
rename_src_nr is a few thousand times bigger than the number of relevant
sources (such as when cherry-picking a commit that only touched a
handful of files, but from a side of history that has different names
for some high-level directories), this time can add up.

First make an initial pass over the rename_src array and move all the
relevant entries to the front, so that we can iterate over just those
relevant entries.

In one particular testcase involving a large repository and some
high-level directories having been renamed, this cut the time necessary
for a cherry-pick down by a factor of about 2 (from around 34 seconds
down to just under 16 seconds).
Signed-off-by: Elijah Newren <newren@gmail.com>
---
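For readers who want to try the filtering idea outside of git, here is a
minimal standalone sketch of the "move the relevant entries to the front"
pass.  The struct src_entry type and the compact_unused() name are
hypothetical stand-ins chosen for illustration; they are not git's
diff_rename_src or any function in diffcore-rename.c.  The sketch assumes
that a nonzero rename_used flag means "already matched or ignorable".

```c
#include <stddef.h>

/* Hypothetical stand-in for one rename_src entry; not git's real struct. */
struct src_entry {
	const char *path;
	int rename_used;	/* nonzero: already matched or ignorable */
};

/*
 * Move every entry whose rename_used is zero to the front of the array,
 * keeping the relative order of those entries, and return how many of
 * them there are.  Slots at or past the returned count are left holding
 * stale copies and must not be read by the caller.
 */
static size_t compact_unused(struct src_entry *src, size_t nr)
{
	size_t j, new_j = 0;

	for (j = 0; j < nr; j++) {
		if (src[j].rename_used)
			continue;	/* drop this entry from the prefix */
		if (new_j < j)
			src[new_j] = src[j];	/* slide it down */
		new_j++;
	}
	return new_j;
}
```

The caller then loops over only the first compact_unused() entries, so the
skip check is paid once per source instead of once per (source,
destination) pair.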
diffcore-rename.c | 47 +++++++++++++++++++++++++++++++----------------
1 file changed, 31 insertions(+), 16 deletions(-)
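The marking step that feeds the filter walks two path lists in lockstep.
Below is a rough standalone sketch of that kind of walk, assuming both
lists are sorted in strcmp() order; the struct candidate type and the
mark_ignored() name are hypothetical and exist only for this
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a rename source; not git's real struct. */
struct candidate {
	const char *path;
	int rename_used;
};

/*
 * Mark every candidate whose path also appears in ignores[].
 * Both arrays are assumed to be sorted by strcmp() order, so a single
 * forward pass over each of them is enough.
 */
static void mark_ignored(const char **ignores, size_t ignores_nr,
			 struct candidate *src, size_t src_nr)
{
	size_t i = 0, j = 0;

	while (i < ignores_nr && j < src_nr) {
		int cmp;

		if (src[j].rename_used) {
			j++;		/* already accounted for */
			continue;
		}

		cmp = strcmp(ignores[i], src[j].path);
		if (cmp < 0) {
			i++;		/* ignore entry has no matching source */
		} else if (cmp > 0) {
			j++;		/* source is not in the ignore list */
		} else {
			src[j].rename_used++;
			i++;
			j++;
		}
	}
}
```

Each list index only moves forward, so the walk costs at most
ignores_nr + src_nr string comparisons.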
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 5bf5bf7379..e60abb5980 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -437,16 +437,14 @@ static int find_renames(struct diff_score *mx, int dst_cnt, int minimum_score, i
return count;
}
-static int handle_rename_ignores(struct diff_options *options)
+static void handle_rename_ignores(struct diff_options *options)
{
- int detect_rename = options->detect_rename;
struct string_list *ignores = options->ignore_for_renames;
- int ignored = 0;
int i, j;
/* rename_ignores only relevant when we're not detecting copies */
- if (ignores == NULL || detect_rename == DIFF_DETECT_COPY)
- return 0;
+ if (ignores == NULL)
+ return;
for (i = 0, j = 0; i < ignores->nr && j < rename_src_nr;) {
struct diff_filespec *one = rename_src[j].p->one;
@@ -464,11 +462,27 @@ static int handle_rename_ignores(struct diff_options *options)
j++;
else {
one->rename_used++;
- ignored++;
+ i++;
+ j++;
}
}
+}
+
+static int remove_renames_from_src(void)
+{
+ int j, new_j;
+
+ for (j = 0, new_j = 0; j < rename_src_nr; j++) {
+ if (rename_src[j].p->one->rename_used)
+ continue;
+
+ if (new_j < j)
+ memcpy(&rename_src[new_j], &rename_src[j],
+ sizeof(struct diff_rename_src));
+ new_j++;
+ }
- return ignored;
+ return new_j;
}
void diffcore_rename(struct diff_options *options)
@@ -479,7 +493,7 @@ void diffcore_rename(struct diff_options *options)
struct diff_queue_struct outq;
struct diff_score *mx;
int i, j, rename_count, skip_unmodified = 0;
- int num_create, dst_cnt, num_src, ignore_count;
+ int num_create, dst_cnt, num_src;
struct progress *progress = NULL;
if (!minimum_score)
@@ -542,18 +556,19 @@ void diffcore_rename(struct diff_options *options)
/*
* Mark source files as used if they are found in the
- * ignore_for_renames list.
+ * ignore_for_renames list, and clean out files from rename_src
+ * that we don't need to continue considering.
*/
- ignore_count = handle_rename_ignores(options);
+ num_src = rename_src_nr;
+ if (detect_rename != DIFF_DETECT_COPY) {
+ handle_rename_ignores(options);
+ num_src = remove_renames_from_src();
+ }
/*
- * Calculate how many renames are left (but all the source
- * files still remain as options for rename/copies!)
+ * Calculate how many renames are left
*/
num_create = (rename_dst_nr - rename_count);
- num_src = (detect_rename == DIFF_DETECT_COPY ?
- rename_src_nr : rename_src_nr - rename_count);
- num_src -= ignore_count;
/* All done? */
if (!num_create)
@@ -588,7 +603,7 @@ void diffcore_rename(struct diff_options *options)
for (j = 0; j < NUM_CANDIDATE_PER_DST; j++)
m[j].dst = -1;
- for (j = 0; j < rename_src_nr; j++) {
+ for (j = 0; j < num_src; j++) {
struct diff_filespec *one = rename_src[j].p->one;
struct diff_score this_src;
--
2.15.0.323.g31fe956618
Thread overview:
2017-11-20 22:19 [RFC PATCH v2 0/9] Improve merge recursive performance Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 1/9] diffcore-rename: no point trying to find a match better than exact Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 2/9] merge-recursive: avoid unnecessary string list lookups Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 3/9] merge-recursive: new function for better colliding conflict resolutions Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 4/9] Add testcases for improved file collision conflict handling Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 5/9] merge-recursive: fix rename/add " Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 6/9] merge-recursive: improve handling for rename/rename(2to1) conflicts Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 7/9] merge-recursive: improve handling for add/add conflicts Elijah Newren
2017-11-20 22:19 ` [RFC PATCH v2 8/9] merge-recursive: accelerate rename detection Elijah Newren
2017-11-20 22:19 ` Elijah Newren [this message]