git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Phillip Wood" <phillip.wood@dunelm.org.uk>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Elijah Newren" <newren@gmail.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Phillip Wood" <phillip.wood@dunelm.org.uk>,
	"Phillip Wood" <phillip.wood@dunelm.org.uk>
Subject: [PATCH v4 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize
Date: Tue, 16 Nov 2021 09:49:31 +0000	[thread overview]
Message-ID: <d7bbc0041e076077d3c3299858799cc9907d9831.1637056179.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.981.v4.git.1637056178.gitgitgadget@gmail.com>

From: Phillip Wood <phillip.wood@dunelm.org.uk>

If we already have a block of potentially moved lines then as we move
down the diff we need to check if the next line of each potentially
moved line matches the current line of the diff. The implementation of
--color-moved-ws=allow-indentation-change was needlessly performing
this check on all the lines in the diff that matched the current line
rather than just the current line. To exacerbate the problem finding
all the other lines in the diff that match the current line involves a
fuzzy lookup so we were wasting even more time performing a second
comparison to filter out the non-matching lines. Fixing this reduces
time to run
  git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0
by 93% compared to master and simplifies the code.

Test                                                                  HEAD^              HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.35+0.03)   0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.86 (0.80+0.06)   0.87(0.83+0.04)  +1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  19.01(18.93+0.06)   0.97(0.92+0.04) -94.9%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16 (1.06+0.09)   1.17(1.06+0.10)  +0.9%
4002.5: log --color-moved --no-color-moved-ws                         1.32 (1.25+0.07)   1.32(1.24+0.08)  +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                 1.71 (1.64+0.06)   1.36(1.25+0.10) -20.5%

Test                                                                  master             HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.33+0.05)   0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.80 (0.75+0.04)   0.87(0.83+0.04)  +8.7%
4002.3: diff --color-moved-ws=allow-indentation-change large change  14.20(14.15+0.05)   0.97(0.92+0.04) -93.2%
4002.4: log --no-color-moved --no-color-moved-ws                      1.15 (1.05+0.09)   1.17(1.06+0.10)  +1.7%
4002.5: log --color-moved --no-color-moved-ws                         1.30 (1.19+0.11)   1.32(1.24+0.08)  +1.5%
4002.6: log --color-moved-ws=allow-indentation-change                 1.70 (1.63+0.06)   1.36(1.25+0.10) -20.0%

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff.c | 70 +++++++++++++++++-----------------------------------------
 1 file changed, 20 insertions(+), 50 deletions(-)

diff --git a/diff.c b/diff.c
index 9aff167be27..78a486021ab 100644
--- a/diff.c
+++ b/diff.c
@@ -879,37 +879,21 @@ static int compute_ws_delta(const struct emitted_diff_symbol *a,
 	return 1;
 }
 
-static int cmp_in_block_with_wsd(const struct diff_options *o,
-				 const struct moved_entry *cur,
-				 const struct moved_entry *match,
-				 struct moved_block *pmb,
-				 int n)
-{
-	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
-	int al = cur->es->len, bl = match->es->len, cl = l->len;
+static int cmp_in_block_with_wsd(const struct moved_entry *cur,
+				 const struct emitted_diff_symbol *l,
+				 struct moved_block *pmb)
+{
+	int al = cur->es->len, bl = l->len;
 	const char *a = cur->es->line,
-		   *b = match->es->line,
-		   *c = l->line;
+		   *b = l->line;
 	int a_off = cur->es->indent_off,
 	    a_width = cur->es->indent_width,
-	    c_off = l->indent_off,
-	    c_width = l->indent_width;
+	    b_off = l->indent_off,
+	    b_width = l->indent_width;
 	int delta;
 
-	/*
-	 * We need to check if 'cur' is equal to 'match'.  As those
-	 * are from the same (+/-) side, we do not need to adjust for
-	 * indent changes. However these were found using fuzzy
-	 * matching so we do have to check if they are equal. Here we
-	 * just check the lengths. We delay calling memcmp() to check
-	 * the contents until later as if the length comparison for a
-	 * and c fails we can avoid the call all together.
-	 */
-	if (al != bl)
-		return 1;
-
 	/* If 'l' and 'cur' are both blank then they match. */
-	if (a_width == INDENT_BLANKLINE && c_width == INDENT_BLANKLINE)
+	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE)
 		return 0;
 
 	/*
@@ -918,7 +902,7 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 	 * match those of the current block and that the text of 'l' and 'cur'
 	 * after the indentation match.
 	 */
-	delta = c_width - a_width;
+	delta = b_width - a_width;
 
 	/*
 	 * If the previous lines of this block were all blank then set its
@@ -927,9 +911,8 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 	if (pmb->wsd == INDENT_BLANKLINE)
 		pmb->wsd = delta;
 
-	return !(delta == pmb->wsd && al - a_off == cl - c_off &&
-		 !memcmp(a, b, al) && !
-		 memcmp(a + a_off, c + c_off, al - a_off));
+	return !(delta == pmb->wsd && al - a_off == bl - b_off &&
+		 !memcmp(a + a_off, b + b_off, al - a_off));
 }
 
 static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
@@ -1030,36 +1013,23 @@ static void pmb_advance_or_null(struct diff_options *o,
 }
 
 static void pmb_advance_or_null_multi_match(struct diff_options *o,
-					    struct moved_entry *match,
-					    struct hashmap *hm,
+					    struct emitted_diff_symbol *l,
 					    struct moved_block *pmb,
-					    int pmb_nr, int n)
+					    int pmb_nr)
 {
 	int i;
-	char *got_match = xcalloc(1, pmb_nr);
-
-	hashmap_for_each_entry_from(hm, match, ent) {
-		for (i = 0; i < pmb_nr; i++) {
-			struct moved_entry *prev = pmb[i].match;
-			struct moved_entry *cur = (prev && prev->next_line) ?
-					prev->next_line : NULL;
-			if (!cur)
-				continue;
-			if (!cmp_in_block_with_wsd(o, cur, match, &pmb[i], n))
-				got_match[i] |= 1;
-		}
-	}
 
 	for (i = 0; i < pmb_nr; i++) {
-		if (got_match[i]) {
+		struct moved_entry *prev = pmb[i].match;
+		struct moved_entry *cur = (prev && prev->next_line) ?
+			prev->next_line : NULL;
+		if (cur && !cmp_in_block_with_wsd(cur, l, &pmb[i])) {
 			/* Advance to the next line */
-			pmb[i].match = pmb[i].match->next_line;
+			pmb[i].match = cur;
 		} else {
 			moved_block_clear(&pmb[i]);
 		}
 	}
-
-	free(got_match);
 }
 
 static int shrink_potential_moved_blocks(struct moved_block *pmb,
@@ -1223,7 +1193,7 @@ static void mark_color_as_moved(struct diff_options *o,
 
 		if (o->color_moved_ws_handling &
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-			pmb_advance_or_null_multi_match(o, match, hm, pmb, pmb_nr, n);
+			pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr);
 		else
 			pmb_advance_or_null(o, match, hm, pmb, pmb_nr);
 
-- 
gitgitgadget


  parent reply	other threads:[~2021-11-16  9:50 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-14 13:04 [PATCH 00/10] diff --color-moved[-ws] speedups Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 01/10] diff --color-moved=zerba: fix alternate coloring Phillip Wood via GitGitGadget
2021-06-15  3:24   ` Junio C Hamano
2021-06-15 11:22     ` Phillip Wood
2021-06-14 13:04 ` [PATCH 02/10] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 03/10] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 04/10] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 05/10] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 06/10] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 07/10] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 08/10] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 09/10] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-07-09 15:36   ` Elijah Newren
2021-06-14 13:04 ` [PATCH 10/10] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-06-16 14:24 ` [PATCH 00/10] diff --color-moved[-ws] speedups Ævar Arnfjörð Bjarmason
2021-06-21 10:03   ` Phillip Wood
2021-07-20 10:36 ` [PATCH v2 00/12] " Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 01/12] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 02/12] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 03/12] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 04/12] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 05/12] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 06/12] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 07/12] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 08/12] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 09/12] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 10/12] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 11/12] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 12/12] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-07-20 13:38   ` [PATCH v2 00/12] diff --color-moved[-ws] speedups Phillip Wood
2021-10-27 12:04   ` [PATCH v3 00/15] " Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-10-28 21:32       ` Junio C Hamano
2021-10-29 10:24         ` Phillip Wood
2021-10-29 11:06           ` Ævar Arnfjörð Bjarmason
2021-11-10 11:05             ` Phillip Wood
2021-10-27 12:04     ` [PATCH v3 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-10-28 21:51       ` Junio C Hamano
2021-10-29 10:35         ` Phillip Wood
2021-10-27 12:04     ` [PATCH v3 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 06/15] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-10-27 13:28     ` [PATCH v3 00/15] diff --color-moved[-ws] speedups Phillip Wood
2021-11-16  9:49     ` [PATCH v4 " Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-11-22 13:34         ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 06/15] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-11-22 14:18         ` Johannes Schindelin
2021-11-22 19:00           ` Phillip Wood
2021-11-22 21:54             ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-11-16  9:49       ` Phillip Wood via GitGitGadget [this message]
2021-11-23 14:51         ` [PATCH v4 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-11-23 15:09         ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-12-08 12:30       ` [PATCH v4 00/15] diff --color-moved[-ws] speedups Johannes Schindelin
2021-12-09 10:29       ` [PATCH v5 " Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 06/15] diff --color-moved: avoid false short line matches and bad zebra coloring Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d7bbc0041e076077d3c3299858799cc9907d9831.1637056179.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=phillip.wood123@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).