git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: gitster@pobox.com
Cc: git@vger.kernel.org, bmwill@google.com, jrnieder@gmail.com, jonathantanmy@google.com, peff@peff.net, mhagger@alum.mit.edu, Stefan Beller <sbeller@google.com>
Subject: [PATCHv4 17/17] diff.c: color moved lines differently
Date: Mon, 22 May 2017 19:40:48 -0700
Message-ID: <20170523024048.16879-18-sbeller@google.com>
In-Reply-To: <20170523024048.16879-1-sbeller@google.com>

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

Adjacent blocks are colored differently. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why the alternative coloring is really needed.

As the reviewer's attention should be drawn to the places where a
difference is introduced into the moved code, we cannot just have one
new color for all moved code.

First I implemented an alternative design, which would show a moved hunk
in one color, and its boundaries in another color. This idea was error
prone as it inspected each line and its neighboring lines to determine
(a) if the line was moved and (b) if it was deep inside a hunk by having
matching neighboring lines. This is unreliable as we can construct
hunks which have equal neighbors that just exceed the number of lines
inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that
is permuted to 'AXYZCXYZBXYZD..').
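
The weakness of a fixed inspection window can be illustrated with a
small sketch. This is not the code of the abandoned design; the helper
`neighbors_match` and its `radius` parameter are hypothetical, and the
trailing '..' of the example strings is dropped for brevity.

```c
#include <string.h>

/*
 * Hypothetical check in the spirit of the abandoned design: does the
 * line at old_s[i] have the same `radius` neighbors on both sides as
 * the line at new_s[j]? Each character stands for one line.
 */
static int neighbors_match(const char *old_s, int i,
			   const char *new_s, int j, int radius)
{
	int old_len = (int)strlen(old_s), new_len = (int)strlen(new_s);
	int d;

	for (d = -radius; d <= radius; d++) {
		int oi = i + d, nj = j + d;
		int old_in = oi >= 0 && oi < old_len;
		int new_in = nj >= 0 && nj < new_len;

		/* One side runs off the end of its text, the other does not. */
		if (old_in != new_in)
			return 0;
		if (old_in && old_s[oi] != new_s[nj])
			return 0;
	}
	return 1;
}
```

With 'AXYZBXYZCXYZD' and 'AXYZCXYZBXYZD', line B sits at index 4 in the
old text and at index 8 in the new one. Inspecting three neighbors on
each side reports a match, so B would look like it is deep inside an
unmodified hunk; only a radius of four reveals the permutation.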

Instead, this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then switches color to the alternative color
for the next hunk. By doing this, any permutation is recognized and
displayed. That implies that there is no dedicated boundary or
inside-hunk color, but instead we'll have just two colors alternating
for hunks.
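
The alternation can be sketched in a self-contained form as follows.
This is a simplified illustration, not the code in this patch: the
struct `sketch_line`, the helpers `is_counterpart` and
`sketch_mark_moved`, and the numeric color ids are assumptions made for
the example, and a linear scan stands in for the hashmap lookup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the patch's struct diff_line. */
struct sketch_line {
	char sign;        /* '+', '-' or ' ' (context) */
	const char *text; /* line content without the sign */
	int color;        /* out: 0 = not moved, 1 = moved, 2 = moved (alternative) */
};

/* Line j can pair with line i if it has the opposite sign and equal text. */
static int is_counterpart(const struct sketch_line *l, int i, int j)
{
	return l[j].sign != ' ' && l[j].sign != l[i].sign &&
	       !strcmp(l[i].text, l[j].text);
}

/*
 * Color moved lines. Whenever no running block can be extended by the
 * current line, toggle between the two moved colors and start over.
 */
static void sketch_mark_moved(struct sketch_line *l, int n)
{
	/* Positions of the last matched line of each candidate block. */
	int *run = malloc(sizeof(*run) * (n > 0 ? n : 1));
	int run_nr = 0, alt = 0, i, j, k;

	assert(run);
	for (i = 0; i < n; i++) {
		int kept = 0, have_match = 0;

		l[i].color = 0;
		if (l[i].sign != '+' && l[i].sign != '-') {
			/* Context line: forget all running blocks. */
			alt = 0;
			run_nr = 0;
			continue;
		}
		for (j = 0; j < n && !have_match; j++)
			have_match = is_counterpart(l, i, j);
		if (!have_match)
			continue;

		/* Advance every running block by one line, or drop it. */
		for (k = 0; k < run_nr; k++) {
			int next = run[k] + 1;

			if (next < n && l[next].sign == l[run[k]].sign &&
			    !strcmp(l[next].text, l[i].text))
				run[kept++] = next;
		}
		run_nr = kept;

		if (!run_nr) {
			/* No block continues: toggle color, collect new candidates. */
			alt = !alt;
			for (j = 0; j < n; j++)
				if (is_counterpart(l, i, j))
					run[run_nr++] = j;
		}
		l[i].color = 1 + alt;
	}
	free(run);
}
```

In this sketch a verbatim move keeps a single color for the whole
block, while swapping two halves of a block makes the color change at
the point where the swap happened.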

It would be a bit more UX friendly if the two corresponding hunks
(of added and deleted lines) for one move would get the same color id.
(Both get "regular moved" or "alternative moved"). This problem is
deferred to a later patch for now.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches, both the
submodule and the word diff output carefully chose to call emit_line
with sign=0. All output with sign=0 is ignored for move detection
purposes in this patch, such that no weird looking output will be
generated for these cases. This leads to another thought: We could pass
on '--color-moved' to submodules such that they color up moved lines for
themselves. If we did so, only line moves within a repository boundary
would be marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

# Conflicts:
#	diff.c
---
 Documentation/config.txt   |  14 ++-
 diff.c                     | 275 +++++++++++++++++++++++++++++++++++++++++++--
 diff.h                     |   9 +-
 t/t4015-diff-whitespace.sh | 267 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 552 insertions(+), 13 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..902d017c3b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,14 +1051,24 @@ This does not affect linkgit:git-format-patch[1] or the
 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
 command line with the `--color[=<when>]` option.
 
+color.moved::
+	A boolean value, whether a diff should color moved lines
+	differently. The moved lines are searched for in the diff only.
+	Duplicated lines from somewhere in the project that are not
+	part of the diff are not colored as moved.
+	Defaults to false.
+
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
 	which part of the patch to use the specified color, and is one
 	of `context` (context text - `plain` is a historical synonym),
 	`meta` (metainformation), `frag`
 	(hunk header), 'func' (function in hunk header), `old` (removed lines),
-	`new` (added lines), `commit` (commit headers), or `whitespace`
-	(highlighting whitespace errors).
+	`new` (added lines), `commit` (commit headers), `whitespace`
+	(highlighting whitespace errors), `oldMoved` (removed lines that
+	reappear), `newMoved` (added lines that were removed elsewhere),
+	`oldMovedAlternative` and `newMovedAlternative` (as a fallback to
+	cover adjacent blocks of moved code)
 
 color.decorate.<slot>::
 	Use customized color for 'git log --decorate' output.  `<slot>` is one
diff --git a/diff.c b/diff.c
index c0b8afa38f..23e70d348e 100644
--- a/diff.c
+++ b/diff.c
@@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_BOLD_RED,	/* OLD_MOVED_A */
+	GIT_COLOR_BG_RED,	/* OLD_MOVED_B */
+	GIT_COLOR_BOLD_GREEN,	/* NEW_MOVED_A */
+	GIT_COLOR_BG_GREEN,	/* NEW_MOVED_B */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -234,6 +247,10 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "color.moved")) {
+		diff_color_moved_default = git_config_bool(var, value);
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -354,6 +371,88 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct diff_line *line;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct diff_line *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
+					 const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
+				   const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return diff_line_cmp(a->line, b->line, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
+}
+
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(line, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(line->line, line->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct diff_line *l = &o->line_buffer[line_no];
+
+	ret->ent.hash = get_line_hash(l, ignore_ws);
+	ret->line = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
 static char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
@@ -516,6 +615,141 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		int sign = 0;
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->line_buffer[n].sign) {
+		case '+':
+			sign = '+';
+			hm = add_lines;
+			break;
+		case '-':
+			sign = '-';
+			hm = del_lines;
+			break;
+		case ' ':
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line &&
+		    prev_line->line->sign == sign)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	int pmb_nr = 0, pmb_alloc = 0;
+	int use_alt_color = 0;
+	int n;
+
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct diff_line *l = &o->line_buffer[n];
+		int i, lp, rp;
+
+		switch (l->sign) {
+		case '+':
+			hm = del_lines;
+			break;
+		case '-':
+			hm = add_lines;
+			break;
+		default:
+			use_alt_color = 0;
+			pmb_nr = 0; /* no running sets */
+			continue;
+		}
+
+		/* Check for any match to color it as a move. */
+		key = prepare_entry(o, n);
+		match = hashmap_get(hm, key, o);
+		free(key);
+		if (!match)
+			continue;
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !diff_line_cmp(pnext->line, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set to the remaining runs */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		if (rp > -1) {
+			/* Remember the number of running sets */
+			pmb_nr = rp + 1;
+		} else {
+			/* Toggle color */
+			use_alt_color = (use_alt_color + 1) % 2;
+
+			/* Build up a new set */
+			pmb_nr = 0;
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+		}
+
+		switch (l->sign) {
+		case '+':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_NEW_MOVED + use_alt_color);
+			break;
+		case '-':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_OLD_MOVED + use_alt_color);
+			break;
+		default:
+			die("BUG: we should have continued earlier?");
+		}
+	}
+	free(pmb);
+}
+
 static void emit_diff_line(struct diff_options *o,
 				     struct diff_line *e)
 {
@@ -3518,6 +3752,8 @@ void diff_setup(struct diff_options *options)
 	options->line_buffer = NULL;
 	options->line_buffer_nr = 0;
 	options->line_buffer_alloc = 0;
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3627,6 +3863,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4051,6 +4290,10 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
+	else if (!strcmp(arg, "--color-moved"))
+		options->color_moved = 1;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = 0;
 	else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
@@ -4856,16 +5099,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_diff_line, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this flag in a later patch.
-	 */
-	o->use_buffer = 1;
+
+	if (o->color_moved)
+		o->use_buffer = 1;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -4874,6 +5110,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->use_buffer) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < o->line_buffer_nr; i++)
 			emit_diff_line(o, &o->line_buffer[i]);
 
@@ -4962,6 +5216,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index fad1258556..445259ebf7 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "hashmap.h"
 
 struct rev_info;
 struct diff_options;
@@ -228,6 +229,8 @@ struct diff_options {
 
 	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
+
+	int color_moved;
 };
 
 /* Emit [line_prefix] [set] line [reset] */
@@ -243,7 +246,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..0e92bf94bf 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,271 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect moved code, inside file' '
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BRED>-if (!u->is_allowed_foo)<RESET>
+	<BRED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BRED>-}<RESET>
+	<BRED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..e34eb69 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BGREEN>+<RESET><BGREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BGREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect permutations inside moved code' '
+	# reusing the move example from last test:
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			foo(u);
+			if (!u->is_allowed_foo)
+				return;
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BOLD;RED>-if (!u->is_allowed_foo)<RESET>
+	<BOLD;RED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BOLD;RED>-}<RESET>
+	<BOLD;RED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..2bedec9 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>}<RESET>
+	<BOLD;GREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.13.0.18.g7d86cc8ba0



