git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/6] Fixing up sb/diff-color-moved
@ 2017-06-28  0:56 Stefan Beller
  2017-06-28  0:56 ` [PATCH 1/6] diff.c: factor out shrinking of potential moved line blocks Stefan Beller
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

This goes on top of sb/diff-color-moved.

Each patch is written as if it can go in as a normal patch,
but I intend to squash them into the series appropriately if
there is more feedback on that series in general.

Stefan Beller (6):
  diff.c: factor out shrinking of potential moved line blocks
  diff.c: change the default for move coloring to zebra
  diff.c: better reporting on color.moved bogus configuration
  Documentation/diff: reword color moved
  diff.c: omit uninteresting moved lines
  diff.c: detect blocks despite whitespace changes

 Documentation/config.txt       |  6 ++--
 Documentation/diff-options.txt | 10 ++++--
 diff.c                         | 79 ++++++++++++++++++++++++++++--------------
 diff.h                         |  2 ++
 t/t4015-diff-whitespace.sh     | 65 ++++++++++++++++++++++++++++++++++
 5 files changed, 131 insertions(+), 31 deletions(-)

-- 
2.13.0.31.g9b732c453e


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] diff.c: factor out shrinking of potential moved line blocks
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  0:56 ` [PATCH 2/6] diff.c: change the default for move coloring to zebra Stefan Beller
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

This is cleaner and keeps the rather large function
that performs the move detection smaller.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c | 50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/diff.c b/diff.c
index 82ace48c38..5311dcf133 100644
--- a/diff.c
+++ b/diff.c
@@ -808,6 +808,33 @@ static void add_lines_to_move_detection(struct diff_options *o,
 	}
 }
 
+static int shrink_potential_moved_blocks(struct moved_entry **pmb,
+					 int pmb_nr)
+{
+	int lp, rp;
+
+	/* Shrink the set of potential block to the remaining running */
+	for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+		while (lp < pmb_nr && pmb[lp])
+			lp++;
+		/* lp points at the first NULL now */
+
+		while (rp > -1 && !pmb[rp])
+			rp--;
+		/* rp points at the last non-NULL */
+
+		if (lp < pmb_nr && rp > -1 && lp < rp) {
+			pmb[lp] = pmb[rp];
+			pmb[rp] = NULL;
+			rp--;
+			lp++;
+		}
+	}
+
+	/* Remember the number of running sets */
+	return rp + 1;
+}
+
 /* Find blocks of moved code, delegate actual coloring decision to helper */
 static void mark_color_as_moved(struct diff_options *o,
 				struct hashmap *add_lines,
@@ -822,7 +849,7 @@ static void mark_color_as_moved(struct diff_options *o,
 		struct moved_entry *key;
 		struct moved_entry *match = NULL;
 		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
-		int i, lp, rp;
+		int i;
 
 		switch (l->s) {
 		case DIFF_SYMBOL_PLUS:
@@ -864,26 +891,7 @@ static void mark_color_as_moved(struct diff_options *o,
 			}
 		}
 
-		/* Shrink the set of potential block to the remaining running */
-		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
-			while (lp < pmb_nr && pmb[lp])
-				lp++;
-			/* lp points at the first NULL now */
-
-			while (rp > -1 && !pmb[rp])
-				rp--;
-			/* rp points at the last non-NULL */
-
-			if (lp < pmb_nr && rp > -1 && lp < rp) {
-				pmb[lp] = pmb[rp];
-				pmb[rp] = NULL;
-				rp--;
-				lp++;
-			}
-		}
-
-		/* Remember the number of running sets */
-		pmb_nr = rp + 1;
+		pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);
 
 		if (pmb_nr == 0) {
 			/*
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] diff.c: change the default for move coloring to zebra
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
  2017-06-28  0:56 ` [PATCH 1/6] diff.c: factor out shrinking of potential moved line blocks Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  3:14   ` Junio C Hamano
  2017-06-28  0:56 ` [PATCH 3/6] diff.c: better reporting on color.moved bogus configuration Stefan Beller
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

Introduce a new mode COLOR_MOVED_DEFAULT, which is the same as
COLOR_MOVED_ZEBRA. But having two different symbols allows us to
differentiate them in the code.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/diff-options.txt |  3 +++
 diff.c                         | 13 ++++++++++++-
 diff.h                         |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 058c8014ed..d2c6a60af2 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -243,6 +243,9 @@ endif::git-diff[]
 --
 no::
 	Moved lines are not highlighted.
+default::
+	Is a synonym for `zebra`. This may change to more sensible modes
+	in the future.
 plain::
 	Any line that is added in one location and was removed
 	in another location will be colored with 'color.diff.newMoved'.
diff --git a/diff.c b/diff.c
index 5311dcf133..31cdec05ac 100644
--- a/diff.c
+++ b/diff.c
@@ -256,12 +256,23 @@ int git_diff_heuristic_config(const char *var, const char *value, void *cb)
 
 static int parse_color_moved(const char *arg)
 {
+	int v = git_parse_maybe_bool(arg);
+
+	if (v != -1) {
+		if (v == 0)
+			return COLOR_MOVED_NO;
+		else if (v == 1)
+			return COLOR_MOVED_DEFAULT;
+	}
+
 	if (!strcmp(arg, "no"))
 		return COLOR_MOVED_NO;
 	else if (!strcmp(arg, "plain"))
 		return COLOR_MOVED_PLAIN;
 	else if (!strcmp(arg, "zebra"))
 		return COLOR_MOVED_ZEBRA;
+	else if (!strcmp(arg, "default"))
+		return COLOR_MOVED_DEFAULT;
 	else if (!strcmp(arg, "dimmed_zebra"))
 		return COLOR_MOVED_ZEBRA_DIM;
 	else
@@ -4654,7 +4665,7 @@ int diff_opt_parse(struct diff_options *options,
 		if (diff_color_moved_default)
 			options->color_moved = diff_color_moved_default;
 		if (options->color_moved == COLOR_MOVED_NO)
-			options->color_moved = COLOR_MOVED_ZEBRA_DIM;
+			options->color_moved = COLOR_MOVED_DEFAULT;
 	} else if (!strcmp(arg, "--no-color-moved"))
 		options->color_moved = COLOR_MOVED_NO;
 	else if (skip_prefix(arg, "--color-moved=", &arg)) {
diff --git a/diff.h b/diff.h
index 98abd75521..9298d211d7 100644
--- a/diff.h
+++ b/diff.h
@@ -192,6 +192,7 @@ struct diff_options {
 		COLOR_MOVED_NO = 0,
 		COLOR_MOVED_PLAIN = 1,
 		COLOR_MOVED_ZEBRA = 2,
+		COLOR_MOVED_DEFAULT = 2,
 		COLOR_MOVED_ZEBRA_DIM = 3,
 	} color_moved;
 };
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] diff.c: better reporting on color.moved bogus configuration
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
  2017-06-28  0:56 ` [PATCH 1/6] diff.c: factor out shrinking of potential moved line blocks Stefan Beller
  2017-06-28  0:56 ` [PATCH 2/6] diff.c: change the default for move coloring to zebra Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  0:56 ` [PATCH 4/6] Documentation/diff: reword color moved Stefan Beller
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 31cdec05ac..015c854530 100644
--- a/diff.c
+++ b/diff.c
@@ -276,7 +276,7 @@ static int parse_color_moved(const char *arg)
 	else if (!strcmp(arg, "dimmed_zebra"))
 		return COLOR_MOVED_ZEBRA_DIM;
 	else
-		return -1;
+		return error(_("color moved setting must be one of 'default', 'plain', 'zebra', 'dimmed_zebra'"));
 }
 
 int git_diff_ui_config(const char *var, const char *value, void *cb)
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] Documentation/diff: reword color moved
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
                   ` (2 preceding siblings ...)
  2017-06-28  0:56 ` [PATCH 3/6] diff.c: better reporting on color.moved bogus configuration Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  3:25   ` Junio C Hamano
  2017-06-28  0:56 ` [PATCH 5/6] diff.c: omit uninteresting moved lines Stefan Beller
  2017-06-28  0:56 ` [PATCH 6/6] diff.c: detect blocks despite whitespace changes Stefan Beller
  5 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

This is easier for the casual reader.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt       | 6 ++++--
 Documentation/diff-options.txt | 7 ++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 29e0b9fa69..3d89be2d84 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1052,8 +1052,10 @@ This does not affect linkgit:git-format-patch[1] or the
 command line with the `--color[=<when>]` option.
 
 diff.colorMoved::
-	If set moved lines in a diff are colored differently,
-	for details see '--color-moved' in linkgit:git-diff[1].
+	If set to either a valid `<mode>` or a true value, moved lines
+	in a diff are colored differently, for details of valid modes
+	see '--color-moved' in linkgit:git-diff[1]. If simply set to
+	true the default color mode will be used.
 
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index d2c6a60af2..d4dc46ee2f 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -252,9 +252,10 @@ plain::
 	Similarly 'color.diff.oldMoved' will be used for removed lines
 	that are added somewhere else in the diff.
 zebra::
-	Blocks of moved code are detected. The detected blocks are
-	painted using the 'color.diff.{old,new}Moved' alternating with
-	'color.diff.{old,new}MovedAlternative'.
+	Blocks of moved code are detected greedily. The detected blocks are
+	painted using either the 'color.diff.{old,new}Moved' color or
+	'color.diff.{old,new}MovedAlternative'. The change between
+	the two colors indicates that a new block was detected.
 dimmed_zebra::
 	Similar to 'zebra', but additional dimming of uninteresting parts
 	of moved code is performed. The bordering lines of two adjacent
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] diff.c: omit uninteresting moved lines
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
                   ` (3 preceding siblings ...)
  2017-06-28  0:56 ` [PATCH 4/6] Documentation/diff: reword color moved Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  3:31   ` Junio C Hamano
  2017-06-28  0:56 ` [PATCH 6/6] diff.c: detect blocks despite whitespace changes Stefan Beller
  5 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.

While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.

One of the first solutions considered, started off by these hypothesis':
  (a) The more blocks of the same code we have, the less interesting it is.
  (b) The shorter a block of moved code is the less need of markup there
      is for review.

      Introduce a heuristic which drops any potential moved blocks if their
      length is shorter than the number of potential moved blocks.

      This heuristic was chosen as it is agnostic of the content (in other
      languages or contents to manage, we may have longer lines, e.g. in
      shell the closing of a condition is already 2 characters. Thinking
      about Latex documents tracked in Git, there can also be some
      boilerplate code with lots of characters) while taking both
      hypothesis' into account. An alternative considered was the number
      of non-whitespace characters in a line for example.

Thinking further about this, a linear relation between number of moved
blocks and number of lines of code seems like a bad idea to start with.
So let's start with a simpler solution of hardcoding the number of lines
to 3.

Note, that the length is applied across all blocks to find the 'lonely'
blocks that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c | 11 ++++++++++-
 diff.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 015c854530..1d93e98e3a 100644
--- a/diff.c
+++ b/diff.c
@@ -853,7 +853,8 @@ static void mark_color_as_moved(struct diff_options *o,
 {
 	struct moved_entry **pmb = NULL; /* potentially moved blocks */
 	int pmb_nr = 0, pmb_alloc = 0;
-	int n, flipped_block = 1;
+	int n, flipped_block = 1, block_length = 0;
+
 
 	for (n = 0; n < o->emitted_symbols->nr; n++) {
 		struct hashmap *hm = NULL;
@@ -880,11 +881,19 @@ static void mark_color_as_moved(struct diff_options *o,
 		}
 
 		if (!match) {
+			if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH) {
+				for (i = 0; i < block_length + 1; i++) {
+					l = &o->emitted_symbols->buf[n - i];
+					l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
+				}
+			}
 			pmb_nr = 0;
+			block_length = 0;
 			continue;
 		}
 
 		l->flags |= DIFF_SYMBOL_MOVED_LINE;
+		block_length++;
 
 		if (o->color_moved == COLOR_MOVED_PLAIN)
 			continue;
diff --git a/diff.h b/diff.h
index 9298d211d7..cc1224a93b 100644
--- a/diff.h
+++ b/diff.h
@@ -195,6 +195,7 @@ struct diff_options {
 		COLOR_MOVED_DEFAULT = 2,
 		COLOR_MOVED_ZEBRA_DIM = 3,
 	} color_moved;
+	#define COLOR_MOVED_MIN_BLOCK_LENGTH 3
 };
 
 void diff_emit_submodule_del(struct diff_options *o, const char *line);
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
                   ` (4 preceding siblings ...)
  2017-06-28  0:56 ` [PATCH 5/6] diff.c: omit uninteresting moved lines Stefan Beller
@ 2017-06-28  0:56 ` Stefan Beller
  2017-06-28  3:41   ` Junio C Hamano
  5 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-28  0:56 UTC (permalink / raw)
  To: gitster; +Cc: git, avarab, peff, Stefan Beller

Reuse the compare function from the hash map instead of calling the
compare function directly. Then we pick the correct compare function
when told to compare ignoring white space.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c                     |  3 +--
 t/t4015-diff-whitespace.sh | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 1d93e98e3a..4bcf938e3a 100644
--- a/diff.c
+++ b/diff.c
@@ -903,8 +903,7 @@ static void mark_color_as_moved(struct diff_options *o,
 			struct moved_entry *p = pmb[i];
 			struct moved_entry *pnext = (p && p->next_line) ?
 					p->next_line : NULL;
-			if (pnext &&
-			    !emitted_symbol_cmp(pnext->es, l, o)) {
+			if (pnext && !hm->cmpfn(pnext, match, NULL)) {
 				pmb[i] = p->next_line;
 			} else {
 				pmb[i] = NULL;
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index ae8c686f3c..c3b697411a 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1317,6 +1317,71 @@ test_expect_success 'no effect from --color-moved with --word-diff' '
 	test_cmp expect actual
 '
 
+test_expect_success 'move detection ignoring whitespace ' '
+	git reset --hard &&
+	cat <<\EOF >lines.txt &&
+line 1
+line 2
+line 3
+line 4
+line 5
+line 6
+line 7
+EOF
+	git add lines.txt &&
+	git commit -m "add poetry" &&
+	cat <<\EOF >lines.txt &&
+	line 5
+	line 6
+	line 7
+line 1
+line 2
+line 3
+line 4
+EOF
+	test_config color.diff.oldMoved "magenta" &&
+	test_config color.diff.newMoved "cyan" &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 734156d..eb89ead 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,7 +1,7 @@<RESET>
+	<GREEN>+<RESET>	<GREEN>line 5<RESET>
+	<GREEN>+<RESET>	<GREEN>line 6<RESET>
+	<GREEN>+<RESET>	<GREEN>line 7<RESET>
+	 line 1<RESET>
+	 line 2<RESET>
+	 line 3<RESET>
+	 line 4<RESET>
+	<RED>-line 5<RESET>
+	<RED>-line 6<RESET>
+	<RED>-line 7<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	git diff HEAD --no-renames -w --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 734156d..eb89ead 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,7 +1,7 @@<RESET>
+	<CYAN>+<RESET>	<CYAN>line 5<RESET>
+	<CYAN>+<RESET>	<CYAN>line 6<RESET>
+	<CYAN>+<RESET>	<CYAN>line 7<RESET>
+	 line 1<RESET>
+	 line 2<RESET>
+	 line 3<RESET>
+	 line 4<RESET>
+	<MAGENTA>-line 5<RESET>
+	<MAGENTA>-line 6<RESET>
+	<MAGENTA>-line 7<RESET>
+	EOF
+	test_cmp expected actual
+'
+
 test_expect_success 'move detection with submodules' '
 	test_create_repo bananas &&
 	echo ripe >bananas/recipe &&
-- 
2.13.0.31.g9b732c453e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] diff.c: change the default for move coloring to zebra
  2017-06-28  0:56 ` [PATCH 2/6] diff.c: change the default for move coloring to zebra Stefan Beller
@ 2017-06-28  3:14   ` Junio C Hamano
  2017-06-28 19:54     ` Stefan Beller
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28  3:14 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, peff

Stefan Beller <sbeller@google.com> writes:

> Introduce a new mode COLOR_MOVED_DEFAULT, which is the same as
> COLOR_MOVED_ZEBRA. But having two different symbols allows us to
> differentiate them in the code.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  Documentation/diff-options.txt |  3 +++
>  diff.c                         | 13 ++++++++++++-
>  diff.h                         |  1 +
>  3 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 058c8014ed..d2c6a60af2 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -243,6 +243,9 @@ endif::git-diff[]
>  --
>  no::
>  	Moved lines are not highlighted.
> +default::
> +	Is a synonym for `zebra`. This may change to more sensible modes
> +	in the future.

"to a more sensible mode"?

This is part of the choice for

	--color-moved[=<mode>]::

and the current text does not exactly say what happens when =<mode>
is omitted.  

I am guessing that the intent is to behave as if "=default" is given
when it happens; this new entry would be a good place to mention it.

	default::
		Uses a sensible default mode (currently `zebra`).
		Giving the `--color-moved` option without an
		explicit `=<mode>` also behaves like this.

or something like that, perhaps.

The "diff.colorMoved" configuration is now a bool-or-string; does it
need to be documented as such in Documentation/config.txt?

	diff.colorMoved::
		When set to `false`, moved lines are not treated any
		differently.  When set to any one of the valid
		`<mode>` for `--color-moved=<mode>` option for `git
		diff` familly of commands, they behave as if
		`--color-moved=<mode>` option was given from the
		command line.  Setting it to `true` has the same
		effect as setting it to `default`.

As the configuration can express everything without the optional
boolness, it may not be worth describing it.  I dunno.

> diff --git a/diff.c b/diff.c
> index 5311dcf133..31cdec05ac 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -256,12 +256,23 @@ int git_diff_heuristic_config(const char *var, const char *value, void *cb)
>  
>  static int parse_color_moved(const char *arg)
>  {
> +	int v = git_parse_maybe_bool(arg);
> +
> +	if (v != -1) {
> +		if (v == 0)
> +			return COLOR_MOVED_NO;
> +		else if (v == 1)
> +			return COLOR_MOVED_DEFAULT;
> +	}
> +

This is not wrong per se, but

	switch (git_parse_maybe_bool(arg)) {
	case 0:
		return COLOR_MOVED_NO;
	case 1:
		return COLOR_MOVED_DEFAULT;
	default:
		break;
	}

without an extra variable "v" may be easier to follow.

> @@ -4654,7 +4665,7 @@ int diff_opt_parse(struct diff_options *options,
>  		if (diff_color_moved_default)
>  			options->color_moved = diff_color_moved_default;
>  		if (options->color_moved == COLOR_MOVED_NO)
> -			options->color_moved = COLOR_MOVED_ZEBRA_DIM;
> +			options->color_moved = COLOR_MOVED_DEFAULT;

This part made me look at the hunk with wider context.  

This code is reacting to "--color-moved" (no arguments) and
diff_color_moved_default presumably comes from the configuration.
When the configuration says diff.colorMoved is 'false' by default,
the "--color-moved" option from the command line needs to trump it,
but we do not have any mode given (other than the configuration
saying "no, no, no, we do not want color-moved at all!"), so we
choose the default setting.  Which is correct but was a bit tricky
to reason about.

> diff --git a/diff.h b/diff.h
> index 98abd75521..9298d211d7 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -192,6 +192,7 @@ struct diff_options {
>  		COLOR_MOVED_NO = 0,
>  		COLOR_MOVED_PLAIN = 1,
>  		COLOR_MOVED_ZEBRA = 2,
> +		COLOR_MOVED_DEFAULT = 2,
>  		COLOR_MOVED_ZEBRA_DIM = 3,
>  	} color_moved;
>  };

Hmph.  I would have expected that COLOR_MOVED_DEFAULT would not be
part of the enum, but a 

    #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA

I do not have a strong preference either way; it was just a bit
unexpected.

Thanks.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] Documentation/diff: reword color moved
  2017-06-28  0:56 ` [PATCH 4/6] Documentation/diff: reword color moved Stefan Beller
@ 2017-06-28  3:25   ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28  3:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, peff

Stefan Beller <sbeller@google.com> writes:

> This is easier for the casual reader.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  Documentation/config.txt       | 6 ++++--
>  Documentation/diff-options.txt | 7 ++++---
>  2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 29e0b9fa69..3d89be2d84 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1052,8 +1052,10 @@ This does not affect linkgit:git-format-patch[1] or the
>  command line with the `--color[=<when>]` option.
>  
>  diff.colorMoved::
> -	If set moved lines in a diff are colored differently,
> -	for details see '--color-moved' in linkgit:git-diff[1].
> +	If set to either a valid `<mode>` or a true value, moved lines
> +	in a diff are colored differently, for details of valid modes
> +	see '--color-moved' in linkgit:git-diff[1]. If simply set to
> +	true the default color mode will be used.

Ah, I was ahead of myself on a review on an earlier patch.  The
above is OK but it may want to say what happens when set to false.

>  
>  color.diff.<slot>::
>  	Use customized color for diff colorization.  `<slot>` specifies
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index d2c6a60af2..d4dc46ee2f 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -252,9 +252,10 @@ plain::
>  	Similarly 'color.diff.oldMoved' will be used for removed lines
>  	that are added somewhere else in the diff.
>  zebra::
> -	Blocks of moved code are detected. The detected blocks are
> -	painted using the 'color.diff.{old,new}Moved' alternating with
> -	'color.diff.{old,new}MovedAlternative'.
> +	Blocks of moved code are detected greedily. The detected blocks are
> +	painted using either the 'color.diff.{old,new}Moved' color or
> +	'color.diff.{old,new}MovedAlternative'. The change between
> +	the two colors indicates that a new block was detected.
>  dimmed_zebra::
>  	Similar to 'zebra', but additional dimming of uninteresting parts
>  	of moved code is performed. The bordering lines of two adjacent

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] diff.c: omit uninteresting moved lines
  2017-06-28  0:56 ` [PATCH 5/6] diff.c: omit uninteresting moved lines Stefan Beller
@ 2017-06-28  3:31   ` Junio C Hamano
  2017-06-28 20:00     ` Stefan Beller
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28  3:31 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, peff

Stefan Beller <sbeller@google.com> writes:

> It is useful to have moved lines colored, but there are annoying corner
> cases, such as a single line moved, that is very common. For example
> in a typical patch of C code, we have closing braces that end statement
> blocks or functions.
>
> While it is technically true that these lines are moved as they show up
> elsewhere, it is harmful for the review as the reviewers attention is
> drawn to such a minor side annoyance.
>
> One of the first solutions considered, started off by these hypothesis':

Hypotheses is the plural form of that word, I think.

>   (a) The more blocks of the same code we have, the less interesting it is.
>   (b) The shorter a block of moved code is the less need of markup there
>       is for review.
>
>       Introduce a heuristic which drops any potential moved blocks if their
>       length is shorter than the number of potential moved blocks.
>
>       This heuristic was chosen as it is agnostic of the content (in other
>       languages or contents to manage, we may have longer lines, e.g. in
>       shell the closing of a condition is already 2 characters. Thinking
>       about Latex documents tracked in Git, there can also be some
>       boilerplate code with lots of characters) while taking both
>       hypothesis' into account. An alternative considered was the number
>       of non-whitespace characters in a line for example.

It was puzzling what the above two paragraphs were.  I took (a) and
(b) were the hypotheses, and the two above, and also the next
paragraphs, were the design that fell out of them.  But that is not
what is happening.  You changed your mind and settled on the design
in the next paragraph.

Perhaps we can do without all of the "I thought about this but it
didn't make sense" that is longer than the solution in the patch?

> Thinking further about this, a linear relation between number of moved
> blocks and number of lines of code seems like a bad idea to start with.
> So let's start with a simpler solution of hardcoding the number of lines
> to 3.
>
> Note, that the length is applied across all blocks to find the 'lonely'
> blocks that pollute new code, but do not interfere with a permutated
> block where each permutation has less lines than 3.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  diff.c | 11 ++++++++++-
>  diff.h |  1 +
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/diff.c b/diff.c
> index 015c854530..1d93e98e3a 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -853,7 +853,8 @@ static void mark_color_as_moved(struct diff_options *o,
>  {
>  	struct moved_entry **pmb = NULL; /* potentially moved blocks */
>  	int pmb_nr = 0, pmb_alloc = 0;
> -	int n, flipped_block = 1;
> +	int n, flipped_block = 1, block_length = 0;
> +
>  
>  	for (n = 0; n < o->emitted_symbols->nr; n++) {
>  		struct hashmap *hm = NULL;
> @@ -880,11 +881,19 @@ static void mark_color_as_moved(struct diff_options *o,
>  		}
>  
>  		if (!match) {
> +			if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH) {
> +				for (i = 0; i < block_length + 1; i++) {
> +					l = &o->emitted_symbols->buf[n - i];
> +					l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
> +				}
> +			}
>  			pmb_nr = 0;
> +			block_length = 0;
>  			continue;
>  		}
>  
>  		l->flags |= DIFF_SYMBOL_MOVED_LINE;
> +		block_length++;
>  
>  		if (o->color_moved == COLOR_MOVED_PLAIN)
>  			continue;
> diff --git a/diff.h b/diff.h
> index 9298d211d7..cc1224a93b 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -195,6 +195,7 @@ struct diff_options {
>  		COLOR_MOVED_DEFAULT = 2,
>  		COLOR_MOVED_ZEBRA_DIM = 3,
>  	} color_moved;
> +	#define COLOR_MOVED_MIN_BLOCK_LENGTH 3
>  };
>  
>  void diff_emit_submodule_del(struct diff_options *o, const char *line);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28  0:56 ` [PATCH 6/6] diff.c: detect blocks despite whitespace changes Stefan Beller
@ 2017-06-28  3:41   ` Junio C Hamano
  2017-06-28  5:06     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28  3:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, peff

Stefan Beller <sbeller@google.com> writes:

> Reuse the compare function from the hash map instead of calling the
> compare function directly. Then we pick the correct compare function
> when told to compare ignoring white space.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  diff.c                     |  3 +--
>  t/t4015-diff-whitespace.sh | 65 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 66 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 1d93e98e3a..4bcf938e3a 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -903,8 +903,7 @@ static void mark_color_as_moved(struct diff_options *o,
>  			struct moved_entry *p = pmb[i];
>  			struct moved_entry *pnext = (p && p->next_line) ?
>  					p->next_line : NULL;
> -			if (pnext &&
> -			    !emitted_symbol_cmp(pnext->es, l, o)) {
> +			if (pnext && !hm->cmpfn(pnext, match, NULL)) {
>  				pmb[i] = p->next_line;
>  			} else {
>  				pmb[i] = NULL;

I presume that this makes the use of emitted_symbol_cmp() vs
emitted_symbol_cmp_no_ws() consistent with the remainder of the
file.

Looking at the implementation of get_ws_cleaned_string() that is the
workhorse of emitted_symbol_cmp_no_ws(), it seems to be doing wrong
things for various "ignore whitespace" options (i.e. there is only
one implementation, while "git diff" family takes things like
"ignore space change", "ignore all whitespace", etc.), though.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28  3:41   ` Junio C Hamano
@ 2017-06-28  5:06     ` Junio C Hamano
  2017-06-28 20:36       ` Stefan Beller
  2017-06-29 21:01       ` Stefan Beller
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28  5:06 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, peff

Junio C Hamano <gitster@pobox.com> writes:

> Looking at the implementation of get_ws_cleaned_string() that is the
> workhorse of emitted_symbol_cmp_no_ws(), it seems to be doing wrong
> things for various "ignore whitespace" options (i.e. there is only
> one implementation, while "git diff" family takes things like
> "ignore space change", "ignore all whitespace", etc.), though.

This probably deserves a bit more illustration of how I envision the
code should evolve.

In the longer term, I would prefer to see emitted_symbol_cmp_no_ws()
to go and instead emitted_symbol_cmp() to take the diff options so
that it can change the behaviour of the comparison function based on
the -w/-b/--ignore-space-at-eol/etc. settings.  And compare two strings
in place.

For that, you may need a helper function that takes a pointer to a
character pointer, picks the next byte that matters while advancing
the pointer, and returns that byte.  The emitted_symbol_cmp(a, b)
which is not used for real comparison (i.e. ordering to see if a
sorts earlier than b) but for equivalence (i.e. considering various
whitespace-ignoring settings, does a and b matfch?) may become
something like:

        int
        emitted_symbol_eqv(struct emitted_diff_symbol *a,
                           struct emitted_diff_symbol *b,
                           const void *keydata) {
                struct diff_options *diffopt = keydata;
                const char *ap = a->line;
                const char *bp = b->line;

                while (1) {
                        int ca, cb;
                        ca = next_byte(&ap, diffopt);
                        cb = next_byte(&bp, diffopt);
                        if (ca != cb)
                                return 1; /* differs */
                        if (!ca)
                                return 0;
                };
        }                           

where next_byte() may look like:

        static int
        next_byte(const char **cp, struct diff_options *diffopt)
        {                  
                int retval;

        again:
                retval = **cp;
                if (!retval)
                        return retval; /* end of string */
                if (!isspace(retval)) {
                        (*cp)++; /* advance */
                        return retval;
                }

                switch (ignore_space_type(diffopt)) {
                case NOT_IGNORING:
                        (*cp)++; /* advance */
                        return retval;
                case IGNORE_SPACE_CHANGE:
                        while (**cp && isspace(**cp))
                                (*cp)++; /* squash consecutive spaces */
                        return ' '; /* normalize spaces with a SP */
                case IGNORE_ALL_SPACE:
                        (*cp)++; /* advance */
                        goto again;
                ... other cases here ...
                }
        }



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] diff.c: change the default for move coloring to zebra
  2017-06-28  3:14   ` Junio C Hamano
@ 2017-06-28 19:54     ` Stefan Beller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2017-06-28 19:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

On Tue, Jun 27, 2017 at 8:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> diff --git a/diff.h b/diff.h
>> index 98abd75521..9298d211d7 100644
>> --- a/diff.h
>> +++ b/diff.h
>> @@ -192,6 +192,7 @@ struct diff_options {
>>               COLOR_MOVED_NO = 0,
>>               COLOR_MOVED_PLAIN = 1,
>>               COLOR_MOVED_ZEBRA = 2,
>> +             COLOR_MOVED_DEFAULT = 2,
>>               COLOR_MOVED_ZEBRA_DIM = 3,
>>       } color_moved;
>>  };
>
> Hmph.  I would have expected that COLOR_MOVED_DEFAULT would not be
> part of the enum, but a
>
>     #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
>
> I do not have a strong preference either way; it was just a bit
> unexpected.

Yes it is not as one would expect.
Originally I thought it could buy us some more protection via
smart static analysis (to be invented yet), but now I am not so
sure any more. Changed it to the #define.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] diff.c: omit uninteresting moved lines
  2017-06-28  3:31   ` Junio C Hamano
@ 2017-06-28 20:00     ` Stefan Beller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2017-06-28 20:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

On Tue, Jun 27, 2017 at 8:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> It is useful to have moved lines colored, but there are annoying corner
>> cases, such as a single line moved, that is very common. For example
>> in a typical patch of C code, we have closing braces that end statement
>> blocks or functions.
>>
>> While it is technically true that these lines are moved as they show up
>> elsewhere, it is harmful for the review as the reviewers attention is
>> drawn to such a minor side annoyance.
>>
>> One of the first solutions considered, started off by these hypothesis':
>
> Hypotheses is the plural form of that word, I think.
>
>>   (a) The more blocks of the same code we have, the less interesting it is.
>>   (b) The shorter a block of moved code is the less need of markup there
>>       is for review.
>>
>>       Introduce a heuristic which drops any potential moved blocks if their
>>       length is shorter than the number of potential moved blocks.
>>
>>       This heuristic was chosen as it is agnostic of the content (in other
>>       languages or contents to manage, we may have longer lines, e.g. in
>>       shell the closing of a condition is already 2 characters. Thinking
>>       about Latex documents tracked in Git, there can also be some
>>       boilerplate code with lots of characters) while taking both
>>       hypothesis' into account. An alternative considered was the number
>>       of non-whitespace characters in a line for example.
>
> It was puzzling what the above two paragraphs were.  I took (a) and
> (b) were the hypotheses, and the two above, and also the next
> paragraphs, were the design that fell out of them.  But that is not
> what is happening.  You changed your mind and settled on the design
> in the next paragraph.

Yes, I somehow want to say:

  "What is implemented in this patch is stupid. And I know it, but I
   know no smarter idea. This is what I thought was smarter, maybe
   someone in the future can be inspired by this, at least."

> Perhaps we can do without all of the "I thought about this but it
> didn't make sense" that is longer than the solution in the patch?

As I do changes based on your responses, I want to squash
these patches sent out last night into the original patch, so I'll butcher
the commit message to be way smaller

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28  5:06     ` Junio C Hamano
@ 2017-06-28 20:36       ` Stefan Beller
  2017-06-28 21:16         ` Junio C Hamano
  2017-06-29 21:01       ` Stefan Beller
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-28 20:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

On Tue, Jun 27, 2017 at 10:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Looking at the implementation of get_ws_cleaned_string() that is the
>> workhorse of emitted_symbol_cmp_no_ws(), it seems to be doing wrong
>> things for various "ignore whitespace" options (i.e. there is only
>> one implementation, while "git diff" family takes things like
>> "ignore space change", "ignore all whitespace", etc.), though.
>
> This probably deserves a bit more illustration of how I envision the
> code should evolve.
>
> In the longer term, I would prefer to see emitted_symbol_cmp_no_ws()
> to go and instead emitted_symbol_cmp() to take the diff options
> so
> that it can change the behaviour of the comparison function based on
> the -w/-b/--ignore-space-at-eol/etc. settings.  And compare two strings
> in place.

ok, in-place is no problem. But passing down the diff options into the
compare function is a bit hard.

Originally I wanted to do that, see prep work in [1], but Jeff explained that
the additional pointer in the compare function is **not** supposed to be
a additional payload (such as the diff options specifying the white space
options.)

[1] https://public-inbox.org/git/20170512200244.25245-1-sbeller@google.com/

However as we no settled on the struct emitted_diff_symbol,
that has a 'flags' field in there, which ought to contain everything we
know about whitespace settings, we should be able to do that from there.

>         emitted_symbol_eqv(struct emitted_diff_symbol *a,
>                            struct emitted_diff_symbol *b,
>                            const void *keydata) {
>                 struct diff_options *diffopt = keydata;

The prep work mentioned, would allow for this, as keydata
would be passed through as-is in all calls of the hashmap API,
such that the user can decide if they use it as the actual 'keydata'
or rather as an additional payload.

Thanks for outlining the idea in code, but maybe we need to
reconsider the hashmap API before that.

By not considering the change in the hashmap API, the current
implementation tried to get away by having different compare functions.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28 20:36       ` Stefan Beller
@ 2017-06-28 21:16         ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-06-28 21:16 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

Stefan Beller <sbeller@google.com> writes:

> Originally I wanted to do that, see prep work in [1], but Jeff explained that
> the additional pointer in the compare function is **not** supposed to be
> a additional payload (such as the diff options specifying the white space
> options.)
>
> [1] https://public-inbox.org/git/20170512200244.25245-1-sbeller@google.com/

Ah, yes, keydata is a wrong thing to use to give additional hint to
the eqv function.  We do need to fix the function signature of
hashmap_cmp_fn.  It is common for us to want to use a single
implementation of an eqv function that can be configured to behave
slightly differently but the customization will stay the same
throughout the lifetime of a hashmap.  IOW, hashmap_init() needs to
be able to take a pointer to such a configuration data
(e.g. diff_options), and then the comparison made between two
elements that hash to the same bucket needs to be given not just the
two elements but also that configuration data to tweak the function.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-28  5:06     ` Junio C Hamano
  2017-06-28 20:36       ` Stefan Beller
@ 2017-06-29 21:01       ` Stefan Beller
  2017-06-29 23:51         ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2017-06-29 21:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

On Tue, Jun 27, 2017 at 10:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Looking at the implementation of get_ws_cleaned_string() that is the
>> workhorse of emitted_symbol_cmp_no_ws(), it seems to be doing wrong
>> things for various "ignore whitespace" options (i.e. there is only
>> one implementation, while "git diff" family takes things like
>> "ignore space change", "ignore all whitespace", etc.), though.
>
> This probably deserves a bit more illustration of how I envision the
> code should evolve.
>
> In the longer term, I would prefer to see emitted_symbol_cmp_no_ws()
> to go and instead emitted_symbol_cmp() to take the diff options so
> that it can change the behaviour of the comparison function based on
> the -w/-b/--ignore-space-at-eol/etc. settings.  And compare two strings
> in place.
>
> For that, you may need a helper function that takes a pointer to a
> character pointer, picks the next byte that matters while advancing
> the pointer, and returns that byte.  The emitted_symbol_cmp(a, b)
> which is not used for real comparison (i.e. ordering to see if a
> sorts earlier than b) but for equivalence (i.e. considering various
> whitespace-ignoring settings, does a and b matfch?) may become
> something like:
>
>         int
>         emitted_symbol_eqv(struct emitted_diff_symbol *a,
>                            struct emitted_diff_symbol *b,
>                            const void *keydata) {
>                 struct diff_options *diffopt = keydata;
>                 const char *ap = a->line;
>                 const char *bp = b->line;
>
>                 while (1) {
>                         int ca, cb;
>                         ca = next_byte(&ap, diffopt);
>                         cb = next_byte(&bp, diffopt);
>                         if (ca != cb)
>                                 return 1; /* differs */
>                         if (!ca)
>                                 return 0;
>                 };
>         }
>
> where next_byte() may look like:
>
>         static int
>         next_byte(const char **cp, struct diff_options *diffopt)
>         {
>                 int retval;
>
>         again:
>                 retval = **cp;
>                 if (!retval)
>                         return retval; /* end of string */
>                 if (!isspace(retval)) {
>                         (*cp)++; /* advance */
>                         return retval;
>                 }
>
>                 switch (ignore_space_type(diffopt)) {
>                 case NOT_IGNORING:
>                         (*cp)++; /* advance */
>                         return retval;
>                 case IGNORE_SPACE_CHANGE:
>                         while (**cp && isspace(**cp))
>                                 (*cp)++; /* squash consecutive spaces */
>                         return ' '; /* normalize spaces with a SP */
>                 case IGNORE_ALL_SPACE:
>                         (*cp)++; /* advance */
>                         goto again;
>                 ... other cases here ...
>                 }
>         }
>
>

I just rebased the diff series on top of the hashmap series and now
I want to implement the compare function based on the new data
feature. So I thought I might start off this proposal, as you seem to
have good taste for how to approach problems.

It turns out that the code here is rather a very loose proposal,
not to be copied literally, because of these constraints:
* When dealing with user content, we do not have C-strings,
  but memory + length, such that next_byte also needs to be
  aware of the end pointer.
* The ignore_space_type() as well as these constants do not exist
  as-is, I think the cleanest for now is to parse diffopt->xdl_opts via
  DIFF_XDL_TST

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] diff.c: detect blocks despite whitespace changes
  2017-06-29 21:01       ` Stefan Beller
@ 2017-06-29 23:51         ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-06-29 23:51 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Ævar Arnfjörð Bjarmason,
	Jeff King

On Thu, Jun 29, 2017 at 2:01 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jun 27, 2017 at 10:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Junio C Hamano <gitster@pobox.com> writes:
>> This probably deserves a bit more illustration of how I envision the
>> code should evolve.
>> ...
> It turns out that the code here is rather a very loose proposal,
> not to be copied literally,

Yeah, whenever I say "illustration", do not take it as anything more than
scribbling on the back of an envelope, whose primary purpose is to show
some idea (in this case, the central idea is that a helper function that lets
you ask "what's the next byte that matters? tell me also when you run out"
is all you need to do an in-place comparison with various "ignore" modes;
by the way, that applies also to ignore-case comparison if there is a need
to support it).

The implementation detail to support that central idea is left
to the readers. ;-)

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-06-29 23:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-28  0:56 [PATCH 0/6] Fixing up sb/diff-color-moved Stefan Beller
2017-06-28  0:56 ` [PATCH 1/6] diff.c: factor out shrinking of potential moved line blocks Stefan Beller
2017-06-28  0:56 ` [PATCH 2/6] diff.c: change the default for move coloring to zebra Stefan Beller
2017-06-28  3:14   ` Junio C Hamano
2017-06-28 19:54     ` Stefan Beller
2017-06-28  0:56 ` [PATCH 3/6] diff.c: better reporting on color.moved bogus configuration Stefan Beller
2017-06-28  0:56 ` [PATCH 4/6] Documentation/diff: reword color moved Stefan Beller
2017-06-28  3:25   ` Junio C Hamano
2017-06-28  0:56 ` [PATCH 5/6] diff.c: omit uninteresting moved lines Stefan Beller
2017-06-28  3:31   ` Junio C Hamano
2017-06-28 20:00     ` Stefan Beller
2017-06-28  0:56 ` [PATCH 6/6] diff.c: detect blocks despite whitespace changes Stefan Beller
2017-06-28  3:41   ` Junio C Hamano
2017-06-28  5:06     ` Junio C Hamano
2017-06-28 20:36       ` Stefan Beller
2017-06-28 21:16         ` Junio C Hamano
2017-06-29 21:01       ` Stefan Beller
2017-06-29 23:51         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).