git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Brandon Williams <bmwill@google.com>
To: Stefan Beller <sbeller@google.com>
Cc: git@vger.kernel.org, jacob.keller@gmail.com,
	jonathantanmy@google.com, simon@ruderich.org,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm
Date: Mon, 2 Jul 2018 10:22:24 -0700
Message-ID: <20180702172224.GC246956@google.com> (raw)
In-Reply-To: <20180629001958.85143-8-sbeller@google.com>

On 06/28, Stefan Beller wrote:
> In the original implementation of the move detection logic the choice for
> ignoring white space changes is the same for the move detection as it is
> for the regular diff.  Some cases came up where different treatment would
> have been nice.
> 
> Allow the user to specify that white space should be ignored differently
> during detection of moved lines than during generation of added and removed
> lines. This is done by providing analogs to the --ignore-space-at-eol,
> -b, and -w options by introducing the option --color-moved-ws=<modes>
> with the modes named "ignore-space-at-eol", "ignore-space-change" and
> "ignore-all-space", which is used only during the move detection phase.
> 
> As we change the default, we'll adjust the tests.
> 
> For now we do not infer any options to treat white spaces in the move
> detection from the generic white space options given to diff.
> This can be tuned later to reasonable default.
> 
> As we plan on adding more white space related options in a later patch,
> that interferes with the current white space options, use a flag field
> and clamp it down to  XDF_WHITESPACE_FLAGS, as that (a) allows to easily
> check at parse time if we give invalid combinations and (b) can reuse
> parts of this patch.
> 
> By having the white space treatment in its own option, we'll also
> make it easier for a later patch to have an config option for
> spaces in the move detection.
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/diff-options.txt | 17 +++++++++
>  diff.c                         | 39 +++++++++++++++++++--
>  diff.h                         |  1 +
>  t/t4015-diff-whitespace.sh     | 64 +++++++++++++++++++++++++++++++---
>  4 files changed, 115 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index ba56169de31..80e29e39854 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -292,6 +292,23 @@ dimmed_zebra::
>  	blocks are considered interesting, the rest is uninteresting.
>  --
>  
> +--color-moved-ws=<modes>::
> +	This configures how white spaces are ignored when performing the
> +	move detection for `--color-moved`. These modes can be given
> +	as a comma separated list:
> ++
> +--
> +ignore-space-at-eol::
> +	Ignore changes in whitespace at EOL.
> +ignore-space-change::
> +	Ignore changes in amount of whitespace.  This ignores whitespace
> +	at line end, and considers all other sequences of one or
> +	more whitespace characters to be equivalent.
> +ignore-all-space::
> +	Ignore whitespace when comparing lines. This ignores differences
> +	even if one line has whitespace where the other line has none.
> +--
> +
>  --word-diff[=<mode>]::
>  	Show a word diff, using the <mode> to delimit changed words.
>  	By default, words are delimited by whitespace; see
> diff --git a/diff.c b/diff.c
> index 95c51c0b7df..70eeb40c5fd 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -283,6 +283,36 @@ static int parse_color_moved(const char *arg)
>  		return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'"));
>  }
>  
> +static int parse_color_moved_ws(const char *arg)
> +{
> +	int ret = 0;
> +	struct string_list l = STRING_LIST_INIT_DUP;
> +	struct string_list_item *i;
> +
> +	string_list_split(&l, arg, ',', -1);
> +
> +	for_each_string_list_item(i, &l) {
> +		struct strbuf sb = STRBUF_INIT;
> +		strbuf_addstr(&sb, i->string);
> +		strbuf_trim(&sb);
> +
> +		if (!strcmp(sb.buf, "ignore-space-change"))
> +			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
> +		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
> +			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
> +		else if (!strcmp(sb.buf, "ignore-all-space"))
> +			ret |= XDF_IGNORE_WHITESPACE;
> +		else
> +			error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf);
> +
> +		strbuf_release(&sb);
> +	}
> +
> +	string_list_clear(&l, 0);
> +
> +	return ret;
> +}
> +
>  int git_diff_ui_config(const char *var, const char *value, void *cb)
>  {
>  	if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
> @@ -717,10 +747,12 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
>  	const struct diff_options *diffopt = hashmap_cmp_fn_data;
>  	const struct moved_entry *a = entry;
>  	const struct moved_entry *b = entry_or_key;
> +	unsigned flags = diffopt->color_moved_ws_handling
> +			 & XDF_WHITESPACE_FLAGS;
>  
>  	return !xdiff_compare_lines(a->es->line, a->es->len,
>  				    b->es->line, b->es->len,
> -				    diffopt->xdl_opts);
> +				    flags);
>  }
>  
>  static struct moved_entry *prepare_entry(struct diff_options *o,
> @@ -728,8 +760,9 @@ static struct moved_entry *prepare_entry(struct diff_options *o,
>  {
>  	struct moved_entry *ret = xmalloc(sizeof(*ret));
>  	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
> +	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
>  
> -	ret->ent.hash = xdiff_hash_string(l->line, l->len, o->xdl_opts);
> +	ret->ent.hash = xdiff_hash_string(l->line, l->len, flags);
>  	ret->es = l;
>  	ret->next_line = NULL;
>  
> @@ -4710,6 +4743,8 @@ int diff_opt_parse(struct diff_options *options,
>  		if (cm < 0)
>  			die("bad --color-moved argument: %s", arg);
>  		options->color_moved = cm;
> +	} else if (skip_prefix(arg, "--color-moved-ws=", &arg)) {
> +		options->color_moved_ws_handling = parse_color_moved_ws(arg);
>  	} else if (skip_to_optional_arg_default(arg, "--color-words", &options->word_regex, NULL)) {
>  		options->use_color = 1;
>  		options->word_diff = DIFF_WORDS_COLOR;
> diff --git a/diff.h b/diff.h
> index 7bd4f182c33..de5dc680051 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -214,6 +214,7 @@ struct diff_options {
>  	} color_moved;
>  	#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
>  	#define COLOR_MOVED_MIN_ALNUM_COUNT 20
> +	int color_moved_ws_handling;
>  };
>  
>  void diff_emit_submodule_del(struct diff_options *o, const char *line);
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index e54529f026d..0c737a47cf8 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -1460,7 +1460,8 @@ test_expect_success 'move detection ignoring whitespace ' '
>  	EOF
>  	test_cmp expected actual &&
>  
> -	git diff HEAD --no-renames -w --color-moved --color >actual.raw &&
> +	git diff HEAD --no-renames --color-moved --color \
> +		--color-moved-ws=ignore-all-space >actual.raw &&
>  	grep -v "index" actual.raw | test_decode_color >actual &&
>  	cat <<-\EOF >expected &&
>  	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> @@ -1522,7 +1523,8 @@ test_expect_success 'move detection ignoring whitespace changes' '
>  	EOF
>  	test_cmp expected actual &&
>  
> -	git diff HEAD --no-renames -b --color-moved --color >actual.raw &&
> +	git diff HEAD --no-renames --color-moved --color \
> +		--color-moved-ws=ignore-space-change >actual.raw &&
>  	grep -v "index" actual.raw | test_decode_color >actual &&
>  	cat <<-\EOF >expected &&
>  	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> @@ -1587,7 +1589,8 @@ test_expect_success 'move detection ignoring whitespace at eol' '
>  	EOF
>  	test_cmp expected actual &&
>  
> -	git diff HEAD --no-renames --ignore-space-at-eol --color-moved --color >actual.raw &&
> +	git diff HEAD --no-renames --color-moved --color \
> +		--color-moved-ws=ignore-space-at-eol >actual.raw &&
>  	grep -v "index" actual.raw | test_decode_color >actual &&
>  	cat <<-\EOF >expected &&
>  	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> @@ -1757,7 +1760,60 @@ test_expect_success 'move detection with submodules' '
>  
>  	# nor did we mess with it another way
>  	git diff --submodule=diff --color | test_decode_color >expect &&
> -	test_cmp expect decoded_actual
> +	test_cmp expect decoded_actual &&
> +	rm -rf bananas &&
> +	git submodule deinit bananas

Maybe you want to use a test_when_finished call for this instead of
doing this at the end of the test?  I guess this comes up as a point of
debate: Do you have each test clean up after itself or do you ensure
that subsequent tests makes sure its env is ready before testing.
Anyway this is just a suggestion.

> +'
> +
> +test_expect_success 'only move detection ignores white spaces' '
> +	git reset --hard &&
> +	q_to_tab <<-\EOF >text.txt &&
> +		a long line to exceed per-line minimum
> +		another long line to exceed per-line minimum
> +		original file
> +	EOF
> +	git add text.txt &&
> +	git commit -m "add text" &&
> +	q_to_tab <<-\EOF >text.txt &&
> +		Qa long line to exceed per-line minimum
> +		Qanother long line to exceed per-line minimum
> +		new file
> +	EOF
> +
> +	# Make sure we get a different diff using -w
> +	git diff --color --color-moved -w |
> +		grep -v "index" |
> +		test_decode_color >actual &&
> +	q_to_tab <<-\EOF >expected &&
> +	<BOLD>diff --git a/text.txt b/text.txt<RESET>
> +	<BOLD>--- a/text.txt<RESET>
> +	<BOLD>+++ b/text.txt<RESET>
> +	<CYAN>@@ -1,3 +1,3 @@<RESET>
> +	 Qa long line to exceed per-line minimum<RESET>
> +	 Qanother long line to exceed per-line minimum<RESET>
> +	<RED>-original file<RESET>
> +	<GREEN>+<RESET><GREEN>new file<RESET>
> +	EOF
> +	test_cmp expected actual &&
> +
> +	# And now ignoring white space only in the move detection
> +	git diff --color --color-moved \
> +		--color-moved-ws=ignore-all-space,ignore-space-change,ignore-space-at-eol |
> +		grep -v "index" |
> +		test_decode_color >actual &&
> +	q_to_tab <<-\EOF >expected &&
> +	<BOLD>diff --git a/text.txt b/text.txt<RESET>
> +	<BOLD>--- a/text.txt<RESET>
> +	<BOLD>+++ b/text.txt<RESET>
> +	<CYAN>@@ -1,3 +1,3 @@<RESET>
> +	<BOLD;MAGENTA>-a long line to exceed per-line minimum<RESET>
> +	<BOLD;MAGENTA>-another long line to exceed per-line minimum<RESET>
> +	<RED>-original file<RESET>
> +	<BOLD;YELLOW>+<RESET>Q<BOLD;YELLOW>a long line to exceed per-line minimum<RESET>
> +	<BOLD;YELLOW>+<RESET>Q<BOLD;YELLOW>another long line to exceed per-line minimum<RESET>
> +	<GREEN>+<RESET><GREEN>new file<RESET>
> +	EOF
> +	test_cmp expected actual
>  '
>  
>  test_done
> -- 
> 2.18.0.399.gad0ab374a1-goog
> 

-- 
Brandon Williams

  reply index

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-22  1:57 [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation Stefan Beller
2018-06-22  1:57 ` [PATCH v3 1/8] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-22  1:57 ` [PATCH v3 2/8] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-22  1:57 ` [PATCH v3 3/8] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-22  1:57 ` [PATCH v3 4/8] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-22  1:57 ` [PATCH v3 5/8] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-06-22  1:57 ` [PATCH v3 6/8] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-06-22  1:57 ` [PATCH v3 7/8] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-22  1:57 ` [PATCH v3 8/8] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-06-23 16:52   ` SZEDER Gábor
2018-06-22 22:37 ` [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation Junio C Hamano
2018-06-29  0:19 ` [PATCH v4 0/9] " Stefan Beller
2018-06-29  0:19   ` [PATCH v4 1/9] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-29  0:19   ` [PATCH v4 2/9] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-29  0:19   ` [PATCH v4 3/9] t4015: avoid git as a pipe input Stefan Beller
2018-06-29  0:19   ` [PATCH v4 4/9] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-29  0:19   ` [PATCH v4 5/9] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-29  0:19   ` [PATCH v4 6/9] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-07-02 17:18     ` Brandon Williams
2018-06-29  0:19   ` [PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-07-02 17:22     ` Brandon Williams [this message]
2018-06-29  0:19   ` [PATCH v4 8/9] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-29  0:19   ` [PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-07-02 17:36     ` Brandon Williams
2018-07-02 18:59       ` Stefan Beller

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180702172224.GC246956@google.com \
    --to=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=sbeller@google.com \
    --cc=simon@ruderich.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox