git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: sbeller@google.com
Cc: git@vger.kernel.org, jacob.keller@gmail.com,
	jonathantanmy@google.com, simon@ruderich.org
Subject: [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation
Date: Thu, 21 Jun 2018 18:57:17 -0700	[thread overview]
Message-ID: <20180622015725.219575-1-sbeller@google.com> (raw)

This replaces sb/diff-color-move-more and is also available at
https://github.com/stefanbeller/git/tree/color-moved-only
It applies on v2.18.0.

Move detection is a nice feature, but doesn't work well with indentation
or dedentation. Make it possible to indent/dedent code and still have
it recognized as moved code in the diff.

The fun is in the last patch, which allows white space sensitive
languages to trust the move detection, too. Each block that is marked as
moved will have the same delta in {in-, de-}dentation.
I would think this mode might be a reasonable default eventually.

Thanks,
Stefan

v3:
 This is a complete rewrite of the actual patch, with slight modifications]
 on the refactoring how to decouple the white space treatment from the
 move detection. See range-diff below.

v2: https://public-inbox.org/git/20180424210330.87861-1-sbeller@google.com/

v1: https://public-inbox.org/git/20180402224854.86922-1-sbeller@google.com/

Stefan Beller (8):
  xdiff/xdiff.h: remove unused flags
  xdiff/xdiffi.c: remove unneeded function declarations
  diff.c: do not pass diff options as keydata to hashmap
  diff.c: adjust hash function signature to match hashmap expectation
  diff.c: add a blocks mode for moved code detection
  diff.c: decouple white space treatment from move detection algorithm
  diff.c: factor advance_or_nullify out of mark_color_as_moved
  diff.c: add white space mode to move detection that allows indent
    changes

 Documentation/diff-options.txt |  29 +++-
 diff.c                         | 253 +++++++++++++++++++++++++++++----
 diff.h                         |   9 +-
 t/t4015-diff-whitespace.sh     | 202 +++++++++++++++++++++++++-
 xdiff/xdiff.h                  |   8 --
 xdiff/xdiffi.c                 |  17 ---
 6 files changed, 458 insertions(+), 60 deletions(-)

-- 
2.18.0.rc2.346.g013aa6912e-goog

git branch-diff fe0a9eaf31dd0c349ae4308498c33a5c3794b293..origin/sb/diff-color-move-more origin/master..HEAD

1:  a7a7af6b76b = 1:  7e01bd9a972 xdiff/xdiff.h: remove unused flags
2:  a7b6aaf7bc0 = 2:  46e11a99bb7 xdiff/xdiffi.c: remove unneeded function declarations
3:  d9e57cc6b05 = 3:  8fd0ce94aaf diff.c: do not pass diff options as keydata to hashmap
4:  87111ba726d = 4:  4a07b39163c diff.c: adjust hash function signature to match hashmap expectation
5:  9559b8cb456 = 5:  ef1976a301d diff.c: add a blocks mode for moved code detection
6:  41a70464209 ! 6:  a60a3f0de9d diff.c: decouple white space treatment from move detection algorithm
    @@ -7,24 +7,30 @@
         for the regular diff.  Some cases came up where different treatment would
         have been nice.
     
    -    Allow the user to specify that whitespace should be ignored differently
    +    Allow the user to specify that white space should be ignored differently
         during detection of moved lines than during generation of added and removed
         lines. This is done by providing analogs to the --ignore-space-at-eol,
    -    -b, and -w options (namely,
    -      --color-moved-[no-]ignore-space-at-eol
    -      --color-moved-[no-]ignore-space-change
    -      --color-moved-[no-]ignore-all-space) that affect only the color of the
    -    output, and making the existing --ignore-space-at-eol, -b, and -w options
    -    no longer affect the color of the output.
    +    -b, and -w options by introducing the option --color-moved-ws=<modes>
    +    with the modes named "ignore-space-at-eol", "ignore-space-change" and
    +    "ignore-all-space", which is used only during the move detection phase.
     
         As we change the default, we'll adjust the tests.
     
    -    For now we do not infer any options to treat whitespaces in the move
    +    For now we do not infer any options to treat white spaces in the move
         detection from the generic white space options given to diff.
         This can be tuned later to reasonable default.
     
    +    As we plan on adding more white space related options in a later patch,
    +    that interferes with the current white space options, use a flag field
    +    and clamp it down to  XDF_WHITESPACE_FLAGS, as that (a) allows to easily
    +    check at parse time if we give invalid combinations and (b) can reuse
    +    parts of this patch.
    +
    +    By having the white space treatment in its own option, we'll also
    +    make it easier for a later patch to have an config option for
    +    spaces in the move detection.
    +
         Signed-off-by: Stefan Beller <sbeller@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
     --- a/Documentation/diff-options.txt
    @@ -33,18 +39,21 @@
      	blocks are considered interesting, the rest is uninteresting.
      --
      
    -+--color-moved-[no-]ignore-space-at-eol::
    -+	Ignore changes in whitespace at EOL when performing the move
    -+	detection for --color-moved.
    -+--color-moved-[no-]ignore-space-change::
    -+	Ignore changes in amount of whitespace when performing the move
    -+	detection for --color-moved.  This ignores whitespace
    ++--color-moved-ws=<modes>::
    ++	This configures how white spaces are ignored when performing the
    ++	move detection for `--color-moved`. These modes can be given
    ++	as a comma separated list:
    +++
    ++--
    ++ignore-space-at-eol::
    ++	Ignore changes in whitespace at EOL.
    ++ignore-space-change::
    ++	Ignore changes in amount of whitespace.  This ignores whitespace
     +	at line end, and considers all other sequences of one or
     +	more whitespace characters to be equivalent.
    -+--color-moved-[no-]ignore-all-space::
    -+	Ignore whitespace when comparing lines when performing the move
    -+	detection for --color-moved.  This ignores differences even if
    -+	one line has whitespace where the other line has none.
    ++ignore-all-space::
    ++	Ignore whitespace when comparing lines. This ignores differences
    ++	even if one line has whitespace where the other line has none.
     +
      --word-diff[=<mode>]::
      	Show a word diff, using the <mode> to delimit changed words.
    @@ -53,6 +62,43 @@
     diff --git a/diff.c b/diff.c
     --- a/diff.c
     +++ b/diff.c
    +@@
    + 		return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed_zebra', 'plain'"));
    + }
    + 
    ++static int parse_color_moved_ws(const char *arg)
    ++{
    ++	int ret = 0;
    ++	struct string_list l = STRING_LIST_INIT_DUP;
    ++	struct string_list_item *i;
    ++
    ++	string_list_split(&l, arg, ',', -1);
    ++
    ++	for_each_string_list_item(i, &l) {
    ++		struct strbuf sb = STRBUF_INIT;
    ++		strbuf_addstr(&sb, i->string);
    ++		strbuf_trim(&sb);
    ++
    ++		if (!strcmp(sb.buf, "ignore-space-change"))
    ++			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
    ++		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
    ++			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
    ++		else if (!strcmp(sb.buf, "ignore-all-space"))
    ++			ret |= XDF_IGNORE_WHITESPACE;
    ++		else
    ++			error(_("ignoring unknown color-moved-ws mode '%s'"), sb.buf);
    ++
    ++		strbuf_release(&sb);
    ++	}
    ++
    ++	string_list_clear(&l, 0);
    ++
    ++	return ret;
    ++}
    ++
    + int git_diff_ui_config(const char *var, const char *value, void *cb)
    + {
    + 	if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
     @@
      	const struct diff_options *diffopt = hashmap_cmp_fn_data;
      	const struct moved_entry *a = entry;
    @@ -79,24 +125,14 @@
      	ret->next_line = NULL;
      
     @@
    - 		DIFF_XDL_SET(options, IGNORE_CR_AT_EOL);
    - 	else if (!strcmp(arg, "--ignore-blank-lines"))
    - 		DIFF_XDL_SET(options, IGNORE_BLANK_LINES);
    -+	else if (!strcmp(arg, "--color-moved-no-ignore-all-space"))
    -+		options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE;
    -+	else if (!strcmp(arg, "--color-moved-no-ignore-space-change"))
    -+		options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE_CHANGE;
    -+	else if (!strcmp(arg, "--color-moved-no-ignore-space-at-eol"))
    -+		options->color_moved_ws_handling &= ~XDF_IGNORE_WHITESPACE_AT_EOL;
    -+	else if (!strcmp(arg, "--color-moved-ignore-all-space"))
    -+		options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE;
    -+	else if (!strcmp(arg, "--color-moved-ignore-space-change"))
    -+		options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE_CHANGE;
    -+	else if (!strcmp(arg, "--color-moved-ignore-space-at-eol"))
    -+		options->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE_AT_EOL;
    - 	else if (!strcmp(arg, "--indent-heuristic"))
    - 		DIFF_XDL_SET(options, INDENT_HEURISTIC);
    - 	else if (!strcmp(arg, "--no-indent-heuristic"))
    + 		if (cm < 0)
    + 			die("bad --color-moved argument: %s", arg);
    + 		options->color_moved = cm;
    ++	} else if (skip_prefix(arg, "--color-moved-ws=", &arg)) {
    ++		options->color_moved_ws_handling = parse_color_moved_ws(arg);
    + 	} else if (skip_to_optional_arg_default(arg, "--color-words", &options->word_regex, NULL)) {
    + 		options->use_color = 1;
    + 		options->word_diff = DIFF_WORDS_COLOR;
     
     diff --git a/diff.h b/diff.h
     --- a/diff.h
    @@ -113,39 +149,13 @@
     diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
     --- a/t/t4015-diff-whitespace.sh
     +++ b/t/t4015-diff-whitespace.sh
    -@@
    - 	line 4
    - 	line 5
    - 	EOF
    --	git diff HEAD --no-renames --color-moved --color |
    -+	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-no-ignore-space-at-eol |
    - 		grep -v "index" |
    - 		test_decode_color >actual &&
    - 	cat <<-\EOF >expected &&
     @@
      	EOF
      	test_cmp expected actual &&
      
     -	git diff HEAD --no-renames -w --color-moved --color |
     +	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-no-ignore-space-at-eol |
    - 		grep -v "index" |
    - 		test_decode_color >actual &&
    - 	cat <<-\EOF >expected &&
    -@@
    - 	line 5
    - 	EOF
    - 
    --	git diff HEAD --no-renames --color-moved --color |
    -+	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-no-ignore-space-at-eol |
    ++		--color-moved-ws=ignore-all-space |
      		grep -v "index" |
      		test_decode_color >actual &&
      	cat <<-\EOF >expected &&
    @@ -155,21 +165,7 @@
      
     -	git diff HEAD --no-renames -b --color-moved --color |
     +	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-at-eol \
    -+		--color-moved-ignore-space-change |
    - 		grep -v "index" |
    - 		test_decode_color >actual &&
    - 	cat <<-\EOF >expected &&
    -@@
    - 	# avoid cluttering the output with complaints about our eol whitespace
    - 	test_config core.whitespace -blank-at-eol &&
    - 
    --	git diff HEAD --no-renames --color-moved --color |
    -+	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-no-ignore-space-at-eol |
    ++		--color-moved-ws=ignore-space-change |
      		grep -v "index" |
      		test_decode_color >actual &&
      	cat <<-\EOF >expected &&
    @@ -179,9 +175,7 @@
      
     -	git diff HEAD --no-renames --ignore-space-at-eol --color-moved --color |
     +	git diff HEAD --no-renames --color-moved --color \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-ignore-space-at-eol |
    ++		--color-moved-ws=ignore-space-at-eol |
      		grep -v "index" |
      		test_decode_color >actual &&
      	cat <<-\EOF >expected &&
    @@ -211,10 +205,7 @@
     +	EOF
     +
     +	# Make sure we get a different diff using -w
    -+	git diff --color --color-moved -w \
    -+		--color-moved-no-ignore-all-space \
    -+		--color-moved-no-ignore-space-change \
    -+		--color-moved-no-ignore-space-at-eol |
    ++	git diff --color --color-moved -w |
     +		grep -v "index" |
     +		test_decode_color >actual &&
     +	q_to_tab <<-\EOF >expected &&
    @@ -231,9 +222,7 @@
     +
     +	# And now ignoring white space only in the move detection
     +	git diff --color --color-moved \
    -+		--color-moved-ignore-all-space \
    -+		--color-moved-ignore-space-change \
    -+		--color-moved-ignore-space-at-eol |
    ++		--color-moved-ws=ignore-all-space,ignore-space-change,ignore-space-at-eol |
     +		grep -v "index" |
     +		test_decode_color >actual &&
     +	q_to_tab <<-\EOF >expected &&
7:  ce99fa38553 < -:  ----------- diff.c: add --color-moved-ignore-space-delta option
8:  39c5337cd9e < -:  ----------- diff: color-moved white space handling options imply color-moved
-:  ----------- > 7:  b76faee22fe diff.c: factor advance_or_nullify out of mark_color_as_moved
-:  ----------- > 8:  ab003ed7e27 diff.c: add white space mode to move detection that allows indent changes

             reply	other threads:[~2018-06-22  1:57 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-22  1:57 Stefan Beller [this message]
2018-06-22  1:57 ` [PATCH v3 1/8] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-22  1:57 ` [PATCH v3 2/8] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-22  1:57 ` [PATCH v3 3/8] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-22  1:57 ` [PATCH v3 4/8] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-22  1:57 ` [PATCH v3 5/8] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-06-22  1:57 ` [PATCH v3 6/8] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-06-22  1:57 ` [PATCH v3 7/8] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-22  1:57 ` [PATCH v3 8/8] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-06-23 16:52   ` SZEDER Gábor
2018-06-22 22:37 ` [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation Junio C Hamano
2018-06-29  0:19 ` [PATCH v4 0/9] " Stefan Beller
2018-06-29  0:19   ` [PATCH v4 1/9] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-29  0:19   ` [PATCH v4 2/9] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-29  0:19   ` [PATCH v4 3/9] t4015: avoid git as a pipe input Stefan Beller
2018-06-29  0:19   ` [PATCH v4 4/9] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-29  0:19   ` [PATCH v4 5/9] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-29  0:19   ` [PATCH v4 6/9] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-07-02 17:18     ` Brandon Williams
2018-06-29  0:19   ` [PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-07-02 17:22     ` Brandon Williams
2018-06-29  0:19   ` [PATCH v4 8/9] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-29  0:19   ` [PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-07-02 17:36     ` Brandon Williams
2018-07-02 18:59       ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180622015725.219575-1-sbeller@google.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=jacob.keller@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=simon@ruderich.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).