git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: "Jonathan Tan" <jonathantanmy@google.com>,
	git <git@vger.kernel.org>, "Simon Ruderich" <simon@ruderich.org>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jacob Keller" <jacob.keller@gmail.com>
Subject: Re: [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option
Date: Tue, 24 Apr 2018 16:23:28 -0700	[thread overview]
Message-ID: <CAGZ79kbQUWq_pfvwLWotqCxc1-Y=ctJ4SqgqR+EvJ5wkJkp5kQ@mail.gmail.com> (raw)
In-Reply-To: <CAGZ79kbGkHFSS9K8KKTwNikx3Tw+m+RMLY3RAf8SW_iK9a2OJQ@mail.gmail.com>

Replied to Jonathan only instead of all. My reply is below:

On Tue, Apr 24, 2018 at 3:55 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Apr 24, 2018 at 3:35 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
>> On Tue, 24 Apr 2018 14:03:30 -0700
>> Stefan Beller <sbeller@google.com> wrote:
>>
>>> +--color-moved-[no-]ignore-space-prefix-delta::
>>> +     Ignores whitespace when comparing lines when performing the move
>>> +     detection for --color-moved. This ignores uniform differences
>>> +     of white space at the beginning lines in moved blocks.
>>
>> Setting this option means that moved lines may be indented or dedented,
>> and if they have been indented or dedented by the same amount, they are
>> still considered to be the same block. Maybe call this
>> --color-moved-allow-indentation-change.
>
> ok, sounds good as well. I tried coming up with a name that refers to
> the block check as that is the important part.
>
>>> +struct ws_delta {
>>> +     char *string; /* The prefix delta, which is the same in the block */
>>
>> Probably better described as "the difference between the '-' line and
>> the '+' line". This may be the empty string if there is no difference.
>
> Makes sense.
>
>>
>>> +     int direction; /* adding or removing the line? */
>>
>> What is the value when "added" and what when "removed"? Also, it is not
>> truly "added" or "removed", so a better way might be: 1 if the '-' line
>> is longer than the '+' line, and 0 otherwise. (And make sure that the
>> documentation is correct with respect to equal lines.)
>>
>>> +     int missmatch; /* in the remainder */
>>
>> s/missmatch/mismatch/
>> Also, what is this used for?
>
> The mismatch should be used for (thanks for catching!)
> checking if the remainder of a line is the same, although a boolean
> may be not the correct choice. We know that the two strings
> passed into compute_ws_delta come from a complete white space
> agnostic comparison, so consider:
>
> + SP SP more TAB more
> + SP SP text TAB text
>
> - SP more TAB more
> - SP text TAB text
>
> which would mark this as a moved block. This is the feature
> working as intended, but what about
>
> + SP SP more TAB more
> + SP SP text TAB text
>
> - SP more SP more
> - SP text TAB text
>
> Note how the length of the strings is the same, hence the current
> code of
>
>     compute_ws_delta(...) {
>         int d = longer->len - shorter->len;
>         out->string = xmemdupz(longer->line, d);
>     }
>
> would work fine and not notice the "change in the remainder".
> That ought to be caught by the mismatch variable, that
> is assigned, but not used.
>
> The compare_ws_delta(a, b) needs to be extended to
>
>   !a->mismatch && !b->mismatch && existing_condition
>
> Ideally the example from above would be different depending
> on whether the other white space flags are given or not.
>
>>> +     if (diffopt->color_moved_ws_handling & COLOR_MOVED_DELTA_WHITESPACES)
>>> +             /*
>>> +              * As there is not specific white space config given,
>>> +              * we'd need to check for a new block, so ignore all
>>> +              * white space. The setup of the white space
>>> +              * configuration for the next block is done else where
>>> +              */
>>> +             flags |= XDF_IGNORE_WHITESPACE;
>>> +
>>>       return !xdiff_compare_lines(a->es->line, a->es->len,
>>>                                   b->es->line, b->es->len,
>>>                                   flags);
>>
>> I wrote in [1]:
>>
>>   I think we should just prohibit combining this with any of the
>>   whitespace ignoring flags except for the space-at-eol one. They seem
>>   to contradict anyway.
>
> As outlined above, I think there are corner cases in which they do not
> contradict. So I think the COLOR_MOVED_DELTA_WHITESPACES
> will go into its own variable, and then we can solve the corner cases
> correctly.
>
>> To elaborate, adding a specific flag that may interfere with other
>> user-provided flags sounds like we're unnecessarily adding combinations
>> that we must keep track of, and that it's both logical (from a user's
>> point of view) and clearer (as for the code) to just forbid such
>> combinations.
>
> Yes, I think you mentioned this before. Thanks for reminding!
>
>>> +     struct ws_delta *wsd = NULL; /* white space deltas between pmb */
>>> +     int wsd_alloc = 0;
>>> +
>>> +     int n, flipped_block = 1, block_length = 0;
>>
>> It seems like pmb and wsd are parallel arrays - could each wsd be
>> embedded into the corresponding entry of pmb instead?
>
> I'll explore that. It sounds like a good idea for code hygiene.
> Although if you do not intend to use this feature, then keeping it separate
> would allow for a smaller footprint in memory.
>
>>
>>> --- a/diff.h
>>> +++ b/diff.h
>>> @@ -214,6 +214,8 @@ struct diff_options {
>>>       } color_moved;
>>>       #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
>>>       #define COLOR_MOVED_MIN_ALNUM_COUNT 20
>>> +     /* XDF_WHITESPACE_FLAGS regarding block detection are set at 2, 3, 4 */
>>> +     #define COLOR_MOVED_DELTA_WHITESPACES   (1 << 22)
>>>       int color_moved_ws_handling;
>>>  };
>>
>> Setting of DELTA_WHITESPACES should be a separate field, not as part of
>> ws_handling. It's fine for the ws_handling to be a bitset, since that's
>> how it's passed to xdiff_compare_lines(), but we don't need to do the
>> same for DELTA_WHITESPACES.
>
> You are correct. Thanks for your patience in this series!
>
>> +     git diff --color --color-moved-ignore-space-prefix-delta |
>> +             grep -v "index" |
>> +             test_decode_color >actual &&
>
>>> +     q_to_tab <<-\EOF >expected &&
>>> +             <BOLD>diff --git a/text.txt b/text.txt<RESET>
>>> +             <BOLD>--- a/text.txt<RESET>
>>> +             <BOLD>+++ b/text.txt<RESET>
>>> +             <CYAN>@@ -1,7 +1,7 @@<RESET>
>>> +             <RED>-QIndented<RESET>
>>> +             <RED>-QText across<RESET>
>>> +             <RED>-Qthree lines<RESET>
>>> +             <RED>-QBut! <- this stands out<RESET>
>>> +             <RED>-Qthis one<RESET>
>>> +             <RED>-QQline did<RESET>
>>> +             <RED>-Qnot adjust<RESET>
>>> +             <GREEN>+<RESET>QQ<GREEN>Indented<RESET>
>>> +             <GREEN>+<RESET>QQ<GREEN>Text across<RESET>
>>> +             <GREEN>+<RESET>QQ<GREEN>three lines<RESET>
>>> +             <GREEN>+<RESET>QQQ<GREEN>But! <- this stands out<RESET>
>>> +             <GREEN>+<RESET><GREEN>this one<RESET>
>>> +             <GREEN>+<RESET>Q<GREEN>line did<RESET>
>>> +             <GREEN>+<RESET><GREEN>not adjust<RESET>
>>> +     EOF
>>> +
>>> +     test_cmp expected actual
>>> +'
>>
>> I would have expected every line except the "this stands out" line to be
>> colored differently than the usual RED and GREEN. Is this test output
>> expected?
>
> It is wrong indeed. I blindly copied the actual file once interactive testing
> confirmed it worked.
>
> The command is missing a --color-moved, as the --color-moved-whitespace-settings
> do not imply --color-moved, yet(?)
>
> Thanks,
> Stefan

  parent reply	other threads:[~2018-04-24 23:23 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-24 21:03 [PATCHv2 0/7] Moved code detection: ignore space on uniform indentation Stefan Beller
2018-04-24 21:03 ` [PATCH 1/7] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-04-24 21:03 ` [PATCH 2/7] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-04-24 21:03 ` [PATCH 3/7] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-04-24 21:03 ` [PATCH 4/7] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-04-24 21:03 ` [PATCH 5/7] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-04-24 21:50   ` Jonathan Tan
2018-04-24 22:37     ` Stefan Beller
2018-04-24 22:59       ` Jonathan Tan
2018-04-24 21:03 ` [PATCH 6/7] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-04-24 22:00   ` Jonathan Tan
2018-04-24 22:19     ` Stefan Beller
2018-04-24 21:03 ` [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option Stefan Beller
2018-04-24 22:35   ` Jonathan Tan
     [not found]     ` <CAGZ79kbGkHFSS9K8KKTwNikx3Tw+m+RMLY3RAf8SW_iK9a2OJQ@mail.gmail.com>
2018-04-24 23:23       ` Stefan Beller [this message]
2018-04-25  0:11         ` Jonathan Tan
2018-04-24 22:37 ` [PATCHv2 0/7] Moved code detection: ignore space on uniform indentation Jonathan Tan
2018-04-24 22:58   ` Stefan Beller
  -- strict thread matches above, loose matches on Subject: below --
2018-04-02 22:48 [RFC PATCH " Stefan Beller
2018-04-02 22:48 ` [PATCH 7/7] diff.c: add --color-moved-ignore-space-delta option Stefan Beller
2018-04-03  0:41   ` Jonathan Tan
2018-04-03 19:22     ` Stefan Beller
2018-04-03 20:38       ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGZ79kbQUWq_pfvwLWotqCxc1-Y=ctJ4SqgqR+EvJ5wkJkp5kQ@mail.gmail.com' \
    --to=sbeller@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jacob.keller@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=simon@ruderich.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).