git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Brandon Williams <bmwill@google.com>
Cc: git <git@vger.kernel.org>, Jacob Keller <jacob.keller@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Simon Ruderich <simon@ruderich.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes
Date: Mon, 2 Jul 2018 11:59:06 -0700	[thread overview]
Message-ID: <CAGZ79kasAqE+=7ciVrdjoRdu0UFjVBr8Ma502nw+3hZL=ebXYQ@mail.gmail.com> (raw)
In-Reply-To: <20180702173614.GD246956@google.com>

On Mon, Jul 2, 2018 at 10:36 AM Brandon Williams <bmwill@google.com> wrote:
>
> On 06/28, Stefan Beller wrote:
> > The option of --color-moved has proven to be useful as observed on the
> > mailing list. However when refactoring sometimes the indentation changes,
> > for example when partitioning a functions into smaller helper functions
> > the code usually mostly moved around except for a decrease in indentation.
> >
> > To just review the moved code ignoring the change in indentation, a mode
> > to ignore spaces in the move detection as implemented in a previous patch
> > would be enough.  However the whole move coloring as motivated in commit
> > 2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
> > up the notion of the reviewer being able to trust the move of a "block".
> >
> > As there are languages such as python, which depend on proper relative
> > indentation for the control flow of the program, ignoring any white space
> > change in a block would not uphold the promises of 2e2d5ac that allows
> > reviewers to pay less attention to the inside of a block, as inside
> > the reviewer wants to assume the same program flow.
> >
> > This new mode of white space ignorance will take this into account and will
> > only allow the same white space changes per line in each block. This patch
> > even allows only for the same change at the beginning of the lines.
> >
> > As this is a white space mode, it is made exclusive to other white space
> > modes in the move detection.
> >
> > This patch brings some challenges, related to the detection of blocks.
> > We need a white net the catch the possible moved lines, but then need to
>
> s/white/wide/

oops. will fix.

>
> > +
> > +/**
> > + * The struct ws_delta holds white space differences between moved lines, i.e.
> > + * between '+' and '-' lines that have been detected to be a move.
> > + * The string contains the difference in leading white spaces, before the
> > + * rest of the line is compared using the white space config for move
> > + * coloring. The current_longer indicates if the first string in the
> > + * comparision is longer than the second.
> > + */
> > +struct ws_delta {
> > +     char *string;
> > +     unsigned int current_longer : 1;
> >  };
> > +#define WS_DELTA_INIT { NULL, 0 }
> > +
> > +static int compute_ws_delta(const struct emitted_diff_symbol *a,
> > +                          const struct emitted_diff_symbol *b,
> > +                          struct ws_delta *out)
> > +{
> > +     const struct emitted_diff_symbol *longer =  a->len > b->len ? a : b;
> > +     const struct emitted_diff_symbol *shorter = a->len > b->len ? b : a;
> > +     int d = longer->len - shorter->len;
> > +
> > +     out->string = xmemdupz(longer->line, d);
> > +     out->current_longer = (a == longer);
> > +
> > +     return !strncmp(longer->line + d, shorter->line, shorter->len);
> > +}
>
> I'm having a harder time understanding this block.  This is used to fill
> a ws_delta struct by calculating the whitespace delta between two lines.

yes.

> If that is the case then why doesn't this function verify that the first
> 'd' characters in the longer line are indeed whitespace?

This was done implicitly before as compute_ws_delta is called only
on two lines that are "equal with XDF_IGNORE_WHITESPACE".

> Also, maybe
> this is just because I'm not as familiar with the move detection code,
> but how would the whitespace detection handle a line being moved from
> being indented with tabs to spaces or vice versa?  Is this something
> already handled and not an issue?

The ws_delta stores the string (of whitespaces) that the 'shorter string'
needs to be prefixed with to create the 'longer string.
(I chose 'shorter' and 'longer' as any of them can be '+' or '-' lines)

Then later when we compare the strings (the current line in
consideration and strings that we put in hashmaps that we
know are moved lines) in cmp_in_block_with_wsd,
we (should) take this white space delta into account.

I just realize that there we do not check for the tab/space replacement
as there we would need to compare the ws_delta string to the beginning
of the 'longer' string there.

I'll add a test for the space/tab replacement as well.

Thanks,
Stefan

      reply	other threads:[~2018-07-02 19:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-22  1:57 [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation Stefan Beller
2018-06-22  1:57 ` [PATCH v3 1/8] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-22  1:57 ` [PATCH v3 2/8] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-22  1:57 ` [PATCH v3 3/8] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-22  1:57 ` [PATCH v3 4/8] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-22  1:57 ` [PATCH v3 5/8] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-06-22  1:57 ` [PATCH v3 6/8] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-06-22  1:57 ` [PATCH v3 7/8] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-22  1:57 ` [PATCH v3 8/8] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-06-23 16:52   ` SZEDER Gábor
2018-06-22 22:37 ` [PATCH v3 0/8] Moved code detection: ignore space on uniform indentation Junio C Hamano
2018-06-29  0:19 ` [PATCH v4 0/9] " Stefan Beller
2018-06-29  0:19   ` [PATCH v4 1/9] xdiff/xdiff.h: remove unused flags Stefan Beller
2018-06-29  0:19   ` [PATCH v4 2/9] xdiff/xdiffi.c: remove unneeded function declarations Stefan Beller
2018-06-29  0:19   ` [PATCH v4 3/9] t4015: avoid git as a pipe input Stefan Beller
2018-06-29  0:19   ` [PATCH v4 4/9] diff.c: do not pass diff options as keydata to hashmap Stefan Beller
2018-06-29  0:19   ` [PATCH v4 5/9] diff.c: adjust hash function signature to match hashmap expectation Stefan Beller
2018-06-29  0:19   ` [PATCH v4 6/9] diff.c: add a blocks mode for moved code detection Stefan Beller
2018-07-02 17:18     ` Brandon Williams
2018-06-29  0:19   ` [PATCH v4 7/9] diff.c: decouple white space treatment from move detection algorithm Stefan Beller
2018-07-02 17:22     ` Brandon Williams
2018-06-29  0:19   ` [PATCH v4 8/9] diff.c: factor advance_or_nullify out of mark_color_as_moved Stefan Beller
2018-06-29  0:19   ` [PATCH v4 9/9] diff.c: add white space mode to move detection that allows indent changes Stefan Beller
2018-07-02 17:36     ` Brandon Williams
2018-07-02 18:59       ` Stefan Beller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGZ79kasAqE+=7ciVrdjoRdu0UFjVBr8Ma502nw+3hZL=ebXYQ@mail.gmail.com' \
    --to=sbeller@google.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=simon@ruderich.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).