From: Antoine Pelisse <apelisse@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git <git@vger.kernel.org>
Subject: Re: [PATCH] diff: add --ignore-blank-lines option
Date: Mon, 17 Jun 2013 21:09:59 +0200
Message-ID: <CALWbr2zM=rD3GE9a=Xyrvz0E5mAMsDesJu8-Zs7JH7W4U4AbeA@mail.gmail.com>
In-Reply-To: <7vzjuog175.fsf@alter.siamese.dyndns.org>

On Mon, Jun 17, 2013 at 6:18 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Antoine Pelisse <apelisse@gmail.com> writes:
>
>> So here is a more thorough description of the option:
>
>> - real changes are interesting
>
> OK, I think I can understand it.
>
>> - blank lines that are close enough (less than context size) to
>>   interesting changes are considered interesting (recursive definition)
>
> OK.
>
>> - "context" lines are used around each hunk of interesting changes
>
> OK.
>
>> - If two hunks are separated by less than "inter-hunk-context", they
>>   will be merged into one.
>
> Makes sense.
>
>> The current implementation does the "interesting changes selection" in a
>> single pass.
>
> "current" meaning "the code after this patch is applied"?  Is there
> a possible future enhancement hinted here?

No. There might be, but I'm not sure it should be discussed right now
(in case you're curious, I'm thinking about the interaction with
combined diff). I will take the hint and rephrase.

>> +xdchange_t *xdl_get_hunk(xdchange_t **xscr, xdemitconf_t const *xecfg)
>> +{
>> +     xdchange_t *xch, *xchp, *lxch;
>>       long max_common = 2 * xecfg->ctxlen + xecfg->interhunkctxlen;
>> +     long max_ignorable = xecfg->ctxlen;
>> +     unsigned long changes = ULONG_MAX;

Let me explain what "changes" means, as I think it will help with the
rest of the message.
It counts the number of *added* blank lines we have ignored since
"lxch" (needed to calculate the distance between lxch and xch).
It also carries the meaning of what was called "interesting" before:
if changes == ULONG_MAX, we are still in the interesting zone;
otherwise we have ignored "changes" *added* blank lines (0 being a
valid value).
(Actually, after rereading this part, it looks like I could check that
lxch == xchp rather than setting changes to ULONG_MAX.)
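
To make the two meanings of "changes" explicit, here is a minimal
sketch that decodes the variable into separate fields. It is only an
illustration, not part of the patch: "struct zone_state" and
"decode_changes" are hypothetical names.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical decoded view of the overloaded "changes" variable. */
struct zone_state {
	bool interesting;            /* still in the interesting zone? */
	unsigned long ignored_added; /* added blank lines ignored since lxch */
};

/*
 * ULONG_MAX is the sentinel for "still in the interesting zone".
 * Any other value, including 0, is a count of ignored added blank lines.
 */
static struct zone_state decode_changes(unsigned long changes)
{
	struct zone_state st;

	st.interesting = (changes == ULONG_MAX);
	st.ignored_added = st.interesting ? 0 : changes;
	return st;
}
```

Note that decode_changes(0) is not in the interesting zone: zero ignored
lines is a valid count, which is why a separate sentinel is needed.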

>> +
>> +     /* remove ignorable changes that are too far before other changes */
>> +     for (xchp = *xscr; xchp && xchp->ignore; xchp = xchp->next) {
>> +             xch = xchp->next;
>> +
>> +             if (xch == NULL ||
>> +                 xch->i1 - (xchp->i1 + xchp->chg1) >= max_ignorable)
>> +                     *xscr = xch;
>> +     }
>
> This strips leading ignorable ones away until we see an unignorable
> one.  Looks sane.
>
>> +     if (*xscr == NULL)
>> +             return NULL;
>> +
>> +     lxch = *xscr;
>
> "lxch" remembers the last one that is "interesting".
>
>> +     for (xchp = *xscr, xch = xchp->next; xch; xchp = xch, xch = xch->next) {
>> +             long distance = xch->i1 - (xchp->i1 + xchp->chg1);
>> +             if (distance > max_common)
>>                       break;
>
> If we see large-enough gap, the one we processed last (in xchp) is
> the end of the current hunk.  Looks sane.
>
>> +             if (distance < max_ignorable &&
>> +                 (!xch->ignore || changes == ULONG_MAX)) {
>> +                     lxch = xch;
>> +                     changes = ULONG_MAX;
>
> The current one is made into the "last interesting one we have seen"
> and the hunk continues, if either (1) the current one is interesting
> by itself, or (2) the last one we saw does not match some
> unexplainable criteria to cause changes set to not ULONG_MAX.
>
> Puzzling.

- If we are still in the interesting zone, we take it, even if it's an
ignorable change, because it's close enough.
- Otherwise, only take real changes. We are close to another change,
and we are still in the loop, so it must be interesting.

>> +             } else if (changes != ULONG_MAX &&
>> +                        xch->i1 + changes - (lxch->i1 + lxch->chg1) > max_common) {
>> +                     break;
>
> If the last one we saw does not match some unexplainable criteria to
> cause changes set to not ULONG_MAX, and the distance between this
> one and the last "interesting" one is further than the context, this
> one will not be a part of the current hunk.
>
> Puzzling.

If we are no longer in the "interesting zone" (changes != ULONG_MAX),
we stop if the distance is too big.
"changes" is used in the calculation to account for the changes we
have already ignored: xch->i1 - (lxch->i1 + lxch->chg1) only works if
xch and lxch are consecutive, so we need to add the blank lines we
ignored.
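
As a small worked example of that calculation (a sketch only: "struct
change" is a simplified stand-in for xdchange_t, the helper name is
hypothetical, and the cast to long is my own choice for the sketch):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for xdchange_t: only the old-file fields used here. */
struct change {
	long i1;   /* first affected line in the old file */
	long chg1; /* number of old-file lines the change covers */
};

/*
 * Hypothetical helper mirroring the expression in the patch: the gap
 * between the end of lxch and the start of xch, widened by the added
 * blank lines that were ignored in between.
 */
static long distance_since_last_interesting(const struct change *lxch,
					    const struct change *xch,
					    unsigned long ignored_added)
{
	assert(lxch != NULL && xch != NULL);
	return xch->i1 + (long)ignored_added - (lxch->i1 + lxch->chg1);
}
```

With lxch = {10, 1} and xch = {15, 1}, the plain distance is 4. If two
added blank lines were ignored in between, the helper yields 6.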

> Could you add comment to the "changes" variable and explain what the
> variable means?
>
>> +             } else if (!xch->ignore) {
>> +                     lxch = xch;
>> +                     changes = ULONG_MAX;
>
> When this change by itself is interesting, it becomes the "last
> interesting one" and the hunk continues.

Exactly, and changes goes back to "interesting".

>> +             } else {
>> +                     if (changes == ULONG_MAX)
>> +                             changes = 0;
>> +                     changes += xch->chg2;
>
> Puzzled beyond guessing.  Also it is curious why here and only here
> we look at chg2 side of the things, not i1/chg1 in this whole thing.

chg2 is the number of blank line *additions*.
I don't want to coalesce two hunks because some blank lines have been
removed between the two, so a blank line removal must not change the
distance calculation. That behavior can be seen in the
"ignore-blank-lines: between changes" test.
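
To summarize the four branches discussed above, here is a sketch of one
loop iteration as a standalone function. It assumes a simplified
"struct change" in place of xdchange_t. The names "scan", "scan_step"
and the STEP_* values are hypothetical and only serve the illustration;
the conditions are copied from the quoted patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-in for xdchange_t; only the fields the loop reads. */
struct change {
	long i1;    /* first affected line in the old file */
	long chg1;  /* lines removed */
	long chg2;  /* lines added */
	int ignore; /* non-zero if the change only touches blank lines */
};

enum step { STEP_EXTEND, STEP_STOP, STEP_SKIP };

struct scan {
	const struct change *lxch; /* last interesting change */
	unsigned long changes;     /* ULONG_MAX, or ignored added blank lines */
};

/* One iteration: xchp is the previous change, xch the current one. */
static enum step scan_step(struct scan *s, const struct change *xchp,
			   const struct change *xch,
			   long max_common, long max_ignorable)
{
	long distance = xch->i1 - (xchp->i1 + xchp->chg1);

	assert(s->lxch != NULL);
	if (distance > max_common)
		return STEP_STOP;
	if (distance < max_ignorable &&
	    (!xch->ignore || s->changes == ULONG_MAX)) {
		s->lxch = xch; /* close enough: take it */
		s->changes = ULONG_MAX;
		return STEP_EXTEND;
	}
	if (s->changes != ULONG_MAX &&
	    xch->i1 + (long)s->changes - (s->lxch->i1 + s->lxch->chg1) > max_common)
		return STEP_STOP;
	if (!xch->ignore) {
		s->lxch = xch; /* real change: back to "interesting" */
		s->changes = ULONG_MAX;
		return STEP_EXTEND;
	}
	if (s->changes == ULONG_MAX)
		s->changes = 0;
	s->changes += (unsigned long)xch->chg2; /* additions only */
	return STEP_SKIP;
}
```

A blank line removal (chg1 = 1, chg2 = 0) that gets skipped leaves
"changes" at 0, while a skipped addition of two blank lines raises it
by 2, so only additions widen the later distance check.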

Hope that makes things clearer,
Thanks again for the thorough reading,

Antoine
