From: Antoine Pelisse <apelisse@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git <git@vger.kernel.org>
Subject: Re: [PATCH] diff: add --ignore-blank-lines option
Date: Mon, 17 Jun 2013 21:09:59 +0200
Message-ID: <CALWbr2zM=rD3GE9a=Xyrvz0E5mAMsDesJu8-Zs7JH7W4U4AbeA@mail.gmail.com>
In-Reply-To: <7vzjuog175.fsf@alter.siamese.dyndns.org>
On Mon, Jun 17, 2013 at 6:18 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Antoine Pelisse <apelisse@gmail.com> writes:
>
>> So here is a more thorough description of the option:
>
>> - real changes are interesting
>
> OK, I think I can understand it.
>
>> - blank lines that are close enough (less than context size) to
>> interesting changes are considered interesting (recursive definition)
>
> OK.
>
>> - "context" lines are used around each hunk of interesting changes
>
> OK.
>
>> - If two hunks are separated by less than "inter-hunk-context", they
>> will be merged into one.
>
> Makes sense.
>
>> The current implementation does the "interesting changes selection" in a
>> single pass.
>
> "current" meaning "the code after this patch is applied"? Is there
> a possible future enhancement hinted here?
No. There might be, but I'm not sure it should be discussed right now
(in case you're curious, I'm thinking about the interaction with combined
diff). I will take the hint and rephrase.
>> +xdchange_t *xdl_get_hunk(xdchange_t **xscr, xdemitconf_t const *xecfg)
>> +{
>> + xdchange_t *xch, *xchp, *lxch;
>> long max_common = 2 * xecfg->ctxlen + xecfg->interhunkctxlen;
>> + long max_ignorable = xecfg->ctxlen;
>> + unsigned long changes = ULONG_MAX;
Let me explain what "changes" means, as I know it will help with the rest
of the message:
It counts the number of *added* blank lines we have ignored since
"lxch" (needed to calculate the distance between lxch and xch).
It also carries the meaning of what was called "interesting" before:
if changes == ULONG_MAX, we are still in the interesting zone; otherwise
it means we have ignored "changes" *added* blank lines (0 being a
valid value).
(Actually, after rereading this part, it looks like I could check that
lxch == xchp rather than setting changes to ULONG_MAX.)
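To make the two roles of "changes" concrete, here is a minimal,
self-contained sketch. The helper names (in_interesting_zone and
note_ignored) are hypothetical and do not appear in the patch; only the
ULONG_MAX sentinel and the chg2 accumulation are taken from the code
quoted in this message.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helpers, for illustration only.
 * "changes" is either ULONG_MAX (still in the interesting zone) or the
 * number of added blank lines ignored since the last interesting change.
 */
static int in_interesting_zone(unsigned long changes)
{
	return changes == ULONG_MAX;
}

/* Record one ignored change that added "added" blank lines (its chg2). */
static unsigned long note_ignored(unsigned long changes, long added)
{
	assert(added >= 0);
	if (in_interesting_zone(changes))
		changes = 0;	/* leave the zone; 0 is a valid count */
	return changes + (unsigned long)added;
}
```

Note that ignoring a change that adds no blank line still leaves the
interesting zone: the counter becomes 0, which is distinct from
ULONG_MAX.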
>> +
>> + /* remove ignorable changes that are too far before other changes */
>> + for (xchp = *xscr; xchp && xchp->ignore; xchp = xchp->next) {
>> + xch = xchp->next;
>> +
>> + if (xch == NULL ||
>> + xch->i1 - (xchp->i1 + xchp->chg1) >= max_ignorable)
>> + *xscr = xch;
>> + }
>
> This strips leading ignorable ones away until we see an unignorable
> one. Looks sane.
>
>> + if (*xscr == NULL)
>> + return NULL;
>> +
>> + lxch = *xscr;
>
> "lxch" remembers the last one that is "interesting".
>
>> + for (xchp = *xscr, xch = xchp->next; xch; xchp = xch, xch = xch->next) {
>> + long distance = xch->i1 - (xchp->i1 + xchp->chg1);
>> + if (distance > max_common)
>> break;
>
> If we see large-enough gap, the one we processed last (in xchp) is
> the end of the current hunk. Looks sane.
>
>> + if (distance < max_ignorable &&
>> + (!xch->ignore || changes == ULONG_MAX)) {
>> + lxch = xch;
>> + changes = ULONG_MAX;
>
> The current one is made into the "last interesting one we have seen"
> and the hunk continues, if either (1) the current one is interesting
> by itself, or (2) the last one we saw does not match some
> unexplainable criteria to cause changes set to not ULONG_MAX.
>
> Puzzling.
- If we are still in the interesting zone, we take it even if it's an
ignorable change, because it's close enough.
- Otherwise, we only take real changes. We are close to another change
and we are still in the loop, so it must be interesting.
>> + } else if (changes != ULONG_MAX &&
>> + xch->i1 + changes - (lxch->i1 + lxch->chg1) > max_common) {
>> + break;
>
> If the last one we saw does not match some unexplainable criteria to
> cause changes set to not ULONG_MAX, and the distance between this
> one and the last "interesting" one is further than the context, this
> one will not be a part of the current hunk.
>
> Puzzling.
If we are no longer in the "interesting zone" (changes != ULONG_MAX), we
stop as soon as the distance gets too big.
"changes" is used in the calculation to account for the changes we have
already ignored: xch->i1 - (lxch->i1 + lxch->chg1) only works if
xch and lxch are consecutive, so we need to add the blank lines we
ignored.
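As a rough illustration of that arithmetic, the sketch below isolates
the gap computation. The function name gap_since_lxch and its flat
parameter list are assumptions made for this example; the patch computes
the same expression inline on xdchange_t fields.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, for illustration only.
 * Gap between the end of the last interesting change (lxch) and the
 * start of the current change (xch), counting the blank lines that
 * were added by ignored changes in between.
 */
static long gap_since_lxch(long xch_i1, long lxch_i1, long lxch_chg1,
			   unsigned long ignored_added)
{
	/* Only meaningful once we have left the interesting zone. */
	assert(ignored_added != ULONG_MAX);
	return xch_i1 + (long)ignored_added - (lxch_i1 + lxch_chg1);
}
```

For example, with lxch at i1 = 10 and chg1 = 1, and xch at i1 = 19, the
gap is 8 lines when nothing was ignored and 10 lines once two added
blank lines have been ignored in between.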
> Could you add comment to the "changes" variable and explain what the
> variable means?
>
>> + } else if (!xch->ignore) {
>> + lxch = xch;
>> + changes = ULONG_MAX;
>
> When this change by itself is interesting, it becomes the "last
> interesting one" and the hunk continues.
Exactly, and changes goes back to "interesting".
>> + } else {
>> + if (changes == ULONG_MAX)
>> + changes = 0;
>> + changes += xch->chg2;
>
> Puzzled beyond guessing. Also it is curious why here and only here
> we look at chg2 side of the things, not i1/chg1 in this whole thing.
chg2 is the number of blank line *additions*.
I don't want to coalesce two hunks just because some blank lines have been
removed between them, so a blank line removal must not change the distance
calculation. That behavior can be seen in the
"ignore-blank-lines: between changes" test.
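The difference between an ignored removal and an ignored addition can
be tried out with the standalone sketch below. It is a simplified model
of the second loop quoted in this message, not the actual xdiff code:
"struct change", last_of_hunk and hunk_reaches_last are hypothetical
names, the leading-ignorable stripping is left out, and "changes" is
cast to long to avoid mixing signed and unsigned arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-in for xdchange_t; an assumption for this sketch. */
struct change {
	long i1;	/* first affected line in the old file */
	long chg1;	/* number of lines removed */
	long chg2;	/* number of lines added */
	int ignore;	/* nonzero for a blank-line-only change */
	struct change *next;
};

/* Return the last interesting change of the hunk starting at "first". */
static struct change *last_of_hunk(struct change *first, long ctxlen,
				   long interhunkctxlen)
{
	long max_common = 2 * ctxlen + interhunkctxlen;
	long max_ignorable = ctxlen;
	unsigned long changes = ULONG_MAX;	/* ULONG_MAX: interesting zone */
	struct change *xch, *xchp, *lxch = first;

	assert(first != NULL);
	for (xchp = first, xch = xchp->next; xch; xchp = xch, xch = xch->next) {
		long distance = xch->i1 - (xchp->i1 + xchp->chg1);

		if (distance > max_common)
			break;
		if (distance < max_ignorable &&
		    (!xch->ignore || changes == ULONG_MAX)) {
			/* close enough: taken, back to the interesting zone */
			lxch = xch;
			changes = ULONG_MAX;
		} else if (changes != ULONG_MAX &&
			   xch->i1 + (long)changes -
			   (lxch->i1 + lxch->chg1) > max_common) {
			/* too far from lxch once ignored additions count */
			break;
		} else if (!xch->ignore) {
			lxch = xch;
			changes = ULONG_MAX;
		} else {
			/* ignored change: only its added lines are counted */
			if (changes == ULONG_MAX)
				changes = 0;
			changes += (unsigned long)xch->chg2;
		}
	}
	return lxch;
}

/*
 * Build A (real, line 10), B (ignorable, line 14) and C (real, line 18),
 * then report whether the hunk starting at A extends all the way to C,
 * using ctxlen = 3 and interhunkctxlen = 2.
 */
static int hunk_reaches_last(long b_chg1, long b_chg2)
{
	struct change c = { 18, 1, 1, 0, NULL };
	struct change b = { 14, b_chg1, b_chg2, 1, &c };
	struct change a = { 10, 1, 1, 0, &b };

	return last_of_hunk(&a, 3, 2) == &c;
}
```

In this model, hunk_reaches_last(1, 0) (B removes one blank line)
reports that the hunk reaches C, because the removal leaves the gap
computation untouched. hunk_reaches_last(0, 2) (B adds two blank lines)
reports that the hunk stops at A, because the two ignored additions push
the gap past max_common.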
Hope that makes things clearer,
Thanks again for the thorough reading,
Antoine