git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: "Michael Haggerty" <mhagger@alum.mit.edu>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narębski" <jnareb@gmail.com>,
	"Jacob Keller" <jacob.keller@gmail.com>
Subject: Re: [PATCH 8/8] diff: improve positioning of add/delete blocks in diffs
Date: Thu, 4 Aug 2016 09:55:39 -0700
Message-ID: <CAGZ79kYCwoo6sYefu++KyCdgHfKvRsUp98ZAur+E7E4o_FLtEw@mail.gmail.com>
In-Reply-To: <20160804075631.jakbi5dbsbxsqcpr@sigill.intra.peff.net>

On Thu, Aug 4, 2016 at 12:56 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Aug 04, 2016 at 12:00:36AM +0200, Michael Haggerty wrote:
>
>> This table shows the number of diff slider groups that were positioned
>> differently than the human-generated values, for various repositories.
>> "default" is the default "git diff" algorithm. "compaction" is Git 2.9.0
>> with the `--compaction-heuristic` option "indent" is an earlier,
>
> s/option/&./
>
>>  static int diff_detect_rename_default;
>> +static int diff_indent_heuristic; /* experimental */
>>  static int diff_compaction_heuristic; /* experimental */
>
> These two flags are mutually exclusive in the xdiff code, so we should
> probably handle that here.
>
> TBH, I do not care that much what:
>
>   [diff]
>   compactionHeuristic = true
>   indentHeuristic = true
>
> does. But right now:
>
>   git config diff.compactionHeuristic true
>   git show --indent-heuristic
>
> still prefers the compaction heuristic, which I think is objectively
> wrong.
>
> So perhaps we need a single variable:
>
>   enum {
>     DIFF_HEURISTIC_COMPACTION,
>     DIFF_HEURISTIC_INDENT
>   } diff_heuristic;
>
> and set it in last-one-wins fashion (it would be nice if the config and
> command line options were shaped the same way so it's clear to the user
> that they are exclusive, but we may have to keep --compaction-heuristic
> around for compatibility, as an alias for --diff-heuristic=compaction).
>
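
To make the last-one-wins idea concrete, here is a minimal sketch. Only the
two enum values and the config/option spellings come from this thread;
DIFF_HEURISTIC_NONE, current_heuristic, set_heuristic(), heuristic_config()
and heuristic_option() are hypothetical names for illustration, not anything
in the patch.

```c
#include <string.h>

/* Hypothetical: a single selector instead of two independent flags. */
enum diff_heuristic {
	DIFF_HEURISTIC_NONE,        /* assumption: neither heuristic requested */
	DIFF_HEURISTIC_COMPACTION,
	DIFF_HEURISTIC_INDENT
};

static enum diff_heuristic current_heuristic = DIFF_HEURISTIC_NONE;

/* Enabling always overrides; disabling only clears a matching selection. */
static void set_heuristic(enum diff_heuristic which, int enabled)
{
	if (enabled)
		current_heuristic = which;
	else if (current_heuristic == which)
		current_heuristic = DIFF_HEURISTIC_NONE;
}

/* Hypothetical config callback; returns 1 if the key was recognized. */
static int heuristic_config(const char *key, int value)
{
	if (!strcmp(key, "diff.compactionheuristic")) {
		set_heuristic(DIFF_HEURISTIC_COMPACTION, value);
		return 1;
	}
	if (!strcmp(key, "diff.indentheuristic")) {
		set_heuristic(DIFF_HEURISTIC_INDENT, value);
		return 1;
	}
	return 0;
}

/* Hypothetical option parser; returns 1 if the argument was recognized. */
static int heuristic_option(const char *arg)
{
	if (!strcmp(arg, "--compaction-heuristic")) {
		set_heuristic(DIFF_HEURISTIC_COMPACTION, 1);
		return 1;
	}
	if (!strcmp(arg, "--indent-heuristic")) {
		set_heuristic(DIFF_HEURISTIC_INDENT, 1);
		return 1;
	}
	return 0;
}
```

Because the configuration is read before the command line is parsed, calling
heuristic_config("diff.compactionheuristic", 1) and then
heuristic_option("--indent-heuristic") leaves the indent heuristic selected,
which is the behavior asked for above.
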
>> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
>> index 642cce1..ee3d812 100755
>> --- a/git-add--interactive.perl
>> +++ b/git-add--interactive.perl
>> @@ -45,6 +45,7 @@ my ($diff_new_color) =
>>  my $normal_color = $repo->get_color("", "reset");
>>
>>  my $diff_algorithm = $repo->config('diff.algorithm');
>> +my $diff_indent_heuristic = $repo->config_bool('diff.indentheuristic');
>>  my $diff_compaction_heuristic = $repo->config_bool('diff.compactionheuristic');
>
> Nice touch.
>
> Unfortunately the mutual-exclusivity handling will probably bleed over
> to here, too.
>
>> +/*
>> + * If a line is indented more than this, get_indent() just returns this value.
>> + * This avoids having to do absurd amounts of work for data that are not
>> + * human-readable text, and also ensures that the output of get_indent fits within
>> + * an int.
>> + */
>> +#define MAX_INDENT 200
>
> Speaking of absurd amounts of work, I was curious if there was a
> noticeable performance penalty for using this heuristic (just because
> it's a lot more complicated than the others). I couldn't detect any
> differences running "git log -p --no-merges -3000" on git.git with no
> heuristic, compaction, and indent. There may be other repositories that
> behave more pathologically (it looks like having 20 blank lines at the
> end of each hunk?), but I'd guess in most cases this will always be
> drowned out in the noise of doing the actual diff.
>
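
For readers without the patch at hand, the capping described in the quoted
comment can be sketched roughly as below. This is not the patch's
get_indent(): the signature, the 8-column tab stops, and returning -1 for a
whitespace-only line are assumptions made for illustration.

```c
#include <stddef.h>

#define MAX_INDENT 200

/*
 * Sketch: return the indentation of a line in columns, never more than
 * MAX_INDENT, or -1 if the line contains only whitespace.
 */
static int get_indent(const char *line, size_t len)
{
	int indent = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		char c = line[i];

		if (c == ' ')
			indent += 1;
		else if (c == '\t')
			indent += 8 - indent % 8; /* assumes 8-column tab stops */
		else if (c == '\n' || c == '\r')
			break; /* end of line reached with no visible character */
		else
			return indent; /* first visible character */

		/* Stop early: bounds the work per line and keeps the result small. */
		if (indent >= MAX_INDENT)
			return MAX_INDENT;
	}

	return -1; /* blank line */
}
```

The early return is what bounds the cost per line: at most MAX_INDENT
leading characters are inspected, however long the line is.
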
>> +#define START_OF_FILE_BONUS 9
>> +#define END_OF_FILE_BONUS 46
>> +#define TOTAL_BLANK_WEIGHT 4
>> +#define PRE_BLANK_WEIGHT 16
>> +#define RELATIVE_INDENT_BONUS -1
>> +#define RELATIVE_INDENT_HAS_BLANK_BONUS 15
>> +#define RELATIVE_OUTDENT_BONUS -19
>> +#define RELATIVE_OUTDENT_HAS_BLANK_BONUS 2
>> +#define RELATIVE_DEDENT_BONUS -63
>> +#define RELATIVE_DEDENT_HAS_BLANK_BONUS 50
>
> I see there is a comment below here mentioning that these are empirical
> voodoo, but it might be worth one at the top (or just moving these below
> the comment) because the comment looks like it's just associated with
> the function (and these are sufficiently bizarre that anybody reading is
> going to double-take on them).
>
>> +        return 10 * score - bonus;
>
> I don't mind this "10" not being a #define constant, but after
> reading the exchange between you and Stefan, I think it would be nice to
> describe what it is in a comment. The rest of the function is commented
> so nicely that this one left me thinking "huh?" upon seeing the "10".

After a night of sleep I agree with Peff's statement here: it's not about the
#define, it's about the comment (which the #define would have given in a
short, cryptic way, in angry capital letters).

I have just reread the scoring function, and I think you could pull the
`score = indent` assignment out to the top (it is always assigned, except
when indent < 0):

        if (indent == -1)
               score = 0;
        else
               score = indent;
        ... lots of bonus computation below, which in its current
        implementation repeats "score = indent;" in many branches.
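
Spelled out as a self-contained sketch (base_score() and combine_score() are
hypothetical names, and the bonus is passed in as a parameter only because
its computation is elided here; the factor 10 is copied from the quoted
return statement):

```c
/* Baseline score: the indent, with a blank line (indent == -1) counting as 0. */
static int base_score(int indent)
{
	return indent == -1 ? 0 : indent;
}

/*
 * The score is assigned exactly once, up front; the bonus logic would then
 * only ever adjust "bonus" and never touch "score" again.
 */
static int combine_score(int indent, int bonus)
{
	int score = base_score(indent);

	return 10 * score - bonus;
}
```

With this shape, an indent of 4 and a bonus of 15 gives 10 * 4 - 15 = 25,
and a blank line with the same bonus gives -15.
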

Thanks,
Stefan


