From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: "Michael Haggerty" <mhagger@alum.mit.edu>,
"git@vger.kernel.org" <git@vger.kernel.org>,
"Junio C Hamano" <gitster@pobox.com>,
"Jakub Narębski" <jnareb@gmail.com>,
"Jacob Keller" <jacob.keller@gmail.com>
Subject: Re: [PATCH 8/8] diff: improve positioning of add/delete blocks in diffs
Date: Thu, 4 Aug 2016 09:55:39 -0700
Message-ID: <CAGZ79kYCwoo6sYefu++KyCdgHfKvRsUp98ZAur+E7E4o_FLtEw@mail.gmail.com>
In-Reply-To: <20160804075631.jakbi5dbsbxsqcpr@sigill.intra.peff.net>
On Thu, Aug 4, 2016 at 12:56 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Aug 04, 2016 at 12:00:36AM +0200, Michael Haggerty wrote:
>
>> This table shows the number of diff slider groups that were positioned
>> differently than the human-generated values, for various repositories.
>> "default" is the default "git diff" algorithm. "compaction" is Git 2.9.0
>> with the `--compaction-heuristic` option "indent" is an earlier,
>
> s/option/&./
>
>> static int diff_detect_rename_default;
>> +static int diff_indent_heuristic; /* experimental */
>> static int diff_compaction_heuristic; /* experimental */
>
> These two flags are mutually exclusive in the xdiff code, so we should
> probably handle that here.
>
> TBH, I do not care that much what:
>
> [diff]
> compactionHeuristic = true
> indentHeuristic = true
>
> does. But right now:
>
> git config diff.compactionHeuristic true
> git show --indent-heuristic
>
> still prefers the compaction heuristic, which I think is objectively
> wrong.
>
> So perhaps we need a single variable:
>
> enum {
> DIFF_HEURISTIC_COMPACTION,
> DIFF_HEURISTIC_INDENT
> } diff_heuristic;
>
> and set it in last-one-wins fashion (it would be nice if the config and
> command line options were shaped the same way so it's clear to the user
> that they are exclusive, but we may have to keep --compaction-heuristic
> around for compatibility, as an alias for --diff-heuristic=compaction).
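
A rough sketch of what last-one-wins could look like with a single
variable. All names here (parse_heuristic_arg, resolve_heuristic,
DIFF_HEURISTIC_NONE) are hypothetical and only illustrate the idea; this
is not the actual diff.c code:

```c
#include <string.h>

/* All names below are hypothetical; this is not the actual diff.c code. */
enum diff_heuristic {
	DIFF_HEURISTIC_NONE,
	DIFF_HEURISTIC_COMPACTION,
	DIFF_HEURISTIC_INDENT
};

/*
 * Update *h if arg is one of the two heuristic options.
 * Returns 1 if arg was recognized, 0 otherwise.
 */
static int parse_heuristic_arg(const char *arg, enum diff_heuristic *h)
{
	if (!strcmp(arg, "--indent-heuristic"))
		*h = DIFF_HEURISTIC_INDENT;
	else if (!strcmp(arg, "--compaction-heuristic"))
		*h = DIFF_HEURISTIC_COMPACTION;
	else
		return 0;
	return 1;
}

/* Start from the configured value, then let each later option override it. */
static enum diff_heuristic resolve_heuristic(enum diff_heuristic from_config,
					     int argc, const char **argv)
{
	enum diff_heuristic h = from_config;
	int i;

	for (i = 0; i < argc; i++)
		parse_heuristic_arg(argv[i], &h);
	return h;
}
```

With diff.compactionHeuristic set in the config and --indent-heuristic on
the command line, this would end up with DIFF_HEURISTIC_INDENT, because
the command line is parsed after the config.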
>
>> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
>> index 642cce1..ee3d812 100755
>> --- a/git-add--interactive.perl
>> +++ b/git-add--interactive.perl
>> @@ -45,6 +45,7 @@ my ($diff_new_color) =
>> my $normal_color = $repo->get_color("", "reset");
>>
>> my $diff_algorithm = $repo->config('diff.algorithm');
>> +my $diff_indent_heuristic = $repo->config_bool('diff.indentheuristic');
>> my $diff_compaction_heuristic = $repo->config_bool('diff.compactionheuristic');
>
> Nice touch.
>
> Unfortunately the mutual-exclusivity handling will probably bleed over
> to here, too.
>
>> +/*
>> + * If a line is indented more than this, get_indent() just returns this value.
>> + * This avoids having to do absurd amounts of work for data that are not
>> + * human-readable text, and also ensures that the output of get_indent fits within
>> + * an int.
>> + */
>> +#define MAX_INDENT 200
>
> Speaking of absurd amounts of work, I was curious if there was a
> noticeable performance penalty for using this heuristic (just because
> it's a lot more complicated than the others). I couldn't detect any
> differences running "git log -p --no-merges -3000" on git.git with no
> heuristic, compaction, and indent. There may be other repositories that
> behave more pathologically (it looks like having 20 blank lines at the
> end of each hunk?), but I'd guess in most cases this will always be
> drowned out in the noise of doing the actual diff.
>
>> +#define START_OF_FILE_BONUS 9
>> +#define END_OF_FILE_BONUS 46
>> +#define TOTAL_BLANK_WEIGHT 4
>> +#define PRE_BLANK_WEIGHT 16
>> +#define RELATIVE_INDENT_BONUS -1
>> +#define RELATIVE_INDENT_HAS_BLANK_BONUS 15
>> +#define RELATIVE_OUTDENT_BONUS -19
>> +#define RELATIVE_OUTDENT_HAS_BLANK_BONUS 2
>> +#define RELATIVE_DEDENT_BONUS -63
>> +#define RELATIVE_DEDENT_HAS_BLANK_BONUS 50
>
> I see there is a comment below here mentioning that these are empirical
> voodoo, but it might be worth one at the top (or just moving these below
> the comment) because the comment looks like it's just associated with
> the function (and these are sufficiently bizarre that anybody reading is
> going to double-take on them).
>
>> + return 10 * score - bonus;
>
> I don't mind this not "10" not being a #define constant, but after
> reading the exchange between you and Stefan, I think it would be nice to
> describe what it is in a comment. The rest of the function is commented
> so nicely that this one left me thinking "huh?" upon seeing the "10".

After a night of sleep I agree with Peff's statement here: it's not about the
#define, it's about the comment (which the #define would have given only in a
short, cryptic way, in angry capital letters).

I have just reread the scoring function, and I think you could pull out the
`score = indent` assignment (it is always assigned, except for indent < 0):

    if (indent == -1)
        score = 0;
    else
        score = indent;

    ... lots of bonus computation below, which in its current
    implementation has lots of "score = indent;" lines as well.
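
To make that concrete, here is a minimal sketch of the shape I have in
mind. The helper names (base_score, combine_score) are made up, and the
bonus computation is not reproduced; it is simply passed in as a
parameter. Only the "indent == -1" case and the final "10 * score - bonus"
are taken from the patch:

```c
/*
 * Hypothetical helper: derive the score from the indent exactly once.
 * An indent of -1 is treated as a score of 0, as in the snippet above.
 */
static int base_score(int indent)
{
	return indent == -1 ? 0 : indent;
}

/*
 * Hypothetical helper: combine score and bonus the way the patch's
 * final return statement does. The real bonus computation is elided
 * and supplied by the caller in this sketch.
 */
static int combine_score(int indent, int bonus)
{
	int score = base_score(indent);

	return 10 * score - bonus;
}
```

That way the bonus branches would only have to adjust the bonus and would
not need to repeat the score assignment.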
Thanks,
Stefan