git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org
Cc: "Stefan Beller" <sbeller@google.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Jakub Narębski" <jnareb@gmail.com>,
	"Jacob Keller" <jacob.keller@gmail.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>
Subject: [PATCH 0/8] Better heuristics make prettier diffs
Date: Thu,  4 Aug 2016 00:00:28 +0200	[thread overview]
Message-ID: <cover.1470259583.git.mhagger@alum.mit.edu> (raw)

I've talked about this quite a bit on the list already. The idea is to
improve ugly diffs like

    @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
     }

     if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
    +if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {
                            $smtp_server = $_;

by feeding clues from the surrounding lines (namely their patterns of
indentation and blank lines) into a heuristic that more often produces
the diffs that users would rather see, like

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
            $initial_reply_to =~ s/(^\s+|\s+$)//g;
     }

    +if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
     if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {

See the last commit's log message for a very detailed explanation of
the heuristic, how it was optimized, and how you can get involved to
make sure that the heuristic also works well for your favorite
language.

When tested against a corpus of 2700 diffs that I optimized by hand,
this heuristic gets a "wrong" answer only about 1.7% as frequently as
the current default Git algorithm and only about 5.3% as often as `git
diff --compaction-heuristic`. (Though please don't treat these numbers
as final; I want to verify them again first.)

For now the new algorithm has to be enabled explicitly using either
`--indent-heuristic` or `git config diff.indentheuristic true`.

Michael Haggerty (8):
  xdl_change_compact(): rename some local variables for clarity
  xdl_change_compact(): clarify code
  xdl_change_compact(): rename i to end
  xdl_change_compact(): do one final shift or the other, not both
  xdl_change_compact(): fix compaction heuristic to adjust io
  xdl_change_compact(): keep track of the earliest end
  is_blank_line: take a single xrecord_t as argument
  diff: improve positioning of add/delete blocks in diffs

 Documentation/diff-options.txt |   6 +-
 diff.c                         |  11 +
 git-add--interactive.perl      |   5 +-
 xdiff/xdiff.h                  |   1 +
 xdiff/xdiffi.c                 | 458 ++++++++++++++++++++++++++++++++++-------
 5 files changed, 409 insertions(+), 72 deletions(-)

-- 
2.8.1


             reply	other threads:[~2016-08-03 22:08 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-03 22:00 Michael Haggerty [this message]
2016-08-03 22:00 ` [PATCH 1/8] xdl_change_compact(): rename some local variables for clarity Michael Haggerty
2016-08-04  7:06   ` Jeff King
2016-08-04 18:24     ` Junio C Hamano
2016-08-13 19:38     ` Michael Haggerty
2016-08-14 12:26       ` Jeff King
2016-08-03 22:00 ` [PATCH 2/8] xdl_change_compact(): clarify code Michael Haggerty
2016-08-03 22:11   ` Stefan Beller
2016-08-03 23:14     ` Michael Haggerty
2016-08-03 23:50       ` Stefan Beller
2016-08-04  7:13         ` Jeff King
2016-08-10 16:39         ` Michael Haggerty
2016-08-10 16:58           ` Stefan Beller
2016-08-03 22:00 ` [PATCH 3/8] xdl_change_compact(): rename i to end Michael Haggerty
2016-08-04  7:16   ` Jeff King
2016-08-03 22:00 ` [PATCH 4/8] xdl_change_compact(): do one final shift or the other, not both Michael Haggerty
2016-08-03 22:00 ` [PATCH 5/8] xdl_change_compact(): fix compaction heuristic to adjust io Michael Haggerty
2016-08-04  7:27   ` Jeff King
2016-08-10 16:58     ` Michael Haggerty
2016-08-10 17:09       ` Michael Haggerty
2016-08-11  4:16       ` Jeff King
2016-08-04 18:43   ` Junio C Hamano
2016-08-10 17:13     ` Michael Haggerty
2016-08-03 22:00 ` [PATCH 6/8] xdl_change_compact(): keep track of the earliest end Michael Haggerty
2016-08-04 18:46   ` Junio C Hamano
2016-08-10 17:16     ` Michael Haggerty
2016-08-03 22:00 ` [PATCH 7/8] is_blank_line: take a single xrecord_t as argument Michael Haggerty
2016-08-04 18:48   ` Junio C Hamano
2016-08-03 22:00 ` [PATCH 8/8] diff: improve positioning of add/delete blocks in diffs Michael Haggerty
2016-08-03 22:29   ` Jacob Keller
2016-08-03 22:36     ` Michael Haggerty
2016-08-04  4:47       ` Jacob Keller
2016-08-04 19:39       ` Junio C Hamano
2016-08-10 19:01         ` Michael Haggerty
2016-08-10 21:28           ` Junio C Hamano
2016-08-03 22:30   ` Stefan Beller
2016-08-03 22:41     ` Michael Haggerty
2016-08-03 22:51       ` Stefan Beller
2016-08-03 23:30         ` Michael Haggerty
2016-08-04  0:04           ` Stefan Beller
2016-08-10 19:12             ` Michael Haggerty
2016-08-04  7:56   ` Jeff King
2016-08-04 16:55     ` Stefan Beller
2016-08-04 19:47       ` Junio C Hamano
2016-08-13  0:09       ` Michael Haggerty
2016-08-12 23:25     ` Michael Haggerty
2016-08-13  8:59       ` Jeff King
2016-08-13 15:59         ` Junio C Hamano
2016-08-14  7:21           ` Jacob Keller
2016-08-15  6:33         ` Stefan Beller
2016-08-15 20:24           ` Junio C Hamano
2016-08-04 19:52   ` Junio C Hamano
2016-08-13  0:11     ` Michael Haggerty
2016-08-03 22:08 ` [PATCH 0/8] Better heuristics make prettier diffs Michael Haggerty
2016-08-04  7:38 ` Jeff King
2016-08-04 19:54   ` Junio C Hamano
2016-08-04 20:01     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1470259583.git.mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jnareb@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).