git@vger.kernel.org mailing list mirror (one of many)
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Morten Welinder <mwelinder@gmail.com>, Kevin <ikke@ikke.info>,
	git <git@vger.kernel.org>
Subject: Re: Fwd: possible Improving diff algoritm
Date: Thu, 13 Dec 2012 01:00:08 +0100
Message-ID: <50C91A88.1090306@alum.mit.edu>
In-Reply-To: <7vpq2f2az4.fsf@alter.siamese.dyndns.org>

On 12/12/2012 10:53 PM, Junio C Hamano wrote:
> Morten Welinder <mwelinder@gmail.com> writes:
> 
>> Is there a reason why picking among the choices in a sliding window
>> must be contents neutral?
> 
> Sorry, you might be getting at something interesting but I do not
> understand the question.  I have no idea what you mean by "contents
> neutral".
> 
> Picking between these two choices
> 
>          /**                         +    /**                         
>     +     * Default parent           +     * Default parent           
>     +     *                          +     *                          
>     +     * @var int                 +     * @var int                 
>     +     * @access protected        +     * @access protected        
>     +     * @index                   +     * @index                   
>     +     */                         +     */                         
>     +    protected $defaultParent;   +    protected $defaultParent;   
>     +                                +                                
>     +    /**                              /**                         
> 
> would not affect the correctness of the patch.  You may pick
> whatever you deem the most desirable, but your answer must be a
> correct patch (the definition of "correct" here is "applying that
> patch to the preimage produces the intended postimage").
> 
> And I think if you inserted a block of text B after a context C
> where the tail of B matches the tail of C like the above, you can
> shift what you treat as "inserted" up and still come up with a
> correct patch.
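
To make the sliding concrete, here is a minimal sketch in Python.  It
is not how git does it; the function names (equivalent_insertions,
remove_block) and the sample file are made up for illustration.  Given
the postimage and one candidate inserted range, it lists every range
that yields the same preimage when removed, i.e. every placement that
is still a correct patch:

```python
def remove_block(post, start, end):
    """Undo an insertion: drop post[start:end] to recover the preimage."""
    return post[:start] + post[end:]


def equivalent_insertions(post, start, end):
    """Return every (start, end) range whose removal gives the same preimage.

    Hypothetical helper.  The block post[start:end] may slide up by one
    line when the line just above it equals its last line, and down by
    one line when its first line equals the line just below it.
    """
    positions = [(start, end)]
    s, e = start, end
    while s > 0 and post[s - 1] == post[e - 1]:   # slide up
        s, e = s - 1, e - 1
        positions.insert(0, (s, e))
    s, e = start, end
    while e < len(post) and post[s] == post[e]:   # slide down
        s, e = s + 1, e + 1
        positions.append((s, e))
    return positions


# Made-up postimage modelled on the example quoted above.
post = [
    "class A {",
    "    /**",
    "     * Default parent",
    "     */",
    "    protected $defaultParent;",
    "",
    "    /**",
    "     * Other",
    "     */",
    "    protected $other;",
    "}",
]

# Start from the "ugly" placement (block ends with the next "/**").
for s, e in equivalent_insertions(post, 2, 7):
    print((s, e), remove_block(post, s, e) == remove_block(post, 2, 7))
# prints:
# (1, 6) True
# (2, 7) True
```

Both placements remove to the same preimage, so choosing between them
is purely a matter of taste.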

I have the feeling that a few crude heuristics would go a long way
towards improving diffs like this.  For example:

* Prefer to have an add/remove block that has balanced begin/end pairs
(where begin/end pairs might be opening and closing parentheses,
brackets, braces, and angle brackets, "/*" and "*/", and perhaps a
couple of other things).  For SGML-like text, begin and end tags could
be matched up.

It would be possible to read these begin/end pairs from a
filetype-specific table or configuration setting, though this would add
complication and would also make it possible that diffs generated by two
different people are not identical if their configurations differ.

* Prefer to have a block where the first non-blank line of the block and
the first non-blank line after the block are indented by the same amount.

* Prefer to have a block with trailing (as opposed to leading or
embedded) blank lines--the more the better.

The beautiful thing is that even if the heuristics sometimes fail, the
correctness of the patch (in the sense that you have defined) is not
compromised.
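
A rough sketch of how the three heuristics above might be scored, in
Python.  Everything here is an assumption made for illustration: the
names (is_balanced, score, best_block), the weights, and the fixed
table of begin/end pairs.  The candidate ranges are taken as given;
each one is assumed to already be a correct placement of the inserted
block within the postimage.

```python
import re

# Assumed pair table; a real one might be filetype-specific.
CLOSER = {"(": ")", "[": "]", "{": "}", "/*": "*/"}
TOKEN = re.compile(r"/\*|\*/|[()\[\]{}]")


def is_balanced(lines):
    """Crude check that every opener in the block is closed, in order."""
    stack = []
    for tok in TOKEN.findall("\n".join(lines)):
        if tok in CLOSER:
            stack.append(CLOSER[tok])
        elif not stack or stack.pop() != tok:
            return False
    return not stack


def indent(line):
    return len(line) - len(line.lstrip())


def first_nonblank(lines):
    return next((l for l in lines if l.strip()), None)


def score(post, start, end):
    """Higher is better.  The weights are arbitrary guesses."""
    block = post[start:end]
    total = 0
    if is_balanced(block):                        # heuristic 1
        total += 2
    inside, after = first_nonblank(block), first_nonblank(post[end:])
    if inside is not None and after is not None \
            and indent(inside) == indent(after):  # heuristic 2
        total += 1
    trailing = 0                                  # heuristic 3
    for line in reversed(block):
        if line.strip():
            break
        trailing += 1
    blanks = sum(1 for l in block if not l.strip())
    total += trailing - (blanks - trailing)       # reward trailing blanks only
    return total


def best_block(post, candidates):
    return max(candidates, key=lambda r: score(post, *r))


# Made-up postimage; lines 1..5 or 2..6 can both be called "inserted".
post = [
    "class A {",
    "    /**",
    "     * Default parent",
    "     */",
    "    protected $defaultParent;",
    "",
    "    /**",
    "     * Other",
    "     */",
    "    protected $other;",
    "}",
]
print(best_block(post, [(1, 6), (2, 7)]))  # prints (1, 6)
```

With these guessed weights the block that starts at "/**" and ends
with the blank line scores 4, and the shifted one scores 0, so the
former is chosen.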

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
