git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [BUG-ish] diff compaction heuristic false positive
@ 2016-06-10  7:50 Jeff King
  2016-06-10  8:31 ` Jeff King
  2016-06-10  8:31 ` [BUG-ish] diff compaction heuristic false positive Michael Haggerty
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff King @ 2016-06-10  7:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Jacob Keller, git

I found a false positive with the new compaction heuristic in v2.9:

-- >8 --

# start with a simple file
cat >file.rb <<\EOF
def foo
  do_foo_stuff()

  common_ending()
end
EOF

git add file.rb
git commit -m base

# and then add another function with a similar ending
cat >>file.rb <<\EOF

def bar
  do_bar_stuff()

  common_ending()
end
EOF

-- 8< --

I get this rather unfortunate diff:

    $ git diff
    diff --git a/file.rb b/file.rb
    index bd9d1cb..67fbeba 100644
    --- a/file.rb
    +++ b/file.rb
    @@ -1,5 +1,11 @@
     def foo
       do_foo_stuff()
     
    +  common_ending()
    +end
    +
    +def bar
    +  do_bar_stuff()
    +
       common_ending()
     end

but without the compaction heuristic (or with an older git), I get:

    $ git -c diff.compactionHeuristic=false diff
    diff --git a/file.rb b/file.rb
    index bd9d1cb..67fbeba 100644
    --- a/file.rb
    +++ b/file.rb
    @@ -3,3 +3,9 @@ def foo
     
       common_ending()
     end
    +
    +def bar
    +  do_bar_stuff()
    +
    +  common_ending()
    +end

:( The problem is that the common bits are separated from the
interesting bits by a blank line. This is simplified from a real-world
example, but I think you could come up with the same example in C, like:

  if (foo) {
        do_foo();

        something_else();
  }

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10  7:50 [BUG-ish] diff compaction heuristic false positive Jeff King
@ 2016-06-10  8:31 ` Jeff King
  2016-06-10 15:56   ` Junio C Hamano
  2016-06-10  8:31 ` [BUG-ish] diff compaction heuristic false positive Michael Haggerty
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff King @ 2016-06-10  8:31 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Jacob Keller, git

On Fri, Jun 10, 2016 at 03:50:43AM -0400, Jeff King wrote:

> I found a false positive with the new compaction heuristic in v2.9:
> [...]

And by the way, this is less "hey neat, I found a case" and more "wow,
this is a lot worse than I thought".

I diffed the old and new output for the top 10,000 commits in this
particular ruby code base. There were 45 commits with changed diffs.
Spot-checking them manually, a little over 1/3 of them featured this bad
pattern. The others looked like strict improvements.

That's a lot worse than the outcomes we saw on other code bases earlier.
1/3 bad is still a net improvement, so I dunno. Is this worth worrying
about? Should we bring back the documentation for the knob to disable
it? Should we consider making it tunable via gitattributes?

I don't think that last one really helps; the good cases _and_ the bad
ones are both in ruby code (though certainly the C code we looked at
earlier was all good).

It may also be possible to make it Just Work by using extra information
like indentation. I haven't thought hard enough about that to say.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10  7:50 [BUG-ish] diff compaction heuristic false positive Jeff King
  2016-06-10  8:31 ` Jeff King
@ 2016-06-10  8:31 ` Michael Haggerty
  2016-06-10  8:41   ` Jeff King
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Haggerty @ 2016-06-10  8:31 UTC (permalink / raw)
  To: Jeff King, Stefan Beller; +Cc: Junio C Hamano, Jacob Keller, git

On 06/10/2016 09:50 AM, Jeff King wrote:
> I found a false positive with the new compaction heuristic in v2.9:
> [...]
> I get this rather unfortunate diff:
> 
>     $ git diff
>     diff --git a/file.rb b/file.rb
>     index bd9d1cb..67fbeba 100644
>     --- a/file.rb
>     +++ b/file.rb
>     @@ -1,5 +1,11 @@
>      def foo
>        do_foo_stuff()
>      
>     +  common_ending()
>     +end
>     +
>     +def bar
>     +  do_bar_stuff()
>     +
>        common_ending()
>      end

I've often thought that indentation would be a good, fairly universal
signal for diff to use when deciding how to slide hunks around. Most
source code is indented in a way that shows its structure.

I propose the following heuristic:

* Prefer to start and end hunks following lines with the least
  indentation.

* Define the "indentation" of a blank line to be the indentation of
  the previous non-blank line minus epsilon.

* In the case of a tie, prefer to slide the hunk down as far as
  possible.

For the case above, the indentations for the candidate "before-the-hunk"
lines and the resulting hunk would be

>      def foo
> 2      do_foo_stuff()
> 2-ε
> 2      common_ending()
> 0    end
> 0-ε +
>     +def bar
>     +  do_bar_stuff()
>     +
>     +  common_ending()
>     +end

I haven't tried testing this heuristic systematically but I have the
feeling that it would be pretty effective and yet quite easy to implement.

Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10  8:31 ` [BUG-ish] diff compaction heuristic false positive Michael Haggerty
@ 2016-06-10  8:41   ` Jeff King
  2016-06-10 11:00     ` Michael Haggerty
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2016-06-10  8:41 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Stefan Beller, Junio C Hamano, Jacob Keller, git

On Fri, Jun 10, 2016 at 10:31:13AM +0200, Michael Haggerty wrote:

> I've often thought that indentation would be a good, fairly universal
> signal for diff to use when deciding how to slide hunks around. Most
> source code is indented in a way that shows its structure.
> 
> I propose the following heuristic:
> 
> * Prefer to start and end hunks following lines with the least
>   indentation.
> 
> * Define the "indentation" of a blank line to be the indentation of
>   the previous non-blank line minus epsilon.
> 
> * In the case of a tie, prefer to slide the hunk down as far as
>   possible.

Hmm. That might help this case, but the original motivation for this
heuristic was something like:

  ##
  # foo
  def foo
    something
  end

  ##
  # bar
  def bar
    something_else
  end

where we add the first function above the second. We end up with:

diff --git a/file.rb b/file.rb
index 1f9b151..f991c76 100644
--- a/file.rb
+++ b/file.rb
@@ -1,4 +1,10 @@
 ##
+# foo
+def foo
+  something
+end
+
+##
 # bar
 def bar
   something else

I.e., crediting the "##" to the wrong spot (or in C, the "/*"). I don't
think indentation helps us there (sliding-up would, but like
sliding-down, it just depends on the order of the hunks).

So I agree that adding indentation to the mix might help, but I don't
think it can replace this heuristic.

-Peff

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10  8:41   ` Jeff King
@ 2016-06-10 11:00     ` Michael Haggerty
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Haggerty @ 2016-06-10 11:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Junio C Hamano, Jacob Keller, git

On 06/10/2016 10:41 AM, Jeff King wrote:
> On Fri, Jun 10, 2016 at 10:31:13AM +0200, Michael Haggerty wrote:
> 
>> I've often thought that indentation would be a good, fairly universal
>> signal for diff to use when deciding how to slide hunks around. Most
>> source code is indented in a way that shows its structure.
>>
>> I propose the following heuristic:
>>
>> * Prefer to start and end hunks following lines with the least
>>   indentation.
>>
>> * Define the "indentation" of a blank line to be the indentation of
>>   the previous non-blank line minus epsilon.
>>
>> * In the case of a tie, prefer to slide the hunk down as far as
>>   possible.
> 
> Hmm. That might help this case, but the original motivation for this
> heuristic was something like:
> 
>   ##
>   # foo
>   def foo
>     something
>   end
> 
>   ##
>   # bar
>   def bar
>     something_else
>   end
> 
> where we add the first function above the second. We end up with:
> 
> diff --git a/file.rb b/file.rb
> index 1f9b151..f991c76 100644
> --- a/file.rb
> +++ b/file.rb
> @@ -1,4 +1,10 @@
>  ##
> +# foo
> +def foo
> +  something
> +end
> +
> +##
>  # bar
>  def bar
>    something else
> 
> I.e., crediting the "##" to the wrong spot (or in C, the "/*"). I don't
> think indentation helps us there (sliding-up would, but like
> sliding-down, it just depends on the order of the hunks).
> 
> So I agree that adding indentation to the mix might help, but I don't
> think it can replace this heuristic.

Ummm, I think the indentation heuristic would work with that example,
too, as long as we consider there to be an imaginary line "0" of the
file (i.e., preceding the first real line) that has an indentation of -ε.

Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10  8:31 ` Jeff King
@ 2016-06-10 15:56   ` Junio C Hamano
  2016-06-10 16:25     ` Stefan Beller
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2016-06-10 15:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Jacob Keller, git

Jeff King <peff@peff.net> writes:

> On Fri, Jun 10, 2016 at 03:50:43AM -0400, Jeff King wrote:
>
>> I found a false positive with the new compaction heuristic in v2.9:
>> [...]
>
> And by the way, this is less "hey neat, I found a case" and more "wow,
> this is a lot worse than I thought".
>
> I diffed the old and new output for the top 10,000 commits in this
> particular ruby code base. There were 45 commits with changed diffs.
> Spot-checking them manually, a little over 1/3 of them featured this bad
> pattern. The others looked like strict improvements.
>
> That's a lot worse than the outcomes we saw on other code bases earlier.
> 1/3 bad is still a net improvement, so I dunno. Is this worth worrying
> about? Should we bring back the documentation for the knob to disable
> it? Should we consider making it tunable via gitattributes?
>
> I don't think that last one really helps; the good cases _and_ the bad
> ones are both in ruby code (though certainly the C code we looked at
> earlier was all good).
>
> It may also be possible to make it Just Work by using extra information
> like indentation. I haven't thought hard enough about that to say.
>
> -Peff

I recall saying "we'd end up being better in some and worse in
others" at the very beginning.  How about toggling the default back
for the upcoming release, keeping the experimentation knob in the
code, and try different heuristics like the "indentation" during the
next cycle?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10 15:56   ` Junio C Hamano
@ 2016-06-10 16:25     ` Stefan Beller
  2016-06-10 16:29       ` Jacob Keller
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Beller @ 2016-06-10 16:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jacob Keller, git@vger.kernel.org

On Fri, Jun 10, 2016 at 8:56 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Fri, Jun 10, 2016 at 03:50:43AM -0400, Jeff King wrote:
>>
>>> I found a false positive with the new compaction heuristic in v2.9:
>>> [...]
>>
>> And by the way, this is less "hey neat, I found a case" and more "wow,
>> this is a lot worse than I thought".
>>
>> I diffed the old and new output for the top 10,000 commits in this
>> particular ruby code base. There were 45 commits with changed diffs.
>> Spot-checking them manually, a little over 1/3 of them featured this bad
>> pattern. The others looked like strict improvements.
>>
>> That's a lot worse than the outcomes we saw on other code bases earlier.
>> 1/3 bad is still a net improvement, so I dunno. Is this worth worrying
>> about? Should we bring back the documentation for the knob to disable
>> it? Should we consider making it tunable via gitattributes?
>>
>> I don't think that last one really helps; the good cases _and_ the bad
>> ones are both in ruby code (though certainly the C code we looked at
>> earlier was all good).
>>
>> It may also be possible to make it Just Work by using extra information
>> like indentation. I haven't thought hard enough about that to say.
>>
>> -Peff
>
> I recall saying "we'd end up being better in some and worse in
> others" at the very beginning.  How about toggling the default back
> for the upcoming release, keeping the experimentation knob in the
> code, and try different heuristics like the "indentation" during the
> next cycle?

Sure. I thought about for a while now and by now I agree with Junio.
No matter what kind of heuristic we can come up with it is easy to construct
a counter example.

That said, let's try the indentation thing, though I suspect
one of the early motivating examples (an excerpt from a  kernel config file)
would not do well with it, as it had not an indentation scheme as programming
languages do.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [BUG-ish] diff compaction heuristic false positive
  2016-06-10 16:25     ` Stefan Beller
@ 2016-06-10 16:29       ` Jacob Keller
  2016-06-10 18:13         ` Re* " Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Jacob Keller @ 2016-06-10 16:29 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Jeff King, git@vger.kernel.org

On Fri, Jun 10, 2016 at 9:25 AM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Jun 10, 2016 at 8:56 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Jeff King <peff@peff.net> writes:
>>
>>> On Fri, Jun 10, 2016 at 03:50:43AM -0400, Jeff King wrote:
>>>
>>>> I found a false positive with the new compaction heuristic in v2.9:
>>>> [...]
>>>
>>> And by the way, this is less "hey neat, I found a case" and more "wow,
>>> this is a lot worse than I thought".
>>>
>>> I diffed the old and new output for the top 10,000 commits in this
>>> particular ruby code base. There were 45 commits with changed diffs.
>>> Spot-checking them manually, a little over 1/3 of them featured this bad
>>> pattern. The others looked like strict improvements.
>>>
>>> That's a lot worse than the outcomes we saw on other code bases earlier.
>>> 1/3 bad is still a net improvement, so I dunno. Is this worth worrying
>>> about? Should we bring back the documentation for the knob to disable
>>> it? Should we consider making it tunable via gitattributes?
>>>
>>> I don't think that last one really helps; the good cases _and_ the bad
>>> ones are both in ruby code (though certainly the C code we looked at
>>> earlier was all good).
>>>
>>> It may also be possible to make it Just Work by using extra information
>>> like indentation. I haven't thought hard enough about that to say.
>>>
>>> -Peff
>>
>> I recall saying "we'd end up being better in some and worse in
>> others" at the very beginning.  How about toggling the default back
>> for the upcoming release, keeping the experimentation knob in the
>> code, and try different heuristics like the "indentation" during the
>> next cycle?
>
> Sure. I thought about for a while now and by now I agree with Junio.
> No matter what kind of heuristic we can come up with it is easy to construct
> a counter example.
>
> That said, let's try the indentation thing, though I suspect
> one of the early motivating examples (an excerpt from a  kernel config file)
> would not do well with it, as it had not an indentation scheme as programming
> languages do.
>
> Thanks,
> Stefan

I think we could use the indentation trick and it might help in this
case. I agree, let's disable this for this cycle and experiment in the
next one. Good catch, Peff.

As others have said you will always be able to produce counter
examples, that's the nature of heuristics. The idea is to see if we
can come up with something simple that mostly improves the output,
even if sometimes it might have a negative impact on the outputs. But
I think we should avoid changing behavior unless it's mostly an
improvement.

Regards,
Jake

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re* [BUG-ish] diff compaction heuristic false positive
  2016-06-10 16:29       ` Jacob Keller
@ 2016-06-10 18:13         ` Junio C Hamano
  2016-06-10 18:21           ` Stefan Beller
  2016-06-10 20:30           ` Jeff King
  0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2016-06-10 18:13 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Stefan Beller, Jeff King, git@vger.kernel.org

Jacob Keller <jacob.keller@gmail.com> writes:

> I think we could use the indentation trick and it might help in this
> case. I agree, let's disable this for this cycle and experiment in the
> next one. Good catch, Peff.
>
> As others have said you will always be able to produce counter
> examples, that's the nature of heuristics. The idea is to see if we
> can come up with something simple that mostly improves the output,
> even if sometimes it might have a negative impact on the outputs. But
> I think we should avoid changing behavior unless it's mostly an
> improvement.

OK, let's do this then for the upcoming release for now.  I am
tempted to flip it back on after the release in 'next', so that
developers would be exposed to the heuristic by default, though.

-- >8 --
Subject: [PATCH] diff: disable compaction heuristic for now

http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net
reports that a change to add a new "function" with common ending
with the existing one at the end of the file is shown like this:

    def foo
      do_foo_stuff()

   +  common_ending()
   +end
   +
   +def bar
   +  do_bar_stuff()
   +
      common_ending()
    end

when the new heuristic is in use.  In reality, the change is to add
the blank line before "def bar" and everything below, which is what
the code without the new heuristic shows.

Disable the heuristics by default and leave it as an experimental
feature for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 05ca3ce..9116d9d 100644
--- a/diff.c
+++ b/diff.c
@@ -25,7 +25,7 @@
 #endif
 
 static int diff_detect_rename_default;
-static int diff_compaction_heuristic = 1;
+static int diff_compaction_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
-- 
2.9.0-rc2-275-g493bdbb

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Re* [BUG-ish] diff compaction heuristic false positive
  2016-06-10 18:13         ` Re* " Junio C Hamano
@ 2016-06-10 18:21           ` Stefan Beller
  2016-06-10 20:30           ` Jeff King
  1 sibling, 0 replies; 16+ messages in thread
From: Stefan Beller @ 2016-06-10 18:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jacob Keller, Jeff King, git@vger.kernel.org

On Fri, Jun 10, 2016 at 11:13 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Subject: [PATCH] diff: disable compaction heuristic for now
>
> http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net
> reports that a change to add a new "function" with common ending
> with the existing one at the end of the file is shown like this:
>
>     def foo
>       do_foo_stuff()
>
>    +  common_ending()
>    +end
>    +
>    +def bar
>    +  do_bar_stuff()
>    +
>       common_ending()
>     end
>
> when the new heuristic is in use.  In reality, the change is to add
> the blank line before "def bar" and everything below, which is what
> the code without the new heuristic shows.
>
> Disable the heuristics by default and leave it as an experimental
> feature for now.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Re* [BUG-ish] diff compaction heuristic false positive
  2016-06-10 18:13         ` Re* " Junio C Hamano
  2016-06-10 18:21           ` Stefan Beller
@ 2016-06-10 20:30           ` Jeff King
  2016-06-10 20:48             ` [PATCH v2] diff: disable compaction heuristic for now Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff King @ 2016-06-10 20:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

On Fri, Jun 10, 2016 at 11:13:10AM -0700, Junio C Hamano wrote:

> Jacob Keller <jacob.keller@gmail.com> writes:
> 
> > I think we could use the indentation trick and it might help in this
> > case. I agree, let's disable this for this cycle and experiment in the
> > next one. Good catch, Peff.
> >
> > As others have said you will always be able to produce counter
> > examples, that's the nature of heuristics. The idea is to see if we
> > can come up with something simple that mostly improves the output,
> > even if sometimes it might have a negative impact on the outputs. But
> > I think we should avoid changing behavior unless it's mostly an
> > improvement.
> 
> OK, let's do this then for the upcoming release for now.  I am
> tempted to flip it back on after the release in 'next', so that
> developers would be exposed to the heuristic by default, though.

Thanks for the patch, and I agree flipping it back on in "next" is a
good idea.

It may be that I am making a fuss over nothing. As you say, we always
knew that it might have false positives. Mostly I was just surprised how
frequent they were in this example (I also wondered why the same pattern
did not trigger in the C code we looked at).

> -- >8 --
> Subject: [PATCH] diff: disable compaction heuristic for now

Looks good.

We probably want a patch to the release notes to note that it's not on
by default. And we may want to advertise the experimental knob so
that people actually try it (otherwise we won't get feedback from the
masses).

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] diff: disable compaction heuristic for now
  2016-06-10 20:30           ` Jeff King
@ 2016-06-10 20:48             ` Junio C Hamano
  2016-06-10 20:53               ` Jeff King
  2016-06-10 20:55               ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2016-06-10 20:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net
reports that a change to add a new "function" with common ending
with the existing one at the end of the file is shown like this:

    def foo
      do_foo_stuff()

   +  common_ending()
   +end
   +
   +def bar
   +  do_bar_stuff()
   +
      common_ending()
    end

when the new heuristic is in use.  In reality, the change is to add
the blank line before "def bar" and everything below, which is what
the code without the new heuristic shows.

Disable the heuristics by default, and resurrect the documentation
for the option and the configuration variables, while clearly
marking the feature as still experimental.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

    Jeff King <peff@peff.net> writes:

    > We probably want a patch to the release notes to note that it's not on
    > by default. And we may want to advertise the experimental knob so
    > that people actually try it (otherwise we won't get feedback from the
    > masses).

    OK, I think that is sensible.  The interdiff is not a strict
    reversion of 77085a61 (diff: undocument the compaction heuristic
    knobs for experimentation, 2016-05-02) but stresses that the
    feature is off by default and is experimental.

 Documentation/diff-config.txt  | 5 +++++
 Documentation/diff-options.txt | 7 +++++++
 diff.c                         | 2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 6eaa452..6fb70c5 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -166,6 +166,11 @@ diff.tool::
 
 include::mergetools-diff.txt[]
 
+diff.compactionHeuristic::
+	Set this option to `true` to enable an experimental heuristic that
+	shifts the hunk boundary in an attempt to make the resulting
+	patch easier to read.
+
 diff.algorithm::
 	Choose a diff algorithm.  The variants are as follows:
 +
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 3ad6404..9baf1ad 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -63,6 +63,13 @@ ifndef::git-format-patch[]
 	Synonym for `-p --raw`.
 endif::git-format-patch[]
 
+--compaction-heuristic::
+--no-compaction-heuristic::
+	These are to help debugging and tuning an experimental
+	heuristic (which is off by default) that shifts the hunk
+	boundary in an attempt to make the resulting patch easier
+	to read.
+
 --minimal::
 	Spend extra time to make sure the smallest possible
 	diff is produced.
diff --git a/diff.c b/diff.c
index 05ca3ce..9116d9d 100644
--- a/diff.c
+++ b/diff.c
@@ -25,7 +25,7 @@
 #endif
 
 static int diff_detect_rename_default;
-static int diff_compaction_heuristic = 1;
+static int diff_compaction_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
-- 
2.9.0-rc2-285-ge226c12

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] diff: disable compaction heuristic for now
  2016-06-10 20:48             ` [PATCH v2] diff: disable compaction heuristic for now Junio C Hamano
@ 2016-06-10 20:53               ` Jeff King
  2016-06-10 20:55               ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff King @ 2016-06-10 20:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

On Fri, Jun 10, 2016 at 01:48:58PM -0700, Junio C Hamano wrote:

>     Jeff King <peff@peff.net> writes:
> 
>     > We probably want a patch to the release notes to note that it's not on
>     > by default. And we may want to advertise the experimental knob so
>     > that people actually try it (otherwise we won't get feedback from the
>     > masses).
> 
>     OK, I think that is sensible.  The interdiff is not a strict
>     reversion of 77085a61 (diff: undocument the compaction heuristic
>     knobs for experimentation, 2016-05-02) but stresses that the
>     feature is off by default and is experimental.

Looks good to me. Do we want to include in "experimental" that the
option and command-line flag might go away in the future? I think
probably not. I do not mind supporting this forever (it _has_ shown its
usefulness, so I don't think it is any worse to maintain going forward
than things like --patience, which people can tweak if their diff looks
uglier than they would like).

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] diff: disable compaction heuristic for now
  2016-06-10 20:48             ` [PATCH v2] diff: disable compaction heuristic for now Junio C Hamano
  2016-06-10 20:53               ` Jeff King
@ 2016-06-10 20:55               ` Junio C Hamano
  2016-06-10 21:05                 ` Jeff King
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2016-06-10 20:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

Junio C Hamano <gitster@pobox.com> writes:

>     Jeff King <peff@peff.net> writes:
>
>     > We probably want a patch to the release notes to note that it's not on
>     > by default. And we may want to advertise the experimental knob so
>     > that people actually try it (otherwise we won't get feedback from the
>     > masses).
>
>     OK, I think that is sensible.  The interdiff is not a strict
>     reversion of 77085a61 (diff: undocument the compaction heuristic
>     knobs for experimentation, 2016-05-02) but stresses that the
>     feature is off by default and is experimental.

This is for 'master' when the topic is merged (will keep it in stash
for now).

 Documentation/RelNotes/2.9.0.txt | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/RelNotes/2.9.0.txt b/Documentation/RelNotes/2.9.0.txt
index 927cb9b..79b46e1 100644
--- a/Documentation/RelNotes/2.9.0.txt
+++ b/Documentation/RelNotes/2.9.0.txt
@@ -118,9 +118,11 @@ UI, Workflows & Features
  * HTTP transport clients learned to throw extra HTTP headers at the
    server, specified via http.extraHeader configuration variable.
 
- * Patch output from "git diff" and friends has been tweaked to be
-   more readable by using a blank line as a strong hint that the
-   contents before and after it belong to logically separate units.
+ * The "--compaction-heuristic" option to "git diff" family of
+   commands enables a heuristic to make the patch output more readable
+   by using a blank line as a strong hint that the contents before and
+   after it belong to logically separate units.  It is still
+   experimental.
 
  * A new configuration variable core.hooksPath allows customizing
    where the hook directory is.

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] diff: disable compaction heuristic for now
  2016-06-10 20:55               ` Junio C Hamano
@ 2016-06-10 21:05                 ` Jeff King
  2016-06-10 21:46                   ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2016-06-10 21:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

On Fri, Jun 10, 2016 at 01:55:01PM -0700, Junio C Hamano wrote:

> This is for 'master' when the topic is merged (will keep it in stash
> for now).
> 
>  Documentation/RelNotes/2.9.0.txt | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/RelNotes/2.9.0.txt b/Documentation/RelNotes/2.9.0.txt
> index 927cb9b..79b46e1 100644
> --- a/Documentation/RelNotes/2.9.0.txt
> +++ b/Documentation/RelNotes/2.9.0.txt
> @@ -118,9 +118,11 @@ UI, Workflows & Features
>   * HTTP transport clients learned to throw extra HTTP headers at the
>     server, specified via http.extraHeader configuration variable.
>  
> - * Patch output from "git diff" and friends has been tweaked to be
> -   more readable by using a blank line as a strong hint that the
> -   contents before and after it belong to logically separate units.
> + * The "--compaction-heuristic" option to "git diff" family of
> +   commands enables a heuristic to make the patch output more readable
> +   by using a blank line as a strong hint that the contents before and
> +   after it belong to logically separate units.  It is still
> +   experimental.
>  
>   * A new configuration variable core.hooksPath allows customizing
>     where the hook directory is.

Looks good.

I think your calendar calls for release 2.9.0 on Monday. Are you going
to bump the schedule for this? I don't think it's very high risk, and
wouldn't need to.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] diff: disable compaction heuristic for now
  2016-06-10 21:05                 ` Jeff King
@ 2016-06-10 21:46                   ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2016-06-10 21:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Jacob Keller, Stefan Beller, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> Looks good.
>
> I think your calendar calls for release 2.9.0 on Monday. Are you going
> to bump the schedule for this? I don't think it's very high risk, and
> wouldn't need to.

I didn't plan to.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-06-10 21:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-10  7:50 [BUG-ish] diff compaction heuristic false positive Jeff King
2016-06-10  8:31 ` Jeff King
2016-06-10 15:56   ` Junio C Hamano
2016-06-10 16:25     ` Stefan Beller
2016-06-10 16:29       ` Jacob Keller
2016-06-10 18:13         ` Re* " Junio C Hamano
2016-06-10 18:21           ` Stefan Beller
2016-06-10 20:30           ` Jeff King
2016-06-10 20:48             ` [PATCH v2] diff: disable compaction heuristic for now Junio C Hamano
2016-06-10 20:53               ` Jeff King
2016-06-10 20:55               ` Junio C Hamano
2016-06-10 21:05                 ` Jeff King
2016-06-10 21:46                   ` Junio C Hamano
2016-06-10  8:31 ` [BUG-ish] diff compaction heuristic false positive Michael Haggerty
2016-06-10  8:41   ` Jeff King
2016-06-10 11:00     ` Michael Haggerty

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).