git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug report: "Use of uninitialized value $_ in print"
@ 2018-03-02  1:19 Sam Kuper
  2018-03-02  7:04 ` Jonathan Nieder
  2018-03-02 10:42 ` Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Sam Kuper @ 2018-03-02  1:19 UTC (permalink / raw)
  To: git

First, background. I encountered a bug on Debian Stretch, using this
git version:

$ git --version
git version 2.11.0


The bug is that in the midst of running

git -c interactive.diffFilter="git diff --word-diff --color" add --patch

and after having answered "y" to some patches and "n" to others, git
suddenly spewed the following output:


Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 74.
Stage this hunk [y,n,q,a,d,/,K,j,J,g,e,?]? n
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
Use of uninitialized value $_ in print at
/usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.


I hope that this bug report can help the git core maintainers to
either fix the problem upstream, or to co-ordinate a fix with the
Debian git maintainer(s) if the bug does not exist upstream.

Thanks for the great DVCS :)

P.S. I am not subscribed to this mailing list, so please CC me in your
reply if you want me to see it.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02  1:19 Bug report: "Use of uninitialized value $_ in print" Sam Kuper
@ 2018-03-02  7:04 ` Jonathan Nieder
  2018-03-02 10:46   ` Jeff King
  2018-03-02 17:28   ` Sam Kuper
  2018-03-02 10:42 ` Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Jonathan Nieder @ 2018-03-02  7:04 UTC (permalink / raw)
  To: Sam Kuper; +Cc: git

Hi,

Sam Kuper wrote:

> First, background. I encountered a bug on Debian Stretch, using this
> git version:
>
> $ git --version
> git version 2.11.0
>
> The bug is that in the midst of running
>
> git -c interactive.diffFilter="git diff --word-diff --color" add --patch
>
> and after having answered "y" to some patches and "n" to others, git
> suddenly spewed the following output:
>
> Use of uninitialized value $_ in print at
> /usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 74.
> Stage this hunk [y,n,q,a,d,/,K,j,J,g,e,?]? n
> Use of uninitialized value $_ in print at
> /usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
[...]

Strange.  The relevant line, for reference:

$ dpkg-deb --fsys-tarfile git_2.11.0-3+deb9u2_amd64.deb |
  tar Oxf - ./usr/lib/git-core/git-add--interactive |
  sed -n '1370,1372 p'

		for (@{$hunk[$ix]{DISPLAY}}) {
			print; <---- this one
		}

This is a foreach loop, so it's supposed to have set $_ to each value
in the list @{$hunk[$ix]{DISPLAY}).  So why is Perl considering it
uninitialized?

Is this reproducible for you?  Do you have more details about how I
can reproduce it?

What arch are you on?  What perl version do you use?  Can you report
this using "reportbug git"?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02  1:19 Bug report: "Use of uninitialized value $_ in print" Sam Kuper
  2018-03-02  7:04 ` Jonathan Nieder
@ 2018-03-02 10:42 ` Jeff King
  2018-03-02 17:30   ` Sam Kuper
  2018-03-02 17:48   ` Junio C Hamano
  1 sibling, 2 replies; 23+ messages in thread
From: Jeff King @ 2018-03-02 10:42 UTC (permalink / raw)
  To: Sam Kuper; +Cc: git

On Fri, Mar 02, 2018 at 01:19:35AM +0000, Sam Kuper wrote:

> The bug is that in the midst of running
> 
> git -c interactive.diffFilter="git diff --word-diff --color" add --patch

That's not how interactive.diffFilter is supposed to work.

It's meant to have the output of an existing diff piped into it.
Generating the diff anew will appear to work for some simple cases, but
fall down for others:

  1. Any of the flavors besides "add -p" will be running the wrong diff
     (e.g., "reset -p" runs a diff between the index and HEAD).

  2. Any pathspec limiters would be ignored (e.g., "add -p file").

  3. Your invocation in particular is a problem because it uses
     --word-diff, which will not have a one-to-one line correspondence
     with the bare diff.

     add--interactive handles pretty-printing by running the diff
     command twice: once with no special options, and once with
     "--color" and piped through the diffFilter. It assumes that the two
     match each other line for line, so it shows you the "DISPLAY"
     variant, but then ultimately applies the "TEXT" variant.

And that last one is the cause of the errors you see:

> Use of uninitialized value $_ in print at
> /usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 74.

The "DISPLAY" run for your case generates fewer lines than the "TEXT"
run, and we complain on trying to show those extra lines.

Unfortunately, I don't think there's an easy way to do what you want
(show word-diffs but apply the full diff).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02  7:04 ` Jonathan Nieder
@ 2018-03-02 10:46   ` Jeff King
  2018-03-02 16:53     ` Junio C Hamano
  2018-03-02 17:28   ` Sam Kuper
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2018-03-02 10:46 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Sam Kuper, git

On Thu, Mar 01, 2018 at 11:04:34PM -0800, Jonathan Nieder wrote:

> > Use of uninitialized value $_ in print at
> > /usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 75.
> [...]
> 
> Strange.  The relevant line, for reference:
> 
> $ dpkg-deb --fsys-tarfile git_2.11.0-3+deb9u2_amd64.deb |
>   tar Oxf - ./usr/lib/git-core/git-add--interactive |
>   sed -n '1370,1372 p'
> 
> 		for (@{$hunk[$ix]{DISPLAY}}) {
> 			print; <---- this one
> 		}
> 
> This is a foreach loop, so it's supposed to have set $_ to each value
> in the list @{$hunk[$ix]{DISPLAY}).  So why is Perl considering it
> uninitialized?

Because the array is full of "undef". See parse_diff(), which does this:

  my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
  ...
  @colored = run_cmd_pipe(@display_cmd);
  ...
  for (my $i = 0; $i < @diff; $i++) {
          ...
          push @{$hunk[-1]{TEXT}}, $diff[$i];
          push @{$hunk[-1]{DISPLAY}},
               (@colored ? $colored[$i] : $diff[$i]);
  }

If the @colored output is shorter than the normal @diff output, we'll
push a bunch of "undef" onto the DISPLAY array (the ternary there is
because sometimes @colored is completely empty if the user did not ask
for color).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 10:46   ` Jeff King
@ 2018-03-02 16:53     ` Junio C Hamano
  2018-03-02 16:55       ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-02 16:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Sam Kuper, git

Jeff King <peff@peff.net> writes:

> Because the array is full of "undef". See parse_diff(), which does this:
>
>   my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
>   ...
>   @colored = run_cmd_pipe(@display_cmd);
>   ...
>   for (my $i = 0; $i < @diff; $i++) {
>           ...
>           push @{$hunk[-1]{TEXT}}, $diff[$i];
>           push @{$hunk[-1]{DISPLAY}},
>                (@colored ? $colored[$i] : $diff[$i]);
>   }
>
> If the @colored output is shorter than the normal @diff output, we'll
> push a bunch of "undef" onto the DISPLAY array (the ternary there is
> because sometimes @colored is completely empty if the user did not ask
> for color).

An obvious sanity check would be to ensure @colored == @diff

                push @{$hunk[-1]{DISPLAY}},
        -            (@colored ? $colored[$i] : $diff[$i]);
        +            (@colored && @colored == @diff ? $colored[$i] : $diff[$i]);
         }

or something like that?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 16:53     ` Junio C Hamano
@ 2018-03-02 16:55       ` Jeff King
  2018-03-02 17:15         ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2018-03-02 16:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Sam Kuper, git

On Fri, Mar 02, 2018 at 08:53:34AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Because the array is full of "undef". See parse_diff(), which does this:
> >
> >   my @diff = run_cmd_pipe("git", @diff_cmd, "--", $path);
> >   ...
> >   @colored = run_cmd_pipe(@display_cmd);
> >   ...
> >   for (my $i = 0; $i < @diff; $i++) {
> >           ...
> >           push @{$hunk[-1]{TEXT}}, $diff[$i];
> >           push @{$hunk[-1]{DISPLAY}},
> >                (@colored ? $colored[$i] : $diff[$i]);
> >   }
> >
> > If the @colored output is shorter than the normal @diff output, we'll
> > push a bunch of "undef" onto the DISPLAY array (the ternary there is
> > because sometimes @colored is completely empty if the user did not ask
> > for color).
> 
> An obvious sanity check would be to ensure @colored == @diff
> 
>                 push @{$hunk[-1]{DISPLAY}},
>         -            (@colored ? $colored[$i] : $diff[$i]);
>         +            (@colored && @colored == @diff ? $colored[$i] : $diff[$i]);
>          }
> 
> or something like that?

That's probably a reasonable sanity check, but I think we need to abort
and not just have a too-small DISPLAY array. Because later code like the
hunk-splitting is going to assume that there's a 1:1 line
correspondence. We definitely don't want to end up in a situation where
we show one thing but apply another.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 16:55       ` Jeff King
@ 2018-03-02 17:15         ` Junio C Hamano
  2018-03-03  5:57           ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-02 17:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Sam Kuper, git

Jeff King <peff@peff.net> writes:

> That's probably a reasonable sanity check, but I think we need to abort
> and not just have a too-small DISPLAY array. Because later code like the
> hunk-splitting is going to assume that there's a 1:1 line
> correspondence. We definitely don't want to end up in a situation where
> we show one thing but apply another.

Yes, agreed completely.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02  7:04 ` Jonathan Nieder
  2018-03-02 10:46   ` Jeff King
@ 2018-03-02 17:28   ` Sam Kuper
  1 sibling, 0 replies; 23+ messages in thread
From: Sam Kuper @ 2018-03-02 17:28 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jeff King

On 02/03/2018, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Is this reproducible for you?

Yes. It seems to occur consistently, given the same input.

> Do you have more details about how I can reproduce it?

Unfortunately, the particular git repo I encountered it on is private,
otherwise I would point you to it.

I haven't attempted to create a MWE. Am I correct that doing so is now
not needed, in the light of Jeff King's email below?


> What arch are you on?  What perl version do you use?  Can you report
> this using "reportbug git"?

All perfectly decent questions. For a modicum of security/privacy, I
would prefer to avoid answering them unless necessary.

Am I right in thinking that these answers are no longer needed, in the
light of Jeff King's email below?



On 02/03/2018, Jeff King <peff@peff.net> wrote:
>   3. Your invocation in particular is a problem because it uses
>      --word-diff, which will not have a one-to-one line correspondence
>      with the bare diff.
>
>      add--interactive handles pretty-printing by running the diff
>      command twice: once with no special options, and once with
>      "--color" and piped through the diffFilter. It assumes that the two
>      match each other line for line, so it shows you the "DISPLAY"
>      variant, but then ultimately applies the "TEXT" variant.
>
> And that last one is the cause of the errors you see:
>
>> Use of uninitialized value $_ in print at
>> /usr/lib/git-core/git-add--interactive line 1371, <STDIN> line 74.
>
> The "DISPLAY" run for your case generates fewer lines than the "TEXT"
> run, and we complain on trying to show those extra lines.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 10:42 ` Jeff King
@ 2018-03-02 17:30   ` Sam Kuper
  2018-03-03  6:02     ` Jeff King
  2018-03-02 17:48   ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Sam Kuper @ 2018-03-02 17:30 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 02/03/2018, Jeff King <peff@peff.net> wrote:
> Unfortunately, I don't think there's an easy way to do what you want
> (show word-diffs but apply the full diff).

Oh :(

That would be a *very* useful feature to have, especially where
multiple small (e.g. single character or single word) changes are
sprinkled throughout a large file.

Should I start a separate thread for this as a feature request?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 10:42 ` Jeff King
  2018-03-02 17:30   ` Sam Kuper
@ 2018-03-02 17:48   ` Junio C Hamano
  2018-03-02 20:34     ` Sam Kuper
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-02 17:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Sam Kuper, git

Jeff King <peff@peff.net> writes:

> Unfortunately, I don't think there's an easy way to do what you want
> (show word-diffs but apply the full diff).

The current "word-diff" discards the information on where the lines
end, and it is pretty much hopeless/useless in the context of "add
-p".  It would be a good addition to revamp it so that it keeps the
original lines in pre/post images but only colors/highlights the
byte ranges in such a way that (1) on a pre-image line, paint only
the byte range that do not appear in the post-image, and (2) on a
post-image line, paint only the byte range that do not appear in the
pre-image.  

E.g. Given these two

    diff --git a/1 b/2
    index aa49234b68..1cd9f6b5fd 100644
    --- a/1
    +++ b/2
    @@ -1,2 +1 @@
    -a b c e
    -f g i j
    +a b d f g h

the current word-diff would give (I cannot do colors here, so I'll
upcase the colored letters):

    @@ -1,2 +1 @@
    a b C ED f g I JH

as the output, but if we produced

    @@ -1,2 +1 @@
    -a b C E
    -f g I J
    +a b D f g H

instead, then colored "add -p" would become easier to see.

And that would be useful outside the context of "add -p", which is a
huge plus.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 17:48   ` Junio C Hamano
@ 2018-03-02 20:34     ` Sam Kuper
  2018-03-02 21:53       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Sam Kuper @ 2018-03-02 20:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

On 02/03/2018, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>> Unfortunately, I don't think there's an easy way to do what you want
>> (show word-diffs but apply the full diff).
>
> The current "word-diff" discards the information on where the lines
> end, and it is pretty much hopeless/useless in the context of "add
> -p".

This is unfortunate.

I am familiar with the model-view-controller ("MVC") paradigm used in
some software packages. I had hoped that Git followed this paradigm
somewhat, and cleanly separated the underlying model of the diff (i.e.
a representation that would be generated in all cases where a diff is
needed), and the view of the diff (i.e. the visual representation,
e.g. word-diff, line diff, colored, non-colored, etc, as requested by
the user).

Such separation of concerns strikes me as being the best approach
here. It could be a lot of work, but I would be grateful if the Git
developers/maintainers could work towards it as an end goal for this
aspect of Git's architecture.

Unfortunately, I am not very familiar with the Git codebase, nor
well-versed in its primary languages, so I can't be of much help here
:(

> It would be a good addition to revamp it so that it keeps the
> original lines in pre/post images but only colors/highlights the
> byte ranges in such a way that (1) on a pre-image line, paint only
> the byte range that do not appear in the post-image, and (2) on a
> post-image line, paint only the byte range that do not appear in the
> pre-image.
>
> E.g. Given these two
>
>     diff --git a/1 b/2
>     index aa49234b68..1cd9f6b5fd 100644
>     --- a/1
>     +++ b/2
>     @@ -1,2 +1 @@
>     -a b c e
>     -f g i j
>     +a b d f g h
>
> the current word-diff would give (I cannot do colors here, so I'll
> upcase the colored letters):
>
>     @@ -1,2 +1 @@
>     a b C ED f g I JH
>
> as the output, but if we produced
>
>     @@ -1,2 +1 @@
>     -a b C E
>     -f g I J
>     +a b D f g H
>
> instead, then colored "add -p" would become easier to see.
>
> And that would be useful outside the context of "add -p", which is a
> huge plus.

This would be much better than the current situation, where the visual
representation gives misleading feedback about the underlying diff,
and where the error I reported crops up.

However, in my opinion your proposed solution would still be not as
good as separating the two concerns, as described earlier in this
email, on two fronts:

1. It would yield, IIUC, less flexibility to create new kinds of view
based on a consistent, standardised underlying model.

2. It is harder to read, for some types of input (e.g. prose) than the
view generated by the existing word-diff algorithm. Your approach, as
outlined in your example above, requires the reader to visually jump
(saccade) between two lines that are separated by an intervening line,
in order to see what has changed. The existing word-diff is much
easier to use: it puts the changes right next to each other, avoiding
line-skipping and allowing horizontal saccades (which are much more
familiar to people used to languages written in left-to-right or
right-to-left scripts).

I don't want to sound negative. As I say, I think your solution is a
big improvement on what currently exists. But I would see it as an
intermediate solution - a stopgap - rather than an endpoint of
development.

In other words, if your solution would be much quicker to implement
than the one I proposed, then please go ahead with yours first, and
please add mine to the longer-term to-do list :)

Thank you again for helping to make Git such a great tool, and for
working tirelessly to improve it further!

P.S. There is a related StackOverflow question here:
https://stackoverflow.com/questions/49058817/git-add-patch-with-word-diff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 20:34     ` Sam Kuper
@ 2018-03-02 21:53       ` Junio C Hamano
  2018-03-03  6:18         ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-02 21:53 UTC (permalink / raw)
  To: Sam Kuper; +Cc: Jeff King, git

Sam Kuper <sam.kuper@uclmail.net> writes:

> 1. It would yield, IIUC, less flexibility to create new kinds of view
> based on a consistent, standardised underlying model.
>
> 2. It is harder to read, for some types of input (e.g. prose) than the
> view generated by the existing word-diff algorithm.

The loss of line-end by the lossy "word-diff" output does not matter
if you never split hunks, but to be able to split a hunk at an
in-between context line (which you can already do) and new features
like per-line selection that are emerging, keeping 1:1 line
correspondence between what is shown and what is applied is a must.

Unless you are volunteering to design (notice that I am not saying
"implement") both diff generation/coloration side _and_ patch
application side, that is.  In which case, you may be able to come
up with a magic ;-)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 17:15         ` Junio C Hamano
@ 2018-03-03  5:57           ` Jeff King
  2018-03-03  5:58             ` [PATCH 1/2] t3701: add a test for interactive.diffFilter Jeff King
                               ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Jeff King @ 2018-03-03  5:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Sam Kuper, git

On Fri, Mar 02, 2018 at 09:15:44AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > That's probably a reasonable sanity check, but I think we need to abort
> > and not just have a too-small DISPLAY array. Because later code like the
> > hunk-splitting is going to assume that there's a 1:1 line
> > correspondence. We definitely don't want to end up in a situation where
> > we show one thing but apply another.
> 
> Yes, agreed completely.

Let's add this sanity check while we're thinking about it. Here's a
series.

  [1/2]: t3701: add a test for interactive.diffFilter
  [2/2]: add--interactive: detect bogus diffFilter output

 git-add--interactive.perl  |  8 ++++++++
 t/t3701-add-interactive.sh | 20 ++++++++++++++++++++
 2 files changed, 28 insertions(+)

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] t3701: add a test for interactive.diffFilter
  2018-03-03  5:57           ` Jeff King
@ 2018-03-03  5:58             ` Jeff King
  2018-03-03  5:58             ` [PATCH 2/2] add--interactive: detect bogus diffFilter output Jeff King
  2018-03-05 21:54             ` Bug report: "Use of uninitialized value $_ in print" Jonathan Nieder
  2 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2018-03-03  5:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Sam Kuper, git

This feature was added in 01143847db (add--interactive:
allow custom diff highlighting programs, 2016-02-27) but
never tested. Let's add a basic test.

Note that we only apply the filter when color is enabled,
so we have to use test_terminal. This is an open limitation
explicitly mentioned in the original commit. So take this
commit as testing the status quo, and not making a statement
on whether we'd want to enhance that in the future.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t3701-add-interactive.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 058698df6a..64fe56c3d5 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -392,6 +392,18 @@ test_expect_success TTY 'diffs can be colorized' '
 	grep "$(printf "\\033")" output
 '
 
+test_expect_success TTY 'diffFilter filters diff' '
+	git reset --hard &&
+
+	echo content >test &&
+	test_config interactive.diffFilter "sed s/^/foo:/" &&
+	printf y | test_terminal git add -p >output 2>&1 &&
+
+	# avoid depending on the exact coloring or content of the prompts,
+	# and just make sure we saw our diff prefixed
+	grep foo:.*content output
+'
+
 test_expect_success 'patch-mode via -i prompts for files' '
 	git reset --hard &&
 
-- 
2.16.2.708.g0b2ed7f536


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-03  5:57           ` Jeff King
  2018-03-03  5:58             ` [PATCH 1/2] t3701: add a test for interactive.diffFilter Jeff King
@ 2018-03-03  5:58             ` Jeff King
  2018-03-05 20:53               ` Junio C Hamano
  2018-03-05 21:33               ` Ævar Arnfjörð Bjarmason
  2018-03-05 21:54             ` Bug report: "Use of uninitialized value $_ in print" Jonathan Nieder
  2 siblings, 2 replies; 23+ messages in thread
From: Jeff King @ 2018-03-03  5:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Sam Kuper, git

It's important that the diff-filter only filter the
individual lines, and that there remain a one-to-one mapping
between the input and output lines. Otherwise, things like
hunk-splitting will behave quite unexpectedly (e.g., you
think you are splitting at one point, but it has a different
effect in the text patch we apply).

We can't detect all problematic cases, but we can at least
catch the obvious case where we don't even have the correct
number of lines.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-add--interactive.perl  | 8 ++++++++
 t/t3701-add-interactive.sh | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 964c3a7542..ff02008abe 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -705,6 +705,14 @@ sub parse_diff {
 	}
 	my (@hunk) = { TEXT => [], DISPLAY => [], TYPE => 'header' };
 
+	if (@colored && @colored != @diff) {
+		print STDERR
+		  "fatal: mismatched output from interactive.diffFilter\n",
+		  "hint: Your filter must maintain a one-to-one correspondence\n",
+		  "hint: between its input and output lines.\n";
+		exit 1;
+	}
+
 	for (my $i = 0; $i < @diff; $i++) {
 		if ($diff[$i] =~ /^@@ /) {
 			push @hunk, { TEXT => [], DISPLAY => [],
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 64fe56c3d5..9bb17f91b2 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -404,6 +404,14 @@ test_expect_success TTY 'diffFilter filters diff' '
 	grep foo:.*content output
 '
 
+test_expect_success TTY 'detect bogus diffFilter output' '
+	git reset --hard &&
+
+	echo content >test &&
+	test_config interactive.diffFilter "echo too-short" &&
+	printf y | test_must_fail test_terminal git add -p
+'
+
 test_expect_success 'patch-mode via -i prompts for files' '
 	git reset --hard &&
 
-- 
2.16.2.708.g0b2ed7f536

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 17:30   ` Sam Kuper
@ 2018-03-03  6:02     ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2018-03-03  6:02 UTC (permalink / raw)
  To: Sam Kuper; +Cc: git

On Fri, Mar 02, 2018 at 05:30:28PM +0000, Sam Kuper wrote:

> On 02/03/2018, Jeff King <peff@peff.net> wrote:
> > Unfortunately, I don't think there's an easy way to do what you want
> > (show word-diffs but apply the full diff).
> 
> Oh :(
> 
> That would be a *very* useful feature to have, especially where
> multiple small (e.g. single character or single word) changes are
> sprinkled throughout a large file.

You might want to try diff-highlight from Git's contrib/ directory. It
was the reason I added interactive.diffFilter in the first place. E.g.:

  cd contrib/diff-highlight
  make
  cp diff-highlight /to/your/$PATH
  git config --global interactive.diffFilter diff-highlight

It's not quite the same as a word-diff, because it doesn't find matches
across word boundaries. So if you re-wrap a paragraph, for example, it
won't be much help. But I do find it useful (and also as a general
filter for log and diff output).

> Should I start a separate thread for this as a feature request?

I think this thread can serve.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-02 21:53       ` Junio C Hamano
@ 2018-03-03  6:18         ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2018-03-03  6:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sam Kuper, git

On Fri, Mar 02, 2018 at 01:53:32PM -0800, Junio C Hamano wrote:

> Sam Kuper <sam.kuper@uclmail.net> writes:
> 
> > 1. It would yield, IIUC, less flexibility to create new kinds of view
> > based on a consistent, standardised underlying model.
> >
> > 2. It is harder to read, for some types of input (e.g. prose) than the
> > view generated by the existing word-diff algorithm.
> 
> The loss of line-end by the lossy "word-diff" output does not matter
> if you never split hunks, but to be able to split a hunk at an
> in-between context line (which you can already do) and new features
> like per-line selection that are emerging, keeping 1:1 line
> correspondence between what is shown and what is applied is a must.
> 
> Unless you are volunteering to design (notice that I am not saying
> "implement") both diff generation/coloration side _and_ patch
> application side, that is.  In which case, you may be able to come
> up with a magic ;-)

IIRC, we do the word-diff by first finding the individual hunks with a
normal line diff, and then doing a word diff inside those hunks. So one
easy option would be to add a --color-words option that gets passed
along to the underlying display diff and just disables hunk splitting. I
think we'd also need to start parsing the DISPLAY hunks more carefully
to make sure we re-sync at hunk boundaries.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-03  5:58             ` [PATCH 2/2] add--interactive: detect bogus diffFilter output Jeff King
@ 2018-03-05 20:53               ` Junio C Hamano
  2018-03-05 20:56                 ` Jeff King
  2018-03-05 21:33               ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-05 20:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Sam Kuper, git

Jeff King <peff@peff.net> writes:

> It's important that the diff-filter only filter the
> individual lines, and that there remain a one-to-one mapping
> between the input and output lines. Otherwise, things like
> hunk-splitting will behave quite unexpectedly (e.g., you
> think you are splitting at one point, but it has a different
> effect in the text patch we apply).
>
> We can't detect all problematic cases, but we can at least
> catch the obvious case where we don't even have the correct
> number of lines.

Will queue.  We could probably also make sure that each of the
corresponding line pair begins with the same '-/ /+' letter, but we
need to draw a line and stop somewhere, and I think the number of
lines is probably a good enough place.

>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  git-add--interactive.perl  | 8 ++++++++
>  t/t3701-add-interactive.sh | 8 ++++++++
>  2 files changed, 16 insertions(+)
>
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index 964c3a7542..ff02008abe 100755
> --- a/git-add--interactive.perl
> +++ b/git-add--interactive.perl
> @@ -705,6 +705,14 @@ sub parse_diff {
>  	}
>  	my (@hunk) = { TEXT => [], DISPLAY => [], TYPE => 'header' };
>  
> +	if (@colored && @colored != @diff) {
> +		print STDERR
> +		  "fatal: mismatched output from interactive.diffFilter\n",
> +		  "hint: Your filter must maintain a one-to-one correspondence\n",
> +		  "hint: between its input and output lines.\n";
> +		exit 1;
> +	}
> +
>  	for (my $i = 0; $i < @diff; $i++) {
>  		if ($diff[$i] =~ /^@@ /) {
>  			push @hunk, { TEXT => [], DISPLAY => [],
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 64fe56c3d5..9bb17f91b2 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -404,6 +404,14 @@ test_expect_success TTY 'diffFilter filters diff' '
>  	grep foo:.*content output
>  '
>  
> +test_expect_success TTY 'detect bogus diffFilter output' '
> +	git reset --hard &&
> +
> +	echo content >test &&
> +	test_config interactive.diffFilter "echo too-short" &&
> +	printf y | test_must_fail test_terminal git add -p
> +'
> +
>  test_expect_success 'patch-mode via -i prompts for files' '
>  	git reset --hard &&

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-05 20:53               ` Junio C Hamano
@ 2018-03-05 20:56                 ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2018-03-05 20:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Sam Kuper, git

On Mon, Mar 05, 2018 at 12:53:07PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > It's important that the diff-filter only filter the
> > individual lines, and that there remain a one-to-one mapping
> > between the input and output lines. Otherwise, things like
> > hunk-splitting will behave quite unexpectedly (e.g., you
> > think you are splitting at one point, but it has a different
> > effect in the text patch we apply).
> >
> > We can't detect all problematic cases, but we can at least
> > catch the obvious case where we don't even have the correct
> > number of lines.
> 
> Will queue.  We could probably also make sure that each of the
> corresponding line pair begins with the same '-/ /+' letter, but we
> need to draw a line and stop somewhere, and I think the number of
> lines is probably a good enough place.

I think that would break things like diff-so-fancy, which actually
removes the -/+ in favor of pure coloring (not something I care for
myself, but it seems a legitimate use case). So it's probably best not
to look at the content, not just from a "we can only go so far"
perspective but also because we actively don't know what the filter's
output is supposed to look like.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-03  5:58             ` [PATCH 2/2] add--interactive: detect bogus diffFilter output Jeff King
  2018-03-05 20:53               ` Junio C Hamano
@ 2018-03-05 21:33               ` Ævar Arnfjörð Bjarmason
  2018-03-05 22:09                 ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-03-05 21:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jonathan Nieder, Sam Kuper, git


On Sat, Mar 03 2018, Jeff King jotted:

> +	if (@colored && @colored != @diff) {

nit: should just be:

    if (@colored != @diff) {

It's not possible for @arrays in scalar context to be undefined.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Bug report: "Use of uninitialized value $_ in print"
  2018-03-03  5:57           ` Jeff King
  2018-03-03  5:58             ` [PATCH 1/2] t3701: add a test for interactive.diffFilter Jeff King
  2018-03-03  5:58             ` [PATCH 2/2] add--interactive: detect bogus diffFilter output Jeff King
@ 2018-03-05 21:54             ` Jonathan Nieder
  2 siblings, 0 replies; 23+ messages in thread
From: Jonathan Nieder @ 2018-03-05 21:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Sam Kuper, git

Jeff King wrote:
> On Fri, Mar 02, 2018 at 09:15:44AM -0800, Junio C Hamano wrote:
>> Jeff King <peff@peff.net> writes:

>>> That's probably a reasonable sanity check, but I think we need to abort
>>> and not just have a too-small DISPLAY array. Because later code like the
>>> hunk-splitting is going to assume that there's a 1:1 line
>>> correspondence. We definitely don't want to end up in a situation where
>>> we show one thing but apply another.
>>
>> Yes, agreed completely.
>
> Let's add this sanity check while we're thinking about it. Here's a
> series.
>
>   [1/2]: t3701: add a test for interactive.diffFilter
>   [2/2]: add--interactive: detect bogus diffFilter output
>
>  git-add--interactive.perl  |  8 ++++++++
>  t/t3701-add-interactive.sh | 20 ++++++++++++++++++++
>  2 files changed, 28 insertions(+)

With or without the tweak Ævar Arnfjörð Bjarmason suggested,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.  It's probably also worth adding Sam's reported-by to patch 2/2:
Reported-by: Sam Kuper <sam.kuper@uclmail.net>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-05 21:33               ` Ævar Arnfjörð Bjarmason
@ 2018-03-05 22:09                 ` Junio C Hamano
  2018-03-05 22:15                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-03-05 22:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, Jonathan Nieder, Sam Kuper, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sat, Mar 03 2018, Jeff King jotted:
>
>> +	if (@colored && @colored != @diff) {
>
> nit: should just be:
>
>     if (@colored != @diff) {
>
> It's not possible for @arrays in scalar context to be undefined.

It is true that @array can not be undef, but your rewrite I think is
wrong.

The first "do the comparison only @colored is true" is not about
definedness.  It is "@colored can be an empty array when the user is
not using separate 'show these colored lines to the end user, feed
these noncolored lines to git-apply command' feature, so @colored==0
and @diff > 0 is perfectly fine".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] add--interactive: detect bogus diffFilter output
  2018-03-05 22:09                 ` Junio C Hamano
@ 2018-03-05 22:15                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-03-05 22:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jonathan Nieder, Sam Kuper, Git Mailing List

On Mon, Mar 5, 2018 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Sat, Mar 03 2018, Jeff King jotted:
>>
>>> +    if (@colored && @colored != @diff) {
>>
>> nit: should just be:
>>
>>     if (@colored != @diff) {
>>
>> It's not possible for @arrays in scalar context to be undefined.
>
> It is true that @array can not be undef, but your rewrite I think is
> wrong.
>
> The first "do the comparison only @colored is true" is not about
> definedness.  It is "@colored can be an empty array when the user is
> not using separate 'show these colored lines to the end user, feed
> these noncolored lines to git-apply command' feature, so @colored==0
> and @diff > 0 is perfectly fine".

Yes, sorry for the noise. I misread the intent of the code.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-03-05 22:15 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-02  1:19 Bug report: "Use of uninitialized value $_ in print" Sam Kuper
2018-03-02  7:04 ` Jonathan Nieder
2018-03-02 10:46   ` Jeff King
2018-03-02 16:53     ` Junio C Hamano
2018-03-02 16:55       ` Jeff King
2018-03-02 17:15         ` Junio C Hamano
2018-03-03  5:57           ` Jeff King
2018-03-03  5:58             ` [PATCH 1/2] t3701: add a test for interactive.diffFilter Jeff King
2018-03-03  5:58             ` [PATCH 2/2] add--interactive: detect bogus diffFilter output Jeff King
2018-03-05 20:53               ` Junio C Hamano
2018-03-05 20:56                 ` Jeff King
2018-03-05 21:33               ` Ævar Arnfjörð Bjarmason
2018-03-05 22:09                 ` Junio C Hamano
2018-03-05 22:15                   ` Ævar Arnfjörð Bjarmason
2018-03-05 21:54             ` Bug report: "Use of uninitialized value $_ in print" Jonathan Nieder
2018-03-02 17:28   ` Sam Kuper
2018-03-02 10:42 ` Jeff King
2018-03-02 17:30   ` Sam Kuper
2018-03-03  6:02     ` Jeff King
2018-03-02 17:48   ` Junio C Hamano
2018-03-02 20:34     ` Sam Kuper
2018-03-02 21:53       ` Junio C Hamano
2018-03-03  6:18         ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).