From: Phillip Wood <phillip.wood@talktalk.net>
To: Jeff King <peff@peff.net>, phillip.wood@dunelm.org.uk
Cc: Brian Henderson <henderson.bj@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 7/7] diff-highlight: detect --graph by indent
Date: Thu, 22 Mar 2018 10:59:31 +0000 [thread overview]
Message-ID: <0e2317fc-cfec-ba7f-8ded-59cc6b7f9a63@talktalk.net> (raw)
In-Reply-To: <20180321055900.GB15674@sigill.intra.peff.net>
On 21/03/18 05:59, Jeff King wrote:
> This patch fixes a corner case where diff-highlight may
> scramble some diffs when combined with --graph.
>
> Commit 7e4ffb4c17 (diff-highlight: add support for --graph
> output, 2016-08-29) taught diff-highlight to skip past the
> graph characters at the start of each line with this regex:
>
> ($COLOR?\|$COLOR?\s+)*
>
> I.e., any series of pipes separated by and followed by
> arbitrary whitespace. We need to match more than just a
> single space because the commit in question may be indented
> to accommodate other parts of the graph drawing. E.g.:
>
> * commit 1234abcd
> | ...
> | diff --git ...
>
> has only a single space, but for the last commit before a
> fork:
>
> | | |
> | * | commit 1234abcd
> | |/ ...
> | | diff --git
>
> the diff lines have more spaces between the pipes and the
> start of the diff.
>
> However, when we soak up all of those spaces with the
> $GRAPH regex, we may accidentally include the leading space
> for a context line. That means we may consider the actual
> contents of a context line as part of the diff syntax. In
> other words, something like this:
>
> normal context line
> -old line
> +new line
> -this is a context line with a leading dash
>
> would cause us to see that final context line as a removal
> line, and we'd end up showing the hunk in the wrong order:
>
> normal context line
> -old line
> -this is a context line with a leading dash
> +new line
>
> Instead, let's a be a little more clever about parsing the
> graph. We'll look for the actual "*" line that marks the
> start of a commit, and record the indentation we see there.
> Then we can skip past that indentation when checking whether
> the line is a hunk header, removal, addition, etc.
>
> There is one tricky thing: the indentation in bytes may be
> different for various lines of the graph due to coloring.
> E.g., the "*" on a commit line is generally shown without
> color, but on the actual diff lines, it will be replaced
> with a colorized "|" character, adding several bytes. We
> work around this here by counting "visible" bytes. This is
> unfortunately a bit more expensive, making us about twice as
> slow to handle --graph output. But since this is meant to be
> used interactively anyway, it's tolerably fast (and the
> non-graph case is unaffected).
>
> One alternative would be to search for hunk header lines and
> use their indentation (since they'd have the same colors as
> the diff lines which follow). But that just opens up
> different corner cases. If we see:
>
> | | @@ 1,2 1,3 @@
>
> we cannot know if this is a real diff that has been
> indented due to the graph, or if it's a context line that
> happens to look like a diff header. We can only be sure of
> the indent on the "*" lines, since we know those don't
> contain arbitrary data (technically the user could include a
> bunch of extra indentation via --format, but that's rare
> enough to disregard).
>
> Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> contrib/diff-highlight/DiffHighlight.pm | 78 +++++++++++++++----
> .../diff-highlight/t/t9400-diff-highlight.sh | 28 +++++++
> 2 files changed, 91 insertions(+), 15 deletions(-)
>
> diff --git a/contrib/diff-highlight/DiffHighlight.pm b/contrib/diff-highlight/DiffHighlight.pm
> index e07cd5931d..536754583b 100644
> --- a/contrib/diff-highlight/DiffHighlight.pm
> +++ b/contrib/diff-highlight/DiffHighlight.pm
> @@ -21,34 +21,82 @@ my $RESET = "\x1b[m";
> my $COLOR = qr/\x1b\[[0-9;]*m/;
> my $BORING = qr/$COLOR|\s/;
>
> -# The patch portion of git log -p --graph should only ever have preceding | and
> -# not / or \ as merge history only shows up on the commit line.
> -my $GRAPH = qr/$COLOR?\|$COLOR?\s+/;
> -
> my @removed;
> my @added;
> my $in_hunk;
> +my $graph_indent = 0;
>
> our $line_cb = sub { print @_ };
> our $flush_cb = sub { local $| = 1 };
>
> -sub handle_line {
> +# Count the visible width of a string, excluding any terminal color sequences.
> +sub visible_width {
> local $_ = shift;
> + my $ret = 0;
> + while (length) {
> + if (s/^$COLOR//) {
> + # skip colors
> + } elsif (s/^.//) {
> + $ret++;
> + }
> + }
> + return $ret;
> +}
> +
> +# Return a substring of $str, omitting $len visible characters from the
> +# beginning, where terminal color sequences do not count as visible.
> +sub visible_substr {
> + my ($str, $len) = @_;
> + while ($len > 0) {
> + if ($str =~ s/^$COLOR//) {
> + next
> + }
> + $str =~ s/^.//;
> + $len--;
> + }
> + return $str;
> +}
> +
> +sub handle_line {
> + my $orig = shift;
> + local $_ = $orig;
> +
> + # match a graph line that begins a commit
> + if (/^(?:$COLOR?\|$COLOR?[ ])* # zero or more leading "|" with space
> + $COLOR?\*$COLOR?[ ] # a "*" with its trailing space
> + (?:$COLOR?\|$COLOR?[ ])* # zero or more trailing "|"
> + [ ]* # trailing whitespace for merges
Hi Peff, thanks for looking at this. I've only had a quick look through
but I wonder if this will be confused by commit messages that contain
* bullet points
* like this
Best Wishes
Phillip
> + /x) {
> + my $graph_prefix = $&;
> +
> + # We must flush before setting graph indent, since the
> + # new commit may be indented differently from what we
> + # queued.
> + flush();
> + $graph_indent = visible_width($graph_prefix);
> +
> + } elsif ($graph_indent) {
> + if (length($_) < $graph_indent) {
> + $graph_indent = 0;
> + } else {
> + $_ = visible_substr($_, $graph_indent);
> + }
> + }
>
> if (!$in_hunk) {
> - $line_cb->($_);
> - $in_hunk = /^$GRAPH*$COLOR*\@\@ /;
> + $line_cb->($orig);
> + $in_hunk = /^$COLOR*\@\@ /;
> }
> - elsif (/^$GRAPH*$COLOR*-/) {
> - push @removed, $_;
> + elsif (/^$COLOR*-/) {
> + push @removed, $orig;
> }
> - elsif (/^$GRAPH*$COLOR*\+/) {
> - push @added, $_;
> + elsif (/^$COLOR*\+/) {
> + push @added, $orig;
> }
> else {
> flush();
> - $line_cb->($_);
> - $in_hunk = /^$GRAPH*$COLOR*[\@ ]/;
> + $line_cb->($orig);
> + $in_hunk = /^$COLOR*[\@ ]/;
> }
>
> # Most of the time there is enough output to keep things streaming,
> @@ -225,8 +273,8 @@ sub is_pair_interesting {
> my $suffix_a = join('', @$a[($sa+1)..$#$a]);
> my $suffix_b = join('', @$b[($sb+1)..$#$b]);
>
> - return $prefix_a !~ /^$GRAPH*$COLOR*-$BORING*$/ ||
> - $prefix_b !~ /^$GRAPH*$COLOR*\+$BORING*$/ ||
> + return visible_substr($prefix_a, $graph_indent) !~ /^$COLOR*-$BORING*$/ ||
> + visible_substr($prefix_b, $graph_indent) !~ /^$COLOR*\+$BORING*$/ ||
> $suffix_a !~ /^$BORING*$/ ||
> $suffix_b !~ /^$BORING*$/;
> }
> diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh
> index bf0c270d60..f6f5195d00 100755
> --- a/contrib/diff-highlight/t/t9400-diff-highlight.sh
> +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh
> @@ -310,4 +310,32 @@ test_expect_success 'diff-highlight ignores combined diffs' '
> test_cmp expect actual
> '
>
> +test_expect_success 'diff-highlight handles --graph with leading dash' '
> + cat >file <<-\EOF &&
> + before
> + the old line
> + -leading dash
> + EOF
> + git add file &&
> + git commit -m before &&
> +
> + sed s/old/new/ <file >file.tmp &&
> + mv file.tmp file &&
> + git add file &&
> + git commit -m after &&
> +
> + cat >expect <<-EOF &&
> + --- a/file
> + +++ b/file
> + @@ -1,3 +1,3 @@
> + before
> + -the ${CW}old${CR} line
> + +the ${CW}new${CR} line
> + -leading dash
> + EOF
> + git log --graph -p -1 | "$DIFF_HIGHLIGHT" >actual.raw &&
> + trim_graph <actual.raw | sed -n "/^---/,\$p" >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
>
next prev parent reply other threads:[~2018-03-22 10:59 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-19 10:21 [BUG] log --graph corrupts patch Phillip Wood
2018-03-20 6:09 ` Jeff King
2018-03-20 9:58 ` Phillip Wood
2018-03-20 15:58 ` Jeff King
2018-03-21 5:47 ` Jeff King
2018-03-21 5:47 ` [PATCH 1/7] diff-highlight: correct test graph diagram Jeff King
2018-03-21 5:48 ` [PATCH 2/7] diff-highlight: use test_tick in graph test Jeff King
2018-03-21 5:48 ` [PATCH 3/7] diff-highlight: prefer "echo" to "cat" in tests Jeff King
2018-03-21 5:49 ` [PATCH 4/7] diff-highlight: test interleaved parallel lines of history Jeff King
2018-03-21 5:49 ` [PATCH 5/7] diff-highlight: test graphs with --color Jeff King
2018-03-21 5:56 ` [PATCH 6/7] diff-highlight: use flush() helper consistently Jeff King
2018-03-21 5:59 ` [PATCH 7/7] diff-highlight: detect --graph by indent Jeff King
2018-03-22 10:59 ` Phillip Wood [this message]
2018-03-22 11:18 ` Jeff King
2018-03-21 17:23 ` [BUG] log --graph corrupts patch Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0e2317fc-cfec-ba7f-8ded-59cc6b7f9a63@talktalk.net \
--to=phillip.wood@talktalk.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=henderson.bj@gmail.com \
--cc=peff@peff.net \
--cc=phillip.wood@dunelm.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).