git@vger.kernel.org mailing list mirror (one of many)
From: George King <george.w.king@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Subject: Difficulty with parsing colorized diff output
Date: Fri, 7 Dec 2018 19:09:58 -0500
Message-ID: <799879BD-A2F0-487C-AA05-8054AC62C5BD@gmail.com>

Hello, I have a rather elaborate diff highlighter that I have implemented as a post-processor to regular git output. I am writing to discuss some difficult aspects of git diff's color output that I am observing with version 2.19.2. This is not a regression report; I am trying to implement a new feature and am stymied by these details.

My goal is to detect SGR color sequences, e.g. '\x1b[32m', that exist in the source text, and have my highlighter print escaped representations of those. For example, I have checked in files that are expected test outputs for tools that emit color codes, and diffs of those get very confusing.
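
To illustrate the escaping step, here is a minimal Python sketch. It is not the actual same-same code; the names `SGR_RE` and `escape_sgr` are hypothetical, and it assumes that an SGR sequence is ESC, `[`, optional digits and semicolons, then `m`.

```python
import re

# Assumed shape of an SGR sequence: ESC '[' params 'm', e.g. '\x1b[32m' or '\x1b[m'.
SGR_RE = re.compile(r'\x1b\[[0-9;]*m')


def escape_sgr(text: str) -> str:
    """Replace the ESC byte of every SGR sequence with the visible text '\\x1b'."""
    return SGR_RE.sub(lambda m: m.group(0).replace('\x1b', '\\x1b'), text)


if __name__ == '__main__':
    # The color code from the source text is printed literally instead of being interpreted.
    print(escape_sgr('status: \x1b[32mOK\x1b[m'))
```

This only works once the codes added by git have been separated from the codes in the source text, which is the hard part described below.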

Figuring out which color codes are from the source text and which were added by git is proving very difficult. The obvious solution is to turn git diff coloring off, but as far as I can see this also turns off all coloring for logs, which is undesirable.

Then I tried to remove just the color codes that git adds to the diff. This almost works, but there are some irregularities. Most lines begin with a style/color code and end with a reset code, which would be a perfect indicator that git is using colors. However:

* Context lines do not begin with an SGR code, but do end with a reset code. It would be preferable, in my opinion, if they had both (like every other line) or neither.

* Added lines have excess codes after the plus sign. The entire prefix is `\x1b[32m+\x1b[m\x1b[32m`, which translates to GREEN PLUS RESET GREEN. Emitting codes after the plus sign makes the parsing more complex and idiosyncratic.
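
The two cases above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `strip_git_codes` is hypothetical, the green prefix is the one observed above (colors are configurable, so a real implementation cannot hard-code it), and lines of other kinds, such as removed lines and hunk headers, are not handled.

```python
RESET = '\x1b[m'
# Prefix observed on added lines: GREEN PLUS RESET GREEN (assumes default colors).
ADDED_PREFIX = '\x1b[32m+\x1b[m\x1b[32m'


def strip_git_codes(line: str) -> str:
    """Strip the codes git adds to one added or context line (no trailing newline)."""
    # Both added and context lines end with a single reset emitted by git.
    body = line[:-len(RESET)] if line.endswith(RESET) else line
    if body.startswith(ADDED_PREFIX):
        # Added line: keep the plus sign, drop the three codes around it.
        return '+' + body[len(ADDED_PREFIX):]
    # Context line: no leading code to remove.
    return body
```

Any SGR sequence that remains after this step is then assumed to come from the source text. The special cases for the prefix and for context lines are exactly the irregularities that make this fragile.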


In summary, I would like to suggest the following improvements:

* Remove the excess codes after the plus sign.

* When git diff is adding colors, ensure that every line begins with an SGR code and ends with the RESET code.

* Add a config feature to turn on log coloring while leaving diff coloring off.


I would be willing to attempt a fix for this myself, but I'd like to hear what the maintainers think first, and would appreciate any hints as to where I should start looking in the code base.


If anyone is curious about the implementation, it is called `same-same` and lives here: https://github.com/gwk/pithy/blob/master/pithy/bin/same_same.py

I configure it like this in .gitconfig:

[core]
  pager = same-same | LESSANSIENDCHARS=mK less --RAW-CONTROL-CHARS
[interactive]
  diffFilter = same-same -interactive | LESSANSIENDCHARS=mK less --RAW-CONTROL-CHARS


Thank you,
George


