git@vger.kernel.org mailing list mirror (one of many)
* Difficulty with parsing colorized diff output
@ 2018-12-08  0:09 George King
  2018-12-08  7:16 ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: George King @ 2018-12-08  0:09 UTC (permalink / raw)
  To: Git Mailing List

Hello, I have a rather elaborate diff highlighter that I have implemented as a post-processor to regular git output. I am writing to discuss some difficult aspects of git diff's color output that I am observing with version 2.19.2. This is not a regression report; I am trying to implement a new feature and am stymied by these details.

My goal is to detect SGR color sequences, e.g. '\x1b[32m', that exist in the source text, and have my highlighter print escaped representations of those. For example, I have checked in files that are expected test outputs for tools that emit color codes, and diffs of those get very confusing.
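
To make the goal concrete, here is a minimal sketch of the escaping step on its own, ignoring for now the question of which codes come from git. The names `SGR_RE` and `escape_sgr` are made up for illustration and are not taken from same-same.

```python
import re

# An SGR sequence is ESC, '[', optional digits/semicolons, then 'm'.
SGR_RE = re.compile(r'\x1b\[[0-9;]*m')

def escape_sgr(text: str) -> str:
    """Replace each SGR sequence with a visible, escaped representation."""
    # Only the ESC byte is rewritten, so '\x1b[32m' becomes the literal
    # characters backslash, 'x', '1', 'b', '[', '3', '2', 'm'.
    return SGR_RE.sub(lambda m: m.group(0).replace('\x1b', '\\x1b'), text)

print(escape_sgr('plain \x1b[32mgreen\x1b[m plain'))
```

Applied to raw `git diff --color` output this would of course escape git's own codes as well, which is the problem described next.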

Figuring out which color codes are from the source text and which were added by git is proving very difficult. The obvious solution is to turn git diff coloring off, but as far as I can see this also turns off all coloring for logs, which is undesirable.

Then I tried to remove just the color codes that git adds to the diff. This almost works, but there are some irregularities. Most lines begin with a style/color code and end with a reset code, which would be a perfect indicator that git is using colors. However:

* Context lines do not begin with a reset code, but do end with one. In my opinion it would be preferable if they had both (like every other line), or neither.

* Added lines have excess codes after the plus sign. The entire prefix is `\x1b[32m+\x1b[m\x1b[32m`, which translates to GREEN PLUS RESET GREEN. Emitting codes after the plus sign makes the parsing more complex and idiosyncratic.
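
As a rough sketch of what a post-processor currently has to do, the following handles only the two cases described above, and assumes the default colors and exactly the byte sequences I quoted. The function name `strip_git_color` is hypothetical, and other line kinds (removed lines, hunk headers, and so on) are deliberately left untouched.

```python
RESET = '\x1b[m'
GREEN = '\x1b[32m'
# The prefix observed on added lines: GREEN PLUS RESET GREEN.
ADDED_PREFIX = GREEN + '+' + RESET + GREEN

def strip_git_color(line: str) -> str:
    """Remove the codes git adds to an added or context line.

    Any SGR codes that are part of the file content are preserved.
    `line` is expected without its trailing newline.
    """
    if line.startswith(ADDED_PREFIX):
        marker, body = '+', line[len(ADDED_PREFIX):]
    elif line.startswith(' '):
        # Context lines have no leading code, only the trailing reset.
        marker, body = ' ', line[1:]
    else:
        return line  # not covered by this sketch
    if body.endswith(RESET):
        body = body[:-len(RESET)]  # drop only the final reset, assumed to be git's
    return marker + body
```

Matching the exact prefix, rather than stripping every leading code, is what keeps a color code at the very start of the file content from being mistaken for one of git's.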


In summary, I would like to suggest the following improvements:

* Remove the excess codes after the plus sign.

* When git diff is adding colors, ensure that every line begins with an SGR code and ends with the RESET code.

* Add a config feature to turn on log coloring while leaving diff coloring off.
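
To illustrate the second suggestion, this is a sketch of the check a parser could rely on if every colored line began with an SGR code and ended with RESET. It describes the proposed format, not what git 2.19.2 emits, and `follows_proposed_format` is a made-up name.

```python
import re

# Proposed invariant: a leading SGR code, any content, then a final RESET.
PROPOSED_LINE_RE = re.compile(r'^\x1b\[[0-9;]*m.*\x1b\[m$', re.DOTALL)

def follows_proposed_format(line: str) -> bool:
    """True if `line` (without its newline) starts with an SGR code and ends with RESET."""
    return PROPOSED_LINE_RE.match(line) is not None

print(follows_proposed_format('\x1b[32m+added\x1b[m'))  # leading code and final reset
print(follows_proposed_format(' context\x1b[m'))        # no leading code
```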


I would be willing to attempt a fix for this myself, but I'd like to hear what the maintainers think first, and would appreciate any hints as to where I should start looking in the code base.


If anyone is curious about the implementation, it is called `same-same` and lives here: https://github.com/gwk/pithy/blob/master/pithy/bin/same_same.py

I configure it like this in .gitconfig:

[core]
  pager = same-same | LESSANSIENDCHARS=mK less --RAW-CONTROL-CHARS
[interactive]
  diffFilter = same-same -interactive | LESSANSIENDCHARS=mK less --RAW-CONTROL-CHARS


Thank you,
George


