path: root/lib/PublicInbox/ViewDiff.pm
Date Commit message
2020-01-27 viewdiff: rewrite and simplify
Instead of going line-by-line, use split() with a giant regexp to capture groups of contiguous lines. This offloads state management to the regexp itself and makes it FAR easier to keep track of <span> and </span> pairings. Performance seems roughly on par after this change for the meta@public-inbox archives. It seems a tiny bit faster for git@vger with xt/perf-msgview.t, likely because the longer messages there have larger contiguous groups of lines with the same prefix (or no prefix at all), which drastically reduces the number of subroutine calls and Perl ops executed.
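The split()-based grouping described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the regexp or code from ViewDiff.pm: the pattern, the `group_lines` name, and the four kind labels are all hypothetical, and file header lines such as "--- a/..." are deliberately not handled.

```python
import re

# Hypothetical pattern: each alternative captures one run of contiguous
# lines sharing a diff prefix. Not the regexp used in ViewDiff.pm.
GROUPS = re.compile(
    r'((?:^\+[^\n]*\n)+'     # run of added lines
    r'|(?:^-[^\n]*\n)+'      # run of removed lines
    r'|(?:^@@ [^\n]*\n))',   # a single hunk header line
    re.M)

def group_lines(text):
    """Split text into (kind, chunk) pairs, one pair per contiguous run."""
    out = []
    # With a capturing group, re.split() returns the matched runs as well
    # as the unmatched text between them, in document order.
    for chunk in GROUPS.split(text):
        if not chunk:
            continue  # empty string between two adjacent matches
        if chunk.startswith('@@ '):
            kind = 'hunk'
        elif chunk.startswith('+'):
            kind = 'add'
        elif chunk.startswith('-'):
            kind = 'del'
        else:
            kind = 'ctx'  # context lines or non-diff text
        out.append((kind, chunk))
    return out
```

Each run arrives as one chunk, so a caller can wrap it in a single opening and closing tag instead of tracking state per line.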
2020-01-27 viewdiff: use autovivification for long_path hash
No sense in wasting code to do something the interpreter already does for us.
2020-01-27 viewdiff: add "b=" param when missing "diff --git" line
<2841d2de-32ad-eae8-6039-9251a40bb00e@tngtech.com> as posted to git@vger contained an otherwise valid diff without a "diff --git" line. Generate a "b=" parameter in that case using the "+++" line instead of the "diff --git" line. SearchIdx.pm no longer uses the "diff --git" line for filename information, either.
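A rough Python sketch of deriving a "b=" parameter from the "+++" line alone. The `b_param` helper and its regexp are hypothetical and assume the post-image path carries a one-component prefix ("b/", or a non-standard one such as "w/") and contains no whitespace; the real parser may differ.

```python
import re
from urllib.parse import quote

# Assumption: "+++ <prefix>/<path>", where <prefix> is a single path
# component. "+++ /dev/null" does not match, since it has no prefix.
PLUS_LINE = re.compile(r'^\+\+\+ [^/\s]+/(\S+)', re.M)

def b_param(diff_text):
    """Return 'b=<path>' for the first usable '+++' line, else None."""
    m = PLUS_LINE.search(diff_text)
    if m is None:
        return None
    # Keep '/' readable in the URL; percent-encode everything unsafe.
    return 'b=' + quote(m.group(1), safe='/')
```

Because the prefix is matched as "any single component", the same helper also covers prefixes other than "a" and "b".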
2020-01-27 viewdiff: add "b=" param with non-standard diff prefix
<20180228012207.GB251290@aiede.svl.corp.google.com> (posted to git@vger) uses "i" and "w" prefixes instead of the standard "a" and "b" prefixes. Ensure we emit a "b=$FILENAME" param for the solver endpoint to improve search accuracy, syntax highlighting, and information density in the URL itself.
2020-01-27 linkify: move to_html over from ViewDiff
We use the same idiom in many places for doing two-step linkification and HTML escaping. Get rid of an outdated comment in flush_quote while we're at it.
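The two-step idiom can be illustrated in Python along these lines. This is a sketch of the general technique, not the Linkify.pm code: the URL pattern is simplistic, the NUL-delimited placeholder is an assumption, and the input is assumed to contain no NUL bytes.

```python
import html
import re

URL = re.compile(r'https?://[^\s<>"]+')  # simplistic, for illustration only

def to_html(text):
    """Linkify URLs and HTML-escape everything else, in two steps."""
    links = []

    def stash(m):
        # Step 1: swap each URL for a placeholder that escaping won't touch.
        links.append(m.group(0))
        return '\x00%d\x00' % (len(links) - 1)

    escaped = html.escape(URL.sub(stash, text), quote=False)

    def restore(m):
        # Step 2: put the URL back as an anchor, escaped for attribute use.
        url = html.escape(links[int(m.group(1))], quote=True)
        return '<a href="%s">%s</a>' % (url, url)

    return re.sub(r'\x00([0-9]+)\x00', restore, escaped)
```

Escaping between the two steps keeps the generated `<a>` tags from being escaped themselves, while the surrounding text is still made safe.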
2020-01-06 treewide: "require" + "use" cleanup and docs
There's a bunch of leftover "require" and "use" statements we no longer need and can get rid of, along with some excessive imports via "use". IO::Handle usage isn't always obvious, so add comments describing why a package loads it. Along the same lines, document the tmpdir support as the reason we depend on File::Temp 0.19, even though every Perl 5.10.1+ user has it. While we're at it, favor "use" over "require", since it gives us extra compile-time checking.
2020-01-04 viewdiff: do not anchor spaces after filenames in diffstat
Viewing a CSS-less page in a browser which underlines links can show a long line of underscores after diffstats. Not all browsers underline links by default, though.
2019-07-05 viewdiff: do not anchor using diffstat comments
Diffstat summary comments were added to git last year and we need to filter them out to get anchors working properly. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> https://public-inbox.org/meta/20190704231123.GF20404@szeder.dev/
2019-06-04 solver|viewdiff: restrict digit matches to ASCII
git would not generate non-ASCII digits to describe hunk offsets, so don't waste more time than necessary to make sense of non-ASCII digit chars for line offsets.
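The same distinction exists in Python, which makes for a compact illustration: on str patterns `\d` also matches non-ASCII decimal digits, while `[0-9]` does not. The two patterns below are hypothetical and only demonstrate the difference; they are not taken from the solver or ViewDiff code.

```python
import re

# On str patterns, \d matches any Unicode decimal digit.
HUNK_UNICODE = re.compile(r'^@@ -(\d+)')
# An explicit class restricts the match to ASCII 0-9.
HUNK_ASCII = re.compile(r'^@@ -([0-9]+)')

def old_start(line):
    """Return the pre-image start line of a hunk header, or None."""
    m = HUNK_ASCII.match(line)
    return int(m.group(1)) if m else None
```

Passing `re.ASCII` when compiling the first pattern would have the same effect as spelling out `[0-9]`.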
2019-05-31 viewdiff: avoid repeat variable expansion
This is worth a 1-2% speedup in t/perf-msgview.t rendering 2620 messages currently in https://public-inbox.org/meta/
2019-05-16 Revert "view: perform highlighting for space-prefixed diffs"
This was buggy and was causing non-diff text to have extra leading spaces. The diff parsing code needs to be cleaned up, so this will be fixed, later. This reverts commit 1a67b91c1326efa372d1ec957e2494849d894f0b.
2019-05-16 view: perform highlighting for space-prefixed diffs
"git format-patch --interdiff" and similar can prefix diffs with leading white space. Teach our diff parser to account for it and set appropriate CSS classes for them.
2019-04-26 viewdiff: do not break out of DSTATE_CTX on /^$/
It seems a common case for mangled patches is editors or MUAs dropping trailing whitespace, so lines matching /^ $/ get the space dropped and only match /^$/.
2019-04-15 viewdiff: document constants
We'll be building off of this for showing diffs in the coderepo views.
2019-02-04 viewdiff: group path match to not be confused by "/dev/null"
Leaving out parentheses caused transitions to state="del" or state="add" to be misidentified. cf. https://public-inbox.org/meta/20190204105454.GG10587@szeder.dev/ Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
2019-02-01 viewdiff: support renames and long paths in diffstat anchors
This is best-effort, but works well-enough in practice for projects which use shell-friendly filenames as well as the long path names for some Linux kernel selftests.
2019-02-01 viewdiff: escape HTML ampersand for renames
For URLs we generate, we need to escape '&' in query parameters for correctness.
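A small Python sketch of why the escape matters when a link carries two query parameters, as a rename would. The `solver_href` helper, the "a=" parameter name, and the URL layout are assumptions made for illustration and are not taken from the public-inbox code.

```python
import html
from urllib.parse import urlencode

def solver_href(oid, a, b):
    """Build an href attribute value with '&' escaped for (X)HTML."""
    # urlencode() joins parameters with a bare '&', which is correct in
    # a URL but must become '&amp;' once embedded in an HTML attribute.
    query = urlencode({'a': a, 'b': b})
    return html.escape('%s/s/?%s' % (oid, query), quote=True)
```

For example, `solver_href('96f1da2', 'old.c', 'new.c')` (with a made-up object ID) yields `96f1da2/s/?a=old.c&amp;b=new.c`.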
2019-02-01 view: diffstat anchors for multi-message/attachment views
diffstat <-> ^diff anchors work within the same attachment or message while in HTML views which display multiple messages.
2019-02-01 viewdiff: diffstat links to diff anchors
This can be helpful for reviewing larger patches which span across several files on the permalink (/$MESSAGE_ID/) HTML page. More work will be needed to get this working for the /T/ and /t/ pages which show multiple emails, as the filename-based anchors will conflict at the moment.
2019-01-20 viewdiff: do not link to 0{7,40} blobs (again)
We must reset diff context when starting a new file, and we must correctly check for all-zeroes object_ids as the post-image.
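The all-zeroes check can be sketched in Python as below. Both helpers and the "index" line pattern are hypothetical and ignore details such as per-file context resets; they only show the check being applied to both sides of the line.

```python
import re

# Assumption: abbreviated or full hex object IDs, 7 to 40 characters.
INDEX = re.compile(r'^index ([0-9a-f]{7,40})\.\.([0-9a-f]{7,40})')

def is_null_oid(oid):
    """True for an all-zeroes object ID of 7 to 40 characters."""
    return re.fullmatch(r'0{7,40}', oid) is not None

def linkable_oids(index_line):
    """Return the object IDs on an 'index' line that are worth linking."""
    m = INDEX.match(index_line)
    if m is None:
        return []
    # Check the pre-image and the post-image alike: a deleted file has an
    # all-zeroes post-image, a newly added file an all-zeroes pre-image.
    return [oid for oid in m.groups() if not is_null_oid(oid)]
```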
2019-01-20 viewdiff: quote attributes for Atom feed
We still need to use XHTML in the Atom feed, and XHTML requires attributes to be quoted, whereas HTML 5 does not.
2019-01-20 viewdiff: cleanup state transitions a bit
This makes things less error-prone and allows us to only highlight the "@@ -\S+ \+\S+ @@" part of the hunk header line, without highlighting the function context. This more closely matches the coloring behavior of git-diff(1).
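Highlighting only the "@@ ... @@" part might look like this in Python. The `render_hunk_header` name and the "hunk" class are hypothetical, and the line is assumed to arrive without its trailing newline.

```python
import html
import re

# Group 1 is the range part to highlight, group 2 the function context.
HUNK = re.compile(r'^(@@ -\S+ \+\S+ @@)(.*)$')

def render_hunk_header(line):
    """Wrap only the range part of a hunk header in a <span>."""
    m = HUNK.match(line)
    if m is None:
        return html.escape(line, quote=False)
    head = html.escape(m.group(1), quote=False)
    rest = html.escape(m.group(2), quote=False)
    # The attribute is quoted so the output also works as XHTML.
    return '<span class="hunk">%s</span>%s' % (head, rest)
```

The function context after the second "@@" is escaped but left outside the span, so it renders like ordinary text.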
2019-01-20 viewdiff: support diff-highlighting w/o coderepo
Having diff highlighting alone is still useful, even if blob-resolution/recreation is too expensive or unfeasible.
2019-01-20 view: enforce trailing slash for /$INBOX/$OID/s/ endpoints
As with our use of the trailing slash in the '$MESSAGE_ID/T/' and '$MESSAGE_ID/t/' endpoints, this is for 'wget -r --mirror' compatibility, and it allows sysadmins to quickly stand up a static directory with "index.html" in it to reduce load.
2019-01-20 view: enable naming hints for raw blob downloads
Meaningful names in URLs are nice, and they can make life easier for supporting syntax-highlighting.
2019-01-19 view: wire up diff and vcs viewers with solver