* [PATCH 0/2] viewdiff: linkification fixes
@ 2020-05-06 10:40 Eric Wong
2020-05-06 10:40 ` [PATCH 1/2] viewdiff: assume diffstat and diff order are identical Eric Wong
2020-05-06 10:40 ` [PATCH 2/2] viewdiff: stricter highlighting and linkification check Eric Wong
0 siblings, 2 replies; 3+ messages in thread
From: Eric Wong @ 2020-05-06 10:40 UTC
To: meta
Diffstat linkification of long file names no longer depends on
hash order. I made this change after noticing some HTML rendering
differences between PublicInbox::MIME and PublicInbox::Eml (its
non-Email::MIME replacement).
I also noticed some wasted work on patch series cover letters
which include diffstats, as well as over-linkification of
tables in cover letters which contain no other
diff features.
Eric Wong (2):
viewdiff: assume diffstat and diff order are identical
viewdiff: stricter highlighting and linkification check
lib/PublicInbox/View.pm | 7 +++++--
lib/PublicInbox/ViewDiff.pm | 27 ++++++++++++---------------
2 files changed, 17 insertions(+), 17 deletions(-)
* [PATCH 1/2] viewdiff: assume diffstat and diff order are identical
From: Eric Wong @ 2020-05-06 10:40 UTC
To: meta
For non-malicious messages, we can assume the diffstat and actual
diff appear in the same order. Thus we can store {-long_paths} as
an arrayref and only compare the first element when we encounter
a truncated path.
This should make HTML rendering stable when there are basename
conflicts in a message such as
https://lore.kernel.org/backports/1393202754-12919-13-git-send-email-hauke@hauke-m.de/
This diffstat anchor linkification can still be defeated by
users who make actual path names beginning with "...", but we
won't waste CPU cycles on it, either.
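
The in-order matching described above can be sketched roughly as
follows. This is only an illustration, not the code in the patch:
match_long_path() and the sample paths are hypothetical names, and
the queue is a plain array instead of $ctx->{-long_path}.

```perl
use strict;
use warnings;

# Hypothetical helper: $queue holds truncated diffstat paths (leading
# ".../" already stripped) in the order they appeared in the diffstat.
# Only the first queued entry is ever compared, and it is consumed
# whether or not it matches, mirroring the shift() in the patch.
sub match_long_path {
	my ($queue, $full_path) = @_;
	my $fn = shift(@$queue);
	return unless defined $fn;	# nothing queued
	$full_path =~ /\Q$fn\E\z/s ? $fn : undef;
}

my @queue = ('drivers/a/main.c', 'drivers/b/main.c');
for my $path ('backport/drivers/a/main.c', 'backport/drivers/b/main.c') {
	my $fn = match_long_path(\@queue, $path);
	print defined($fn) ? "matched $fn\n" : "no match for $path\n";
}
```

Since each lookup consumes exactly one queued entry, the result does
not depend on hash iteration order, and a message whose diff order
differs from its diffstat simply gets no anchor for that file.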
---
lib/PublicInbox/ViewDiff.pm | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/lib/PublicInbox/ViewDiff.pm b/lib/PublicInbox/ViewDiff.pm
index 3d6058a9..34df8ad4 100644
--- a/lib/PublicInbox/ViewDiff.pm
+++ b/lib/PublicInbox/ViewDiff.pm
@@ -82,10 +82,8 @@ sub anchor0 ($$$$) {
$fn =~ s/{(?:.+) => (.+)}/$1/ or $fn =~ s/.* => (.+)/$1/;
$fn = git_unquote($fn);
- # long filenames will require us to walk backwards in anchor1
- if ($fn =~ s!\A\.\.\./?!!) {
- $ctx->{-long_path}->{$fn} = qr/\Q$fn\E\z/s;
- }
+ # long filenames will require us to check in anchor1()
+ push(@{$ctx->{-long_path}}, $fn) if $fn =~ s!\A\.\.\./?!!;
if (my $attr = to_attr($ctx->{-apfx}.$fn)) {
$ctx->{-anchors}->{$attr} = 1;
@@ -105,17 +103,14 @@ sub anchor1 ($$) {
my $ok = delete $ctx->{-anchors}->{$attr};
- # unlikely, check the end of all long path names we captured:
+ # unlikely, check the end of long path names we captured,
+ # assume diffstat and diff output follow the same order,
+ # and ignore different ordering (could be malicious input)
unless ($ok) {
- my $lp = $ctx->{-long_path} or return;
- foreach my $fn (keys %$lp) {
- $pb =~ $lp->{$fn} or next;
-
- delete $lp->{$fn};
- $attr = to_attr($ctx->{-apfx}.$fn) or return;
- $ok = delete $ctx->{-anchors}->{$attr} or return;
- last;
- }
+ my $fn = shift(@{$ctx->{-long_path}}) or return;
+ $pb =~ /\Q$fn\E\z/s or return;
+ $attr = to_attr($ctx->{-apfx}.$fn) or return;
+ $ok = delete $ctx->{-anchors}->{$attr} or return;
}
$ok ? "<a\nhref=#i$attr\nid=$attr>diff</a> --git" : undef
}
* [PATCH 2/2] viewdiff: stricter highlighting and linkification check
From: Eric Wong @ 2020-05-06 10:40 UTC
To: meta
Sometimes senders draw ASCII tables and the like, which fool
us into attempting diff highlighting and diffstat
anchoring.
We now require 3 consecutive diff header lines:
/^--- /, /^\Q+++\E /, and /^@@ /
to enable diff highlighting (whether generated with git or not).
The presence of a line matching /^diff / is not sufficient or
even useful to us for highlighting diffs, since that could just
be part of a line-wrapped sentence.
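
As a rough illustration of the three-line requirement, the check
could be exercised on its own like this. looks_like_diff() is a
hypothetical wrapper name used only for this sketch; the "@" signs
are backslash-escaped here purely to rule out any array
interpolation in the pattern.

```perl
use strict;
use warnings;

# Hypothetical wrapper: true only when "--- ", "+++ " and "@@ " lines
# appear consecutively somewhere in the text.
sub looks_like_diff {
	my ($s) = @_;
	$s =~ /^--- [^\n]+\n\+{3} [^\n]+\n\@\@ /ms ? 1 : 0;
}

my $patch = "--- a/foo\n+++ b/foo\n\@\@ -1 +1 \@\@\n-a\n+b\n";
my $prose = "I looked at the\ndiff between the two trees\n--- \n";
print looks_like_diff($patch), "\n";
print looks_like_diff($prose), "\n";
```

A lone "diff " or "--- " line, as in the second sample, no longer
enables highlighting.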
However, we'll now check for the presence of a line matching
/^diff --git / before enabling diffstat anchors. Otherwise
cover letters for a patch series may fool us into creating
anchors for diffstats.
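
The gating can be pictured as a two-bit state: one bit for "the
message contains a diff --git line" and one for "a --- separator
was already seen". The sketch below is a simplification under
stated assumptions: count_diffstats() is a hypothetical name, and
it tests for "diff --git" directly instead of checking whether
$ctx->{-anchors} exists as the patch does.

```perl
use strict;
use warnings;

# Hypothetical sketch: count the chunks which would be treated as
# diffstats worth anchoring.
sub count_diffstats {
	my ($body) = @_;
	# bit 1: a "diff --git" line exists somewhere in the body
	my $anchors = $body =~ /^diff --git /sm ? 1 : 0;
	my $found = 0;
	for my $y (split(/(^---\n)/sm, $body)) {
		if ($y =~ /\A---\n\z/s) {
			$anchors |= 2; # bit 2: "---" separator seen
		} elsif ($anchors == 3 &&
				$y =~ /^ [0-9]+ files? changed, /sm) {
			$found++;
		}
	}
	$found;
}

my $stat = " lib/Foo.pm | 2 +-\n 1 file changed, 1 insertion(+), 1 deletion(-)\n";
my $cover = "Eric Wong (2):\n  first\n  second\n\n$stat";
my $patch = "message\n---\n$stat\ndiff --git a/lib/Foo.pm b/lib/Foo.pm\n" .
	"--- a/lib/Foo.pm\n+++ b/lib/Foo.pm\n\@\@ -1 +1 \@\@\n-a\n+b\n";
print count_diffstats($cover), "\n";
print count_diffstats($patch), "\n";
```

Only the second sample has both bits set by the time its diffstat is
reached, so only it would get diffstat anchors.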
---
lib/PublicInbox/View.pm | 7 +++++--
lib/PublicInbox/ViewDiff.pm | 4 +++-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 5144a130..f7a8ae32 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -536,11 +536,14 @@ sub add_text_body { # callback for msg_iter
# always support diff-highlighting, but we can't linkify hunk
# headers for solver unless some coderepo are configured:
my $diff;
- if ($s =~ /^(?:diff|---|\+{3}) /ms) {
+ if ($s =~ /^--- [^\n]+\n\+{3} [^\n]+\n@@ /ms) {
# diffstat anchors do not link across attachments or messages:
$idx[0] = $upfx . $idx[0] if $upfx ne '';
$ctx->{-apfx} = join('/', @idx);
- $ctx->{-anchors} = {}; # attr => filename
+
+ # do attr => filename mappings for diffstats in git diffs:
+ $ctx->{-anchors} = {} if $s =~ /^diff --git /sm;
+
$diff = 1;
delete $ctx->{-long_path};
my $spfx;
diff --git a/lib/PublicInbox/ViewDiff.pm b/lib/PublicInbox/ViewDiff.pm
index 34df8ad4..6fe9a0d7 100644
--- a/lib/PublicInbox/ViewDiff.pm
+++ b/lib/PublicInbox/ViewDiff.pm
@@ -165,10 +165,12 @@ sub diff_before_or_after ($$) {
my ($ctx, $x) = @_;
my $linkify = $ctx->{-linkify};
my $dst = $ctx->{obuf};
+ my $anchors = exists($ctx->{-anchors}) ? 1 : 0;
for my $y (split(/(^---\n)/sm, $$x)) {
if ($y =~ /\A---\n\z/s) {
$$dst .= "---\n"; # all HTML is "\r\n" => "\n"
- } elsif ($y =~ /^ [0-9]+ files? changed, /sm) {
+ $anchors |= 2;
+ } elsif ($anchors == 3 && $y =~ /^ [0-9]+ files? changed, /sm) {
# ok, looks like a diffstat, go line-by-line:
for my $l (split(/^/m, $y)) {
if ($l =~ /^ (.+)( +\| .*\z)/s) {
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git