From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id B7FDB1F61A
	for ; Mon, 24 Feb 2020 07:33:28 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 1/3] hval: ascii_html: drop CRLF => LF conversion
Date: Mon, 24 Feb 2020 07:33:26 +0000
Message-Id: <20200224073328.16230-2-e@yhbt.net>
In-Reply-To: <20200224073328.16230-1-e@yhbt.net>
References: <20200224073328.16230-1-e@yhbt.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Instead, we add CRLF conversion to the only remaining place which
needs it, ViewVCS. This saves many redundant ops in many places.

The only other place where this mattered was in
View::add_text_body, but we already started doing CRLF
conversions when we added diff parsing and link generation
for ViewVCS.

Otherwise, all other places we used this were for header
viewing, and Email::MIME doesn't preserve CRLF in headers.
---
 lib/PublicInbox/Hval.pm    |  1 -
 lib/PublicInbox/ViewVCS.pm |  2 +-
 t/plack.t                  | 27 +++++++++++++++++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 5f7ab513..79005d21 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -55,7 +55,6 @@ sub src_escape ($) {

 sub ascii_html {
 	my ($s) = @_;
-	$s =~ s/\r\n/\n/sg; # fixup bad line endings
 	$s =~ s/([<>&'"\x7f\x00-\x1f])/$xhtml_map{$1}/sge;
 	$enc_ascii->encode($s, Encode::HTMLCREF);
 }
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index 1379bd58..2f8e1c4f 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -164,7 +164,7 @@ sub solve_result {

 	# TODO: detect + convert to ensure validity
 	utf8::decode($$blob);
-	my $nl = ($$blob =~ tr/\n/\n/);
+	my $nl = ($$blob =~ s/\r?\n/\n/sg);
 	my $pad = length($nl);

 	$l->linkify_1($$blob);
diff --git a/t/plack.t b/t/plack.t
index fe767aed..ea318792 100644
--- a/t/plack.t
+++ b/t/plack.t
@@ -109,6 +109,23 @@ EOF
 	like($mime->body_raw, qr/hi =3D bye=/,
 		'our test used QP correctly');
 	$im->add($mime);
+	my $crlf = <
+To: $addr
+Message-Id:
+Subject: carriage
+	return
+	in
+	long
+	subject
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+
+:(
+EOF
+	$crlf =~ s/\n/\r\n/sg;
+	$im->add(Email::MIME->new($crlf));
+
 	$im->done;
 }

@@ -120,6 +137,16 @@ test_psgi($app, sub {
 	}
 });

+test_psgi($app, sub {
+	my ($cb) = @_;
+	my $res = $cb->(GET('http://example.com/test/crlf@example.com/'));
+	is($res->code, 200, 'retrieved CRLF as HTML');
+	unlike($res->content, qr/\r/, 'no CR in HTML');
+	$res = $cb->(GET('http://example.com/test/crlf@example.com/raw'));
+	is($res->code, 200, 'retrieved CRLF raw');
+	like($res->content, qr/\r/, 'CR preserved in raw message');
+});
+
 # redirect with newsgroup
 test_psgi($app, sub {
 	my ($cb) = @_;
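
The ViewVCS hunk above relies on s///g returning the number of
substitutions it made, so one pass can both normalize CRLF to LF and
produce the line count that tr/\n/\n/ used to supply. A minimal
standalone sketch of that behavior follows; the sample string and the
variable names other than $nl are made up for illustration and are not
part of the patch:

```perl
use strict;
use warnings;

# Hypothetical blob with mixed CRLF and LF endings (not from the patch).
my $blob = "one\r\ntwo\nthree\r\n";

# Old approach: tr/// only counts newlines and leaves the string alone.
my $counted = ($blob =~ tr/\n/\n/);

# New approach: s///g rewrites every CRLF (or bare LF) to LF and
# returns how many substitutions were made, i.e. the newline count.
my $nl = ($blob =~ s/\r?\n/\n/sg);

# tr/\r// with an empty replacement list counts CRs without changing them.
my $cr_left = ($blob =~ tr/\r//);

print "tr count: $counted\n";
print "s count:  $nl\n";
print "CR left:  $cr_left\n";
```

One difference that may matter: when nothing matches, s///g returns an
empty string while tr returns 0, so length($nl) would be 0 instead of 1
for a blob that contains no newline at all.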