user/dev discussion of public-inbox itself
From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH 1/3] hval: ascii_html: drop CRLF => LF conversion
Date: Mon, 24 Feb 2020 07:33:26 +0000
Message-ID: <20200224073328.16230-2-e@yhbt.net>
In-Reply-To: <20200224073328.16230-1-e@yhbt.net>

Instead, we add CRLF conversion to the only remaining place
which needs it, ViewVCS.  This saves many redundant ops in
many places.

The only other place where this mattered was
View::add_text_body, but we already started doing CRLF
conversions when we added diff parsing and link generation for
ViewVCS.  All other callers used this for header viewing, and
Email::MIME doesn't preserve CRLF in headers.
---
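A minimal sketch of the idea behind the ViewVCS change, for readers who
want to try it outside the tree.  The helper name normalize_newlines is
hypothetical and not part of public-inbox; it only illustrates doing the
CRLF => LF conversion once at the call site and getting a line count
from the same buffer.  One assumption worth stating: s///g returns an
empty string (not 0) when nothing was substituted, so this sketch counts
with tr/// after converting.

```perl
use strict;
use warnings;

# Hypothetical helper (not in public-inbox): convert CRLF to LF in place
# and return the number of LF characters in the buffer.
sub normalize_newlines {
	my ($ref) = @_;
	$$ref =~ s/\r\n/\n/g;       # CRLF => LF, done once by the caller that needs it
	return ($$ref =~ tr/\n//);  # tr/// only counts here; yields 0 when there are none
}

my $blob = "a\r\nb\r\nc\n";
my $nl = normalize_newlines(\$blob);
my $pad = length($nl);              # digits needed for line-number padding
printf "%d lines, pad=%d, CR left: %s\n",
	$nl, $pad, ($blob =~ /\r/ ? 'yes' : 'no');
```

The patch itself gets both the conversion and the count from a single
s/\r?\n/\n/sg; the two-step form above is just a more explicit variant.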
 lib/PublicInbox/Hval.pm    |  1 -
 lib/PublicInbox/ViewVCS.pm |  2 +-
 t/plack.t                  | 27 +++++++++++++++++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 5f7ab513..79005d21 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -55,7 +55,6 @@ sub src_escape ($) {
 
 sub ascii_html {
 	my ($s) = @_;
-	$s =~ s/\r\n/\n/sg; # fixup bad line endings
 	$s =~ s/([<>&'"\x7f\x00-\x1f])/$xhtml_map{$1}/sge;
 	$enc_ascii->encode($s, Encode::HTMLCREF);
 }
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index 1379bd58..2f8e1c4f 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -164,7 +164,7 @@ sub solve_result {
 
 	# TODO: detect + convert to ensure validity
 	utf8::decode($$blob);
-	my $nl = ($$blob =~ tr/\n/\n/);
+	my $nl = ($$blob =~ s/\r?\n/\n/sg);
 	my $pad = length($nl);
 
 	$l->linkify_1($$blob);
diff --git a/t/plack.t b/t/plack.t
index fe767aed..ea318792 100644
--- a/t/plack.t
+++ b/t/plack.t
@@ -109,6 +109,23 @@ EOF
 	like($mime->body_raw, qr/hi =3D bye=/, 'our test used QP correctly');
 	$im->add($mime);
 
+	my $crlf = <<EOF;
+From: Me
+  <me\@example.com>
+To: $addr
+Message-Id: <crlf\@example.com>
+Subject: carriage
+  return
+  in
+  long
+  subject
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+
+:(
+EOF
+	$crlf =~ s/\n/\r\n/sg;
+	$im->add(Email::MIME->new($crlf));
+
 	$im->done;
 }
 
@@ -120,6 +137,16 @@ test_psgi($app, sub {
 	}
 });
 
+test_psgi($app, sub {
+	my ($cb) = @_;
+	my $res = $cb->(GET('http://example.com/test/crlf@example.com/'));
+	is($res->code, 200, 'retrieved CRLF as HTML');
+	unlike($res->content, qr/\r/, 'no CR in HTML');
+	$res = $cb->(GET('http://example.com/test/crlf@example.com/raw'));
+	is($res->code, 200, 'retrieved CRLF raw');
+	like($res->content, qr/\r/, 'CR preserved in raw message');
+});
+
 # redirect with newsgroup
 test_psgi($app, sub {
 	my ($cb) = @_;

Thread overview: 4+ messages
2020-02-24  7:33 [PATCH 0/3] avoid redundant CRLF handling Eric Wong
2020-02-24  7:33 ` Eric Wong [this message]
2020-02-24  7:33 ` [PATCH 2/3] viewdiff: remove optional CR handling Eric Wong
2020-02-24  7:33 ` [PATCH 3/3] examples/nginx_proxy: convert CRLF to LF Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git