From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 942DF2022C;
	Thu, 18 Aug 2016 01:39:41 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Cc: Thomas Ferris Nicolaisen, Johannes Schindelin
Subject: [PATCH 3/3] view: try assuming UTF-8 for bogus charsets
Date: Thu, 18 Aug 2016 01:39:41 +0000
Message-Id: <20160818013941.8673-4-e@80x24.org>
In-Reply-To: <20160818013941.8673-1-e@80x24.org>
References: <20160818013941.8673-1-e@80x24.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
List-Id:

For some reason, Alpine will set X-UNKNOWN for valid UTF-8.
Since we favor UTF-8 HTML anyways, try forcing Email::MIME
to handle text/plain as UTF-8 which might show up better.

At least this change renders properly by showing
"•" (•) instead of "â ¢" (•)

Reported-by: Thomas Ferris Nicolaisen
---
 lib/PublicInbox/View.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 3f0e122..6997c1c 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -457,8 +457,14 @@ sub add_text_body {
 	my $err = $@;
 	if ($err) {
 		if ($ct =~ m!\btext/plain\b!i) {
+			# Try to assume UTF-8 because Alpine seems to
+			# do wacky things and set charset=X-UNKNOWN
+			$part->charset_set('UTF-8');
+			$s = eval { $part->body_str };
+
+			# If forcing charset=UTF-8 failed,
 			# attach_link will warn further down...
-			$s = $part->body;
+			$s = $part->body if $@;
 		} else {
 			return attach_link($upfx, $ct, $p, $fn);
 		}
-- 
EW
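
The order of attempts in the patch above (declared charset first, then
UTF-8, then raw octets) can be sketched without Email::MIME, using only
the core Encode module. This is a minimal illustration, not the code in
View.pm: decode_text_body is a hypothetical helper name, and it assumes
the body has already been transfer-decoded into a plain octet string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of public-inbox): returns the decoded
# text and the charset that worked, or the raw octets and undef.
sub decode_text_body {
	my ($octets, $charset) = @_;

	# FB_CROAK: die on malformed input instead of substituting.
	# LEAVE_SRC: do not modify $octets, we may need it again.
	my $check = Encode::FB_CROAK | Encode::LEAVE_SRC;

	# 1. Trust the declared charset.  Unknown names such as
	#    X-UNKNOWN make decode() croak, so eval yields undef.
	my $s = eval { decode($charset, $octets, $check) };
	return ($s, $charset) if defined $s;

	# 2. Assume UTF-8, as the patch does for text/plain.
	$s = eval { decode('UTF-8', $octets, $check) };
	return ($s, 'UTF-8') if defined $s;

	# 3. Both attempts failed: hand back the octets untouched.
	return ($octets, undef);
}

# The octets E2 80 A2 are the UTF-8 encoding of U+2022 (bullet).
my ($text, $used) = decode_text_body("\xE2\x80\xA2", 'X-UNKNOWN');
printf "U+%04X via %s\n", ord($text), $used // 'raw octets';
```

Strict checking matters for the second attempt: without FB_CROAK,
decode() would substitute replacement characters for invalid UTF-8
rather than fail, and the raw-octet fallback would never be reached.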