From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C225C1F45E
	for ; Fri, 14 Feb 2020 07:05:22 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] t/msg_iter: test for X-UNKNOWN charset from Alpine
Date: Fri, 14 Feb 2020 07:05:22 +0000
Message-Id: <20200214070522.25535-1-e@yhbt.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

A long overdue test for behavior established in 2016.

Fixes: 1b28cc7f00a866cb ("view: try assuming UTF-8 for bogus charsets")
---
 MANIFEST               |  1 +
 t/msg_iter.t           | 20 ++++++++++++++++++++
 t/x-unknown-alpine.eml | 21 +++++++++++++++++++++
 3 files changed, 42 insertions(+)
 create mode 100644 t/x-unknown-alpine.eml

diff --git a/MANIFEST b/MANIFEST
index 5acd8531..48df274e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -299,6 +299,7 @@ t/watch_maildir.t
 t/watch_maildir_v2.t
 t/www_listing.t
 t/www_static.t
+t/x-unknown-alpine.eml
 t/xcpdb-reshard.t
 xt/git-http-backend.t
 xt/git_async_cmp.t
diff --git a/t/msg_iter.t b/t/msg_iter.t
index de9c39fa..e33bfc69 100644
--- a/t/msg_iter.t
+++ b/t/msg_iter.t
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use Test::More;
 use Email::MIME;
+use PublicInbox::Hval qw(ascii_html);
 use_ok('PublicInbox::MsgIter');
 
 {
@@ -58,5 +59,24 @@ use_ok('PublicInbox::MsgIter');
 	is(index($raw, '$$$'), -1, 'no unescaped $$$');
 }
 
+{
+	my $f = 't/x-unknown-alpine.eml';
+	my $mime = Email::MIME->new(do {
+		open my $fh, '<', $f or die "open($f): $!";
+		local $/;
+		binmode $fh;
+		<$fh>;
+	});
+	my $raw = '';
+	msg_iter($mime, sub {
+		my ($part, $level, @ex) = @{$_[0]};
+		my ($s, $err) = msg_part_text($part, 'text/plain');
+		$raw .= $s;
+	});
+	like($raw, qr!^\thttps://!ms, 'tab expanded with X-UNKNOWN');
+	like(ascii_html($raw), qr/&#8226; bullet point/s,
+		'got bullet point when X-UNKNOWN assumes UTF-8');
+}
+
 done_testing();
 1;
diff --git a/t/x-unknown-alpine.eml b/t/x-unknown-alpine.eml
new file mode 100644
index 00000000..75b0bc55
--- /dev/null
+++ b/t/x-unknown-alpine.eml
@@ -0,0 +1,21 @@
+Date: Sat, 13 Aug 2016 12:14:15 +0200 (CEST)
+From: Alpine User
+To:
+Subject: charset=X-UNKNOWN test
+Message-ID:
+User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
+MIME-Version: 1.0
+Content-Type: multipart/mixed; BOUNDARY="8323329-703494712-1471083256=:4924"
+
+ This message is in MIME format. The first part should be readable text,
+ while the remaining parts are likely unreadable without MIME-aware tools.
+
+--8323329-703494712-1471083256=:4924
+Content-Type: text/plain; charset=X-UNKNOWN
+Content-Transfer-Encoding: QUOTED-PRINTABLE
+
+=09https://example.com/
+
+ =E2=80=A2 bullet point
+
+--8323329-703494712-1471083256=:4924--