From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 857911F619; Tue, 25 Feb 2020 09:28:06 +0000 (UTC)
Date: Tue, 25 Feb 2020 09:28:06 +0000
From: Eric Wong
To: Leah Neukirchen
Cc: meta@public-inbox.org
Subject: weird From: lines [was: Two small issues when importing old archives]
Message-ID: <20200225092806.GB382@dcvr>
References: <87h7zfemur.fsf@vuxu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <87h7zfemur.fsf@vuxu.org>
List-Id:

Leah Neukirchen wrote:
> 2) Weird From: lines crash the whole import
>
> From: "=?iso-8859-1?Q?Jochen_K=FCpper?=
> This funny line broke import_maildir:
>
> fatal: Missing > in ident string: =?iso-8859-1?Q?Jochen_K=FCpper?= usenet <"=?iso-8859-1?Q?Jochen_K=FCpper?= 1101853296 +0100
> fast-import: dumping crash report to /var/lib/public-inbox/repositories/ding.git/fast_import_crash_31402
> EOF from fast-import: at /usr/share/perl5/vendor_perl/PublicInbox/Import.pm line 96, <$r> line 54681.
>
> I fixed it manually. (But I think it's actually a valid mail address,
> even in this botched state.) I'm not sure what added the ">", it's
> not in the original mail.
>
> (I use public-inbox-1.3.0/git-2.25.0 on Void Linux.)

Gah, this looks like it's because Email::Address::XS leaves
a "<" in the name... Perhaps Import should delete all [<>]
characters unconditionally? (or swap in appropriate Unicode
homographs and assume users have the necessary glyphs...)

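The "delete all [<>] characters unconditionally" idea above could be
sketched roughly as follows. This is only an illustration under
assumptions: sanitize_ident_name() and ident() are hypothetical helpers,
not existing functions in PublicInbox::Import, and the sample name and
address are made up rather than taken from the failing message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PublicInbox::Import): drop every
# "<" and ">" from a display name so that the ident handed to
# git fast-import contains exactly one <...> pair.
sub sanitize_ident_name {
	my ($name) = @_;
	$name =~ tr/<>//d;         # delete angle brackets unconditionally
	$name =~ s/\A\s+|\s+\z//g; # trim leading/trailing whitespace
	return $name;
}

# Hypothetical helper: build the "Name <email>" portion of an
# author/committer line from already-extracted name and address.
sub ident {
	my ($name, $email) = @_;
	$email =~ tr/<>//d; # the address must not carry brackets either
	return sanitize_ident_name($name) . " <$email>";
}

# Made-up input with a stray "<" left in the name:
print ident('Jochen <usenet', 'user@example.com'), "\n";
```

Stripping is lossy, of course; the homograph idea would preserve more of
the original name at the cost of depending on the reader's fonts.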
---------8<----------
Subject: [RFC] t/address.t: dump failing case

"PublicInbox::Address" (w/o "PP") is Email::Address::XS 1.04
from Debian 10:

PublicInbox::Address names: $VAR1 = [
'=?iso-8859-1?Q?Jochen_K=FCpper?= ('User , e@example.org')],
 'address extraction works as expected');
+ my $odd = '"=?iso-8859-1?Q?Jochen_K=FCpper?= ($odd)]);
+ diag "$pkg emails: " . Dumper([$emails->($odd)]);
+ is_deeply(['user@example.com'], [$emails->('')],
 'comment after domain accepted before >');