author: Eric Wong <e@80x24.org> 2021-02-12 00:05:50 -0700
committer: Eric Wong <e@80x24.org> 2021-02-12 22:58:29 -0400
commit: bab0a3dfd9e3bd58abf60531096b780673f8e6c7
tree: 9ec7be2115457d300ec7e85a85cb080be15ca56d
parent: bbeccd3d252c926e649b946a8a46dd14e6e92182

PublicInbox::MboxReader->(mboxrd|mboxo) only deletes the last
trailing newline, not every single trailing newline like
InboxWritable->import_mbox does.

While testing PublicInbox::MboxReader->mboxrd (next commit) with
scripts/import_vger_from_mbox on the LKML archive I got in 2018 for
v2 development, this difference caused a single spam message(*)
out of 2722831 to be filtered incorrectly and to return a
different result.

(*) dated 2014-08-25

Diffstat (limited to 'lib/PublicInbox/Filter')
-rw-r--r-- lib/PublicInbox/Filter/Vger.pm | 2
1 file changed, 1 insertion, 1 deletion

diff --git a/lib/PublicInbox/Filter/Vger.pm b/lib/PublicInbox/Filter/Vger.pm
index 0b1f5dd3..5b3c0277 100644
--- a/lib/PublicInbox/Filter/Vger.pm
+++ b/lib/PublicInbox/Filter/Vger.pm
@@ -24,7 +24,7 @@ sub scrub {
         # the vger appender seems to only work on the raw string,
         # so in multipart (e.g. GPG-signed) messages, the list trailer
         # becomes invisible to MIME-aware email clients.
-        if ($s =~ s/$l0\n$l1\n$l2\n$l3\n($l4\n)?\z//os) {
+        if ($s =~ s/$l0\n$l1\n$l2\n$l3\n(?:$l4\n)?\n*\z//os) {
                 $mime = PublicInbox::Eml->new(\$s);
         }
         $self->ACCEPT($mime);
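
The effect of the pattern change can be sketched in a standalone script. This is only an illustration, not code from the repository: the `$l0`..`$l4` patterns below are hypothetical stand-ins (the real definitions are in Vger.pm, outside this hunk), and `scrub_old` and `scrub_new` are made-up helper names wrapping the old and new substitutions. The sample message ends with one extra newline after the trailer, as could be left behind by a reader that strips only the last trailing newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the trailer patterns; the real $l0..$l4
# are defined in Vger.pm outside the hunk shown above.
my $l0 = qr/-+/;
my $l1 = qr/trailer line one/;
my $l2 = qr/trailer line two/;
my $l3 = qr/trailer line three/;
my $l4 = qr/optional trailer line four/;

# Old pattern: the trailer must end exactly at end-of-string (\z).
sub scrub_old {
    my ($s) = @_;
    my $hit = ($s =~ s/$l0\n$l1\n$l2\n$l3\n($l4\n)?\z//s);
    return ($hit ? 1 : 0, $s);
}

# New pattern: any number of extra newlines may follow the trailer,
# and the optional group no longer captures.
sub scrub_new {
    my ($s) = @_;
    my $hit = ($s =~ s/$l0\n$l1\n$l2\n$l3\n(?:$l4\n)?\n*\z//s);
    return ($hit ? 1 : 0, $s);
}

my $trailer = "--\ntrailer line one\ntrailer line two\ntrailer line three\n";
# One extra blank line remains after the trailer.
my $msg = "Subject: test\n\nbody\n" . $trailer . "\n";

my ($old_hit, $old_out) = scrub_old($msg);
my ($new_hit, $new_out) = scrub_new($msg);

print "old pattern matched: $old_hit\n";   # prints 0
print "new pattern matched: $new_hit\n";   # prints 1
```

With the old pattern the extra newline sits between the trailer and `\z`, so the substitution does not fire and the trailer stays in the message. With the new pattern `\n*` absorbs it and the trailer is removed along with the surplus newlines.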