From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 7EE2B1F55B; Sat, 16 May 2020 22:53:53 +0000 (UTC) Date: Sat, 16 May 2020 22:53:53 +0000 From: Eric Wong To: "Eric W. Biederman" Cc: meta@public-inbox.org Subject: [PATCH] confine Email::MIME use even further Message-ID: <20200516225353.GA22331@dcvr> References: <87ftc3mrq6.fsf@x220.int.ebiederm.org> <20200513221715.GA11718@dcvr> <877dxelmsr.fsf@x220.int.ebiederm.org> <87ftc0c3r4.fsf_-_@x220.int.ebiederm.org> <87a728c3p3.fsf_-_@x220.int.ebiederm.org> <87o8qoao09.fsf@x220.int.ebiederm.org> <20200515225644.GB4131@dcvr> <87mu6888cw.fsf@x220.int.ebiederm.org> <20200516191220.GA24349@dcvr> <877dxb8wwa.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <877dxb8wwa.fsf@x220.int.ebiederm.org> List-Id: "Eric W. Biederman" wrote: > Eric Wong writes: > > > "Eric W. Biederman" wrote: > >> Eric Wong writes: > >> > "Eric W. Biederman" wrote: > >> >> > The email messages are placed without modification into the public > >> >> > inbox repository so minimize changes of corruption or of loosing > >> >> > valuable information. I use the command imap_fetch for all of my > >> >> > email and not just a mailling list mirror so I don't want automation > >> >> > to accidentally cause something important to be lost. > >> > > >> > Btw, Email::MIME usage is gone from 1.5.0 due to nasty > >> > performance problems and replaced by PublicInbox::Eml. Eml > >> > should be completely non-destructive unless somebody sends an > >> > abusive message which exceeds the new safety limits; in which > >> > case it won't OOM or burn CPU like E::M did. > >> > > >> > That said, {-public_inbox_raw} still works and Eml looks > >> > like a drop-in replacement as far as imap_fetch is concerned. > >> > >> I almost did that. But I looked and saw PublicInbox::MIME still present > >> and a number of other references to Email::MIME so I wasn't certain > >> exactly how that was being handled. But since Email::MIME still > >> worked I didn't mess with that. > > > > I think the Import .pod documentation is the only place aside > > from some random comments and maintainer tests in xt/*, right? > > I am looking at 1.5.0 so you may have made a bit more progress > but Import.pm still uses Email::MIME, > and PublicInbox::MIME still uses Email::MIME as a base class. Yeah, PublicInbox::MIME only existed to workaround old bugs in Email::MIME, so we used it everywhere for years and and will keep it in old tests. Below is a patch to remove most references to Email::MIME; but I guess PublicInbox::Eml will need POD docs at some point... > >> > Btw, any reason you create the SSLSocket yourself instead of > >> > passing (Ssl => \@SSL_Socket_options) to IMAPClient->new? > >> > >> When I read the documentation it looked like that was the way to do > >> things. Even now when I reread the documentation that looks like the > >> way to go. Especially if I wanted to be certain the connection was > >> encrypted. > > > > There seems more than one way to do it, but `Starttls' and `Ssl' > > are just as documented from what I tell (in v3.38). > > Socket/RawSocket seem useful for using an external command to > > connect/launch an IMAP tunnel or server; so it'll be used to > > mimic the `imap.tunnel' support of git-imap-send. > > Now that you point it out I can see it. Commands like starttls > are a bit dangerous as they are subject to man in the middle attacks. > > But I think that is the difference of just tossing something together > for yourself versus making something that works with everyone's setup. > > The one challenge I ran into was getting ssl verification to work on > RHEL7. Apparently IO::Socket::SSL::default_ca() does not exist in > the old version of perl that comes with RHEL7. Which is why I have > the %ca and the eval. Ouch. Yes, I remember that being a problem for testing NNTPS, too. Net::NNTP doesn't support old IO::Socket::SSL, either. Don't feel obligated to figure this out; but how did IO::Socket::SSL work before it got default_ca()? Did it force the user to configure that on their own, set it behind-the-scenes as a default, or did it (*gasp*) skip verification? -----------8<----------- Subject: [PATCH] confine Email::MIME use even further To avoid confusing future readers and users, recommend PublicInbox::Eml in our Import POD and refer to PublicInbox::Eml comments at the top of PublicInbox::MIME. mime_load() confined to t/eml.t, since we won't be using it anywhere else in our tests. --- lib/PublicInbox/Import.pm | 22 +++++++++++++--------- lib/PublicInbox/MIME.pm | 4 +++- lib/PublicInbox/TestCommon.pm | 10 +--------- t/eml.t | 6 ++++++ 4 files changed, 23 insertions(+), 19 deletions(-) diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm index fc61d062..792570c8 100644 --- a/lib/PublicInbox/Import.pm +++ b/lib/PublicInbox/Import.pm @@ -648,7 +648,10 @@ version 1.0 =head1 SYNOPSIS - use Email::MIME; + use PublicInbox::Eml; + # PublicInbox::Eml exists as of public-inbox 1.5.0, + # Email::MIME was used in older versions + use PublicInbox::Git; use PublicInbox::Import; @@ -664,7 +667,7 @@ version 1.0 "Date: Thu, 01 Jan 1970 00:00:00 +0000\n" . "Message-ID: \n". "\ntest message"; - my $parsed = Email::MIME->new($message); + my $parsed = PublicInbox::Eml->new($message); my $ret = $im->add($parsed); if (!defined $ret) { warn "duplicate: ", @@ -675,7 +678,7 @@ version 1.0 $im->done; # to remove a message - my $junk = Email::MIME->new($message); + my $junk = PublicInbox::Eml->new($message); my ($mark, $orig) = $im->remove($junk); if ($mark eq 'MISSING') { print "not found\n"; @@ -690,8 +693,8 @@ version 1.0 =head1 DESCRIPTION -An importer and remover for public-inboxes which takes L -messages as input and stores them in a git repository as +An importer and remover for public-inboxes which takes C +or L messages as input and stores them in a git repository as documented in L, except it does not allow duplicate Message-IDs. @@ -709,7 +712,7 @@ Initialize a new PublicInbox::Import object. =head2 add - my $parsed = Email::MIME->new($message); + my $parsed = PublicInbox::Eml->new($message); $im->add($parsed); Adds a message to to the git repository. This will acquire @@ -720,12 +723,13 @@ is called, but L may be called on them. =head2 remove - my $junk = Email::MIME->new($message); + my $junk = PublicInbox::Eml->new($message); my ($code, $orig) = $im->remove($junk); Removes a message from the repository. On success, it returns a ':'-prefixed numeric code representing the git-fast-import -mark and the original messages as an Email::MIME object. +mark and the original messages as a PublicInbox::Eml +(or Email::MIME) object. If the message could not be found, the code is "MISSING" and the original message is undef. If there is a mismatch where the "Message-ID" is matched but the subject and body do not match, @@ -749,7 +753,7 @@ The mail archives are hosted at L =head1 COPYRIGHT -Copyright (C) 2016 all contributors L +Copyright (C) 2016-2020 all contributors L License: AGPL-3.0+ L diff --git a/lib/PublicInbox/MIME.pm b/lib/PublicInbox/MIME.pm index 9077386a..831a3d19 100644 --- a/lib/PublicInbox/MIME.pm +++ b/lib/PublicInbox/MIME.pm @@ -4,7 +4,9 @@ # The license for this file differs from the rest of public-inbox. # # We no longer load this in any of our code outside of maintainer -# tests for compatibility. +# tests for compatibility. PublicInbox::Eml is favored throughout +# our codebase for performance and safety reasons, though we maintain +# Email::MIME-compatibility in mail injection and indexing code paths. # # It monkey patches the "parts_multipart" subroutine with patches # from Matthew Horsfall at: diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm index d952ee6d..79e597f5 100644 --- a/lib/PublicInbox/TestCommon.pm +++ b/lib/PublicInbox/TestCommon.pm @@ -9,15 +9,7 @@ use Fcntl qw(FD_CLOEXEC F_SETFD F_GETFD :seek); use POSIX qw(dup2); use IO::Socket::INET; our @EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods - run_script start_script key2sub xsys xqx mime_load eml_load); - -sub mime_load ($) { - my ($path) = @_; - open(my $fh, '<', $path) or die "open $path: $!"; - # test should've called: require_mods('Email::MIME') - require PublicInbox::MIME; - PublicInbox::MIME->new(\(do { local $/; <$fh> })); -} + run_script start_script key2sub xsys xqx eml_load); sub eml_load ($) { my ($path, $cb) = @_; diff --git a/t/eml.t b/t/eml.t index b7f58ac7..1892b001 100644 --- a/t/eml.t +++ b/t/eml.t @@ -12,6 +12,12 @@ SKIP: { }; use_ok $_ for @classes; +sub mime_load ($) { + my ($path) = @_; + open(my $fh, '<', $path) or die "open $path: $!"; + PublicInbox::MIME->new(\(do { local $/; <$fh> })); +} + { my $eml = PublicInbox::Eml->new(\(my $str = "a: b\n\nhi\n")); is($str, "hi\n", '->new modified body like Email::Simple');