user/dev discussion of public-inbox itself
 help / color / Atom feed
From: Eric Wong <e@yhbt.net>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: meta@public-inbox.org
Subject: [PATCH] confine Email::MIME use even further
Date: Sat, 16 May 2020 22:53:53 +0000
Message-ID: <20200516225353.GA22331@dcvr> (raw)
In-Reply-To: <877dxb8wwa.fsf@x220.int.ebiederm.org>

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> Eric Wong <e@yhbt.net> writes:
> 
> > "Eric W. Biederman" <ebiederm@xmission.com> wrote:
> >> Eric Wong <e@yhbt.net> writes:
> >> > "Eric W. Biederman" <ebiederm@xmission.com> wrote:
> >> >> > The email messages are placed without modification into the public
> >> >> > inbox repository so minimize changes of corruption or of loosing
> >> >> > valuable information.  I use the command imap_fetch for all of my
> >> >> > email and not just a mailling list mirror so I don't want automation
> >> >> > to accidentally cause something important to be lost.
> >> >
> >> > Btw, Email::MIME usage is gone from 1.5.0 due to nasty
> >> > performance problems and replaced by PublicInbox::Eml.  Eml
> >> > should be completely non-destructive unless somebody sends an
> >> > abusive message which exceeds the new safety limits; in which
> >> > case it won't OOM or burn CPU like E::M did.
> >> >
> >> > That said, {-public_inbox_raw} still works and Eml looks
> >> > like a drop-in replacement as far as imap_fetch is concerned.
> >> 
> >> I almost did that. But I looked and saw PublicInbox::MIME still present
> >> and a number of other references to Email::MIME so I wasn't certain
> >> exactly how that was being handled.  But since Email::MIME still
> >> worked I didn't mess with that.
> >
> > I think the Import .pod documentation is the only place aside
> > from some random comments and maintainer tests in xt/*, right?
> 
> I am looking at 1.5.0 so you may have made a bit more progress
> but Import.pm still uses Email::MIME,
> and PublicInbox::MIME still uses Email::MIME as a base class.

Yeah, PublicInbox::MIME only existed to workaround old bugs in
Email::MIME, so we used it everywhere for years and and will
keep it in old tests.

Below is a patch to remove most references to Email::MIME;
but I guess PublicInbox::Eml will need POD docs at some point...

> >> > Btw, any reason you create the SSLSocket yourself instead of
> >> > passing (Ssl => \@SSL_Socket_options) to IMAPClient->new?
> >> 
> >> When I read the documentation it looked like that was the way to do
> >> things.  Even now when I reread the documentation that looks like the
> >> way to go.  Especially if I wanted to be certain the connection was
> >> encrypted.
> >
> > There seems more than one way to do it, but `Starttls' and `Ssl'
> > are just as documented from what I tell (in v3.38).
> > Socket/RawSocket seem useful for using an external command to
> > connect/launch an IMAP tunnel or server; so it'll be used to
> > mimic the `imap.tunnel' support of git-imap-send.
> 
> Now that you point it out I can see it.  Commands like starttls
> are a bit dangerous as they are subject to man in the middle attacks.
> 
> But I think that is the difference of just tossing something together
> for yourself versus making something that works with everyone's setup.
> 
> The one challenge I ran into was getting ssl verification to work on
> RHEL7.  Apparently IO::Socket::SSL::default_ca() does not exist in
> the old version of perl that comes with RHEL7.  Which is why I have
> the %ca and the eval.

Ouch.  Yes, I remember that being a problem for testing NNTPS,
too.  Net::NNTP doesn't support old IO::Socket::SSL, either.

Don't feel obligated to figure this out; but how did
IO::Socket::SSL work before it got default_ca()?

Did it force the user to configure that on their own,
set it behind-the-scenes as a default, or did it (*gasp*)
skip verification?

-----------8<-----------
Subject: [PATCH] confine Email::MIME use even further

To avoid confusing future readers and users, recommend
PublicInbox::Eml in our Import POD and refer to PublicInbox::Eml
comments at the top of PublicInbox::MIME.

mime_load() confined to t/eml.t, since we won't be using
it anywhere else in our tests.
---
 lib/PublicInbox/Import.pm     | 22 +++++++++++++---------
 lib/PublicInbox/MIME.pm       |  4 +++-
 lib/PublicInbox/TestCommon.pm | 10 +---------
 t/eml.t                       |  6 ++++++
 4 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index fc61d062..792570c8 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -648,7 +648,10 @@ version 1.0
 
 =head1 SYNOPSIS
 
-	use Email::MIME;
+	use PublicInbox::Eml;
+	# PublicInbox::Eml exists as of public-inbox 1.5.0,
+	# Email::MIME was used in older versions
+
 	use PublicInbox::Git;
 	use PublicInbox::Import;
 
@@ -664,7 +667,7 @@ version 1.0
 		"Date: Thu, 01 Jan 1970 00:00:00 +0000\n" .
 		"Message-ID: <m\@example.org>\n".
 		"\ntest message";
-	my $parsed = Email::MIME->new($message);
+	my $parsed = PublicInbox::Eml->new($message);
 	my $ret = $im->add($parsed);
 	if (!defined $ret) {
 		warn "duplicate: ",
@@ -675,7 +678,7 @@ version 1.0
 	$im->done;
 
 	# to remove a message
-	my $junk = Email::MIME->new($message);
+	my $junk = PublicInbox::Eml->new($message);
 	my ($mark, $orig) = $im->remove($junk);
 	if ($mark eq 'MISSING') {
 		print "not found\n";
@@ -690,8 +693,8 @@ version 1.0
 
 =head1 DESCRIPTION
 
-An importer and remover for public-inboxes which takes L<Email::MIME>
-messages as input and stores them in a git repository as
+An importer and remover for public-inboxes which takes C<PublicInbox::Eml>
+or L<Email::MIME> messages as input and stores them in a git repository as
 documented in L<https://public-inbox.org/public-inbox-v1-format.txt>,
 except it does not allow duplicate Message-IDs.
 
@@ -709,7 +712,7 @@ Initialize a new PublicInbox::Import object.
 
 =head2 add
 
-	my $parsed = Email::MIME->new($message);
+	my $parsed = PublicInbox::Eml->new($message);
 	$im->add($parsed);
 
 Adds a message to to the git repository.  This will acquire
@@ -720,12 +723,13 @@ is called, but L</remove> may be called on them.
 
 =head2 remove
 
-	my $junk = Email::MIME->new($message);
+	my $junk = PublicInbox::Eml->new($message);
 	my ($code, $orig) = $im->remove($junk);
 
 Removes a message from the repository.  On success, it returns
 a ':'-prefixed numeric code representing the git-fast-import
-mark and the original messages as an Email::MIME object.
+mark and the original messages as a PublicInbox::Eml
+(or Email::MIME) object.
 If the message could not be found, the code is "MISSING"
 and the original message is undef.  If there is a mismatch where
 the "Message-ID" is matched but the subject and body do not match,
@@ -749,7 +753,7 @@ The mail archives are hosted at L<https://public-inbox.org/meta/>
 
 =head1 COPYRIGHT
 
-Copyright (C) 2016 all contributors L<mailto:meta@public-inbox.org>
+Copyright (C) 2016-2020 all contributors L<mailto:meta@public-inbox.org>
 
 License: AGPL-3.0+ L<http://www.gnu.org/licenses/agpl-3.0.txt>
 
diff --git a/lib/PublicInbox/MIME.pm b/lib/PublicInbox/MIME.pm
index 9077386a..831a3d19 100644
--- a/lib/PublicInbox/MIME.pm
+++ b/lib/PublicInbox/MIME.pm
@@ -4,7 +4,9 @@
 # The license for this file differs from the rest of public-inbox.
 #
 # We no longer load this in any of our code outside of maintainer
-# tests for compatibility.
+# tests for compatibility.  PublicInbox::Eml is favored throughout
+# our codebase for performance and safety reasons, though we maintain
+# Email::MIME-compatibility in mail injection and indexing code paths.
 #
 # It monkey patches the "parts_multipart" subroutine with patches
 # from Matthew Horsfall <wolfsage@gmail.com> at:
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index d952ee6d..79e597f5 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -9,15 +9,7 @@ use Fcntl qw(FD_CLOEXEC F_SETFD F_GETFD :seek);
 use POSIX qw(dup2);
 use IO::Socket::INET;
 our @EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
-	run_script start_script key2sub xsys xqx mime_load eml_load);
-
-sub mime_load ($) {
-	my ($path) = @_;
-	open(my $fh, '<', $path) or die "open $path: $!";
-	# test should've called: require_mods('Email::MIME')
-	require PublicInbox::MIME;
-	PublicInbox::MIME->new(\(do { local $/; <$fh> }));
-}
+	run_script start_script key2sub xsys xqx eml_load);
 
 sub eml_load ($) {
 	my ($path, $cb) = @_;
diff --git a/t/eml.t b/t/eml.t
index b7f58ac7..1892b001 100644
--- a/t/eml.t
+++ b/t/eml.t
@@ -12,6 +12,12 @@ SKIP: {
 };
 use_ok $_ for @classes;
 
+sub mime_load ($) {
+	my ($path) = @_;
+	open(my $fh, '<', $path) or die "open $path: $!";
+	PublicInbox::MIME->new(\(do { local $/; <$fh> }));
+}
+
 {
 	my $eml = PublicInbox::Eml->new(\(my $str = "a: b\n\nhi\n"));
 	is($str, "hi\n", '->new modified body like Email::Simple');

      reply index

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-29 14:40 I have figured out IMAP IDLE Eric W. Biederman
2019-10-29 22:31 ` Eric Wong
2019-10-29 23:12   ` WWW::Curl [was: I have figured out IMAP IDLE] Eric Wong
2019-11-03 16:28   ` I have figured out IMAP IDLE Eric W. Biederman
2020-05-13 19:31 ` Eric Wong
2020-05-13 21:48   ` Eric W. Biederman
2020-05-13 22:17     ` Eric Wong
2020-05-14 12:32       ` Eric W. Biederman
2020-05-14 16:15         ` Eric Wong
2020-05-15 21:00         ` [PATCH 1/2] IMAPTracker: Add a helper to track our place in reading imap mailboxes Eric W. Biederman
2020-05-15 21:02           ` [PATCH 2/2] imap_fetch: Add a command to continuously fetch from an imap mailbox Eric W. Biederman
2020-05-15 21:26             ` Eric W. Biederman
2020-05-15 22:56               ` Eric Wong
2020-05-16 10:47                 ` Eric W. Biederman
2020-05-16 19:12                   ` Eric Wong
2020-05-16 20:09                     ` Eric W. Biederman
2020-05-16 22:53                       ` Eric Wong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200516225353.GA22331@dcvr \
    --to=e@yhbt.net \
    --cc=ebiederm@xmission.com \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git