From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 14CB71F461; Sun, 19 May 2019 22:04:18 +0000 (UTC) Date: Sun, 19 May 2019 22:04:17 +0000 From: Eric Wong To: "Eric W. Biederman" Cc: meta@public-inbox.org Subject: Re: [PATCH] PublicInbox::Import Extend add with a optional raw message parameter Message-ID: <20190519220417.ejyqwyy6xwkq6lts@dcvr> References: <87h89tzvp5.fsf@xmission.com> <20190518080300.q2klxia2uymnoxyi@dcvr> <87blzzu7d9.fsf@xmission.com> <20190518213951.ficerfsawms4z7dh@dcvr> <87woimpb09.fsf@xmission.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <87woimpb09.fsf@xmission.com> List-Id: "Eric W. Biederman" wrote: > Eric Wong writes: > > > "Eric W. Biederman" wrote: > >> Eric Wong writes: > >> > >> > "Eric W. Biederman" wrote: > > > >> I will take a second look at the code. I did not make it all of the > >> way through last time. I just know the bazillions of warnings I could > >> not easily track down with old emails did not make me comfortable > >> with that code base as anything other than a suggestion. > > > > Ah, I added ce18b29d175ef5f01f05d59c95bcf8e0cd40e611 > > ("index: warn with info about the message as context") exactly > > for that. > > Knowing what the code is doing in those modules would be interesting. Yes, I considered adding Carp::longmess to it, but didn't need it since (AFAIK) all the warnings were easily found by reading the Email::* sources. thanks for the info on IMAP modules > Oh that and IO::Socket::SSL. Something that is currently missing (or > at least missing until recently I haven't check in the last couple of > months) is ssl support for our nntp sockets. So do imap or prevent > mischief for our nntp streams we need to use tls. Which IO::Socket::SSL > seems to do. Yup, IO::Socket::SSL will be what we use for STARTTLS support, not sure when I'll get around to it. Fwiw, it's also used by Perlbal for non-blocking HTTPS with Danga::Socket. I plan on keeping PublicInbox::DS close to Danga::Socket, for now[1])... Using the TLS support in the kernel might be a fun option to add, too; but portable stuff, first. > But again I grabbed the first implementation that seems to work. Going > through those modules in detail to make certain nothing goofy is going > on might be wise. Yes; at least choosing IO::Socket::SSL seems straightforward. Looking at Perl IMAP libraries in the past has made my head spin in the past; and lack of IDLE support (or the cost of having an entire process wait on it) has turned me off of going further, so far. In any case, I think anything we do for IMAP should use git's credential system (same as git-imap-send) to minimize setup. > >> I think perhaps we could move all of the scrubbing into > >> PublicInbox::Filter::Base.pm::scrub. Instead of having a hard coded > >> drop_unwanted_headers in PublicInbox::Import. That would make it very > >> straight forward to just make this a knob that the user controls > >> for how they want their email received/imported. > > > > Not calling drop_unwanted_headers can have dangerous side-effects > > (training loops, bugs in other consumers, private data exposure), > > so I'm very hesitant to move it to Filter::Base > > Usually it is my experience that dropping headers is more likely > to cause loops than keeping them. But I definitely understand > the private data exposure angle. However since this is my primary > archive I don't want to loose the information, in case I need it to > debug something. > > > @PublicInbox::MDA::BAD_HEADERS is exposed via `our', so it could > > remain stable-but-undocumented API if people really feel the > > need to tweak it. At most, we'd add a comment for that variable > > asking potential hackers not to move/rename it. > > > >> If that general idea sounds palatable I will investigate to see > >> if I can move the caching of the raw email into PublicInbox::MIME, > >> see about moving the dropping of headers into an appropriate config > >> knob. > > > > I think using $mime->{-public_inbox_raw} will be sufficient for > > now. Thanks. > > Then I will move in that direction. It seems a straight forward > and simple change to make. Alright, so treating @PublicInbox::MDA::BAD_HEADERS as a stable interface ought to be enough? Keeping qw(bytes lines content-length) could throw off consumers. The HTTP mbox.gz code filters them out, too; but I just noticed the NNTP code doesn't, maybe it should... [1] I've also been thinking about making PublicInbox::DS switch to EPOLLONESHOT/EV_ONESHOT to map better to my mind when distinguishing between SSL_ERROR_WANT_READ/WRITE.. And it would give me an excuse to pick up and dogfood an epoll API to reduce syscall overhead: https://lore.kernel.org/lkml/1393206162-18151-1-git-send-email-n1ght.4nd.d4y@gmail.com/