Date | Commit message (Collapse) |
|
read_all can be expanded to support FIFOs/pipes/sockets where
read-until-EOF behavior is desired. We can also rely on
wantarray to support splitting on EOL markers, but it's
hard-coded to support only `$/ eq "\n"' since (AFAIK)
it's the only way we use the wantarray form `readline'.
|
|
List-Unsubscribe headers with unique identifiers (such as those
generated by our examples/unsubscribe.milter) should not
end up in public archives. Add a new config knob to strip
List-Unsubscribe headers if they have the
`List-Unsubscribe-Post: List-Unsubscribe=One-Click'
header.
Unfortunately, this breaks DKIM signatures if the signature
covers either of these List-Unsubscribe* headers. However,
breaking DKIM is the lesser evil compared to any archive reader
being able to stop archival by an independent archivist.
As much as I would like this to be the default, it probably
affects few users at the moment since very few mailing lists
use unique identifiers in List-Unsubscribe (but that number
has grown, recently).
|
|
When learning and injecting new messages ham, we want to avoid
wasting cycles importing the same message into an inbox twice
(once for the To/Cc match and once for the List-Id match). Our
existing %seen hash turned out to be ineffective since
PublicInbox::Inbox refs get re-blessed to PublicInbox::InboxWritable.
So we stop letting class name influence the hash key for tracking by
using the reference address instead. We can get the reference address
by performing an arithmetic operation (+ 0) instead of having to
pay the cost of importing Scalar::Util::refaddr.
|
|
v2 never suffered from this bug, apparently, but -learn didn't
seem able to handle indexlevel=basic (nor respect `medium')
for v1 inboxes. I only noticed this bug because I converted
some ancient v1 inboxes to `basic' to save space.
|
|
Aside from our prior import bugs (fixed in a0c07cba0e5d8b6a
(mda: drop leading "From " lines again, 2016-06-26)), we'll
always have to be dealing with mutt piping messages to us and
`git format-patch' output. So just share the regexp so we
can use it everywhere.
In may be desirable to allow importing messages with a leading
"From " line for FUSE, even.
Additionally, some instances of this regexp needlessly added
optional `\r?' (CR) checks ahead of the `\n' (LF) element; but
they're pointless anyways since [^\n]* is enough to exclude all
non-LF bytes.
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
Reading from regular files (even on STDIN) can fail
when dealing with flakey storage.
|
|
{pi_config} may be confused with the documented `PI_CONFIG'
environment variable, and we'll favor vowel-removal to be
consistent with our usage of object references.
The `pi_' prefix may stay in some places, for now; since a
separate namespace may come into this codebase for local/private
client-tooling.
For InboxIdle, we'll also remove an invalid comment about
holding a reference to the PublicInbox::Config object, too.
|
|
"use Getopt::Long" doesn't seem too slow on a hot page cache,
and it's probably used frequently enough to be in cache.
We'll also start reducing the amount of markup in the .pod and
favoring verbatim text in documentation for readability in
source form, since the bold text seems excessive.
|
|
It's useful to mark they're meant to be executable, even
if the shebang is useless.
|
|
I found myself wanting to remove a message from all inboxes
while working on a test case in another branch. I figure this
could also be useful for globally removing messages which are in
the grey area or too big for spamc.
|
|
There is obviously a typo here, so fix it and add a test
case to guard against future regressions.
Fixes: 74a3206babe0572a ("mda: support multiple List-ID matches")
|
|
PublicInbox::Eml has enough functionality to replace the
Email::MIME-based PublicInbox::MIME.
|
|
I did not know to use the return value of `do' back in the day.
There's probably no practical difference in these cases, but
`eval' is overkill for these uses and may hide actual errors.
We can get rid of a few redundant `scalar' ops and pass scalar
refs to Email::MIME->new to avoid copies in a few more places,
too.
|
|
I didn't wait until September to do it, this year!
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script eval'ed into a sub.
|
|
While it's not RFC2919-conformant, mail software can
theoretically set multiple List-ID headers. Deliver to all
inboxes which match a given List-ID since that's likely the
intended.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Link: https://public-inbox.org/meta/87pniltscf.fsf@x220.int.ebiederm.org/
|
|
It's now possible to inject false-positive ham into an inbox
the same way -mda does via List-ID.
|
|
We'll be reusing it for List-ID processing in the next commit.
|
|
Users may be zeroes or blanks.
|
|
Use <foo|bar> since that seems to be the favored notation
for required command args (taking a hint from git(1) manpage).
While we're at it, remove the space after '<' for the redirect
to match git.git coding style.
|
|
It's assumed that "spam" can end up anywhere due to Bcc:, so we
need to scan every single inbox. However, "rm" is usually more
targeted and and "ham" obviously only belongs in some inboxes.
|
|
It's possible to specify these headers multiple times, and
PublicInbox::MDA->precheck takes that into account, so
-learn should, too.
|
|
|
|
Oops, I mainly rely on public-inbox-watch for spam training
and completely forgot this tool existed :x
|
|
It works around some bugs in older Email::MIME which we'll
find useful.
|
|
Using update-copyrights from gnulib
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
We need to use the correct subject when doing global scanning,
too. In fact, the per-recipient spam training path is entirely
redundant at this point.
|
|
Sometimes an email is an innocent removal "rm" for a
misdirected, off-topic post, while most removed messages are
"spam". Allow anybody to look at history and easily distinguish
the reason for removing the message.
|
|
This matches the behavior of the -watch daemon since
6d534038285ddd760709ba76ea007f9108200097
("watch: watchspam affects all configured inboxes")
|
|
Do not consider this interface stable, but I just needed a
way to remove mis-imported multipart messages so
public-inbox-watch could pick them up again from my Maildir.
|
|
This should fix problems with multipart messages where
text/plain parts lack a header.
cf. git clone --mirror https://github.com/rjbs/Email-MIME.git
refs/pull/28/head
In the future, we may still introduce as streaming
interface to reduce memory usage on large emails.
|
|
Oops :x
|
|
Oops...
While we're at it, drop blank lines before the "From ", too,
since it could happen.
|
|
This should hopefully make it easier to try other anti-spam
systems (or none at all) in the future.
|
|
This prevents multiple update processes from stepping over
each other while called under the lock, and also allows the
new -watch process to update the index iff indexing was
desired.
|
|
This removes the Email::Filter dependency as well as the
signature-breaking scrubber code. We now prefer to
reject unacceptable messages and grudgingly (and blindly)
mirror messages we're not the primary endpoint for.
|
|
We'll be relying on our spawn implementation, for now;
since it'll be consistent with the rest of our code and
can optionally take advantage of vfork.
|
|
User input is imperfect, do not pollute our mail logs with
warnings we cannot fix. This is documented in the
Email::MIME::ContentType manpage so it should remain supported.
|
|
git has stricter requirements for ident names (no '<>')
which Email::Address allows.
Even in 1.908, Email::Address also has an incomplete fix for
CVE-2015-7686 with a DoS-able regexp for comments. Since we
don't care for or need all the RFC compliance of Email::Address,
avoiding it entirely may be preferable.
Email::Address will still be installed as a requirement for
Email::MIME, but it is only used by the
Email::MIME::header_str_set which we do not use
|
|
A public-inbox is NOT necessarily a mailing list, but it
could serve as an input point for zero, one, or infinite
mailing lists :D
|
|
By converting to using ourt git-fast-import-based Import
module. This should allow us to be more easily installed.
|
|
It can confuse Email::MIME if we have it.
|
|
This seems to match more closely with what is expected of Perl
packages based on how blib is used. Hopefully makes the top-level
source tree less cluttered and things easier-to-find.
|