Date | Commit message |
|
The old name may be confused with "Content-ID" as described in
RFC 2392, so use an alternate name to avoid confusing future
readers.
|
|
PublicInbox::Eml has enough functionality to replace the
Email::MIME-based PublicInbox::MIME.
|
|
This allows us to simplify some of our existing code and make
future changes easier.
I doubt anybody goes through the trouble to have a Perl
installation without zlib support. The zlib source code is even
bundled with Perl since 5.9.3 for systems without existing zlib
development headers and libraries.
Of course, zlib is a requirement of git, too, and we're not
going to stop using git :)
[squashed: "wwwaltid: use gzipfilter up front"]
|
|
In normal mail paths, we can rely on MTAs being configured with
reasonable limits in the -watch and -mda mail injection paths.
However, the MTA is bypassed in a git-only delivery path, so a
BOFH could inject a large message and DoS users attempting to
mirror a public-inbox.
This doesn't protect unindexed WWW interfaces from Email::MIME
memory explosions on v1 inboxes. Probably nobody cares about
unindexed WWW interfaces anymore, especially now that Xapian is
optional for indexing.
|
|
It hasn't been needed since commit 089cca37fa036411
("config: ignore missing config files"). And we
actually want to propagate errors when we can't
start new processes or if git(1) is missing.
|
|
I did not know to use the return value of `do' back in the day.
There's probably no practical difference in these cases, but
`eval' is overkill for these uses and may hide actual errors.
We can get rid of a few redundant `scalar' ops and pass scalar
refs to Email::MIME->new to avoid copies in a few more places,
too.
|
|
There's nothing Maildir-specific about the function, so
`maildir_path_load' was a bad name. So give it a more
appropriate name and use it in our tests.
This saves us some code and inconsistency by reusing an
existing internal library routine in more places. We can drop
the "From_" line in some of our (formerly) mbox sample files.
|
|
It's more convenient to specify `-c' / `--compact' on the
command-line when reindexing than it is to invoke
public-inbox-compact(1) separately.
This is especially convenient in low-space situations when
public-inbox-index is operating on multiple inboxes
sequentially, as compaction can happen immediately after
indexing each inbox, instead of waiting until all inboxes are
indexed.
|
|
Since v2 inboxes contain multiple git repositories, avoid the
use of the word "repository" when referring to inboxes as a
whole in most places.
|
|
We don't want to blow up users' storage too badly when converting
v1 to v2 or break because they don't have Xapian bindings installed.
|
|
I didn't wait until September to do it, this year!
|
|
The (currently undocumented) "--no-index" flag did not trigger
the V2Writable->done call necessary to make the import
successful.
Fixes: eea47b676127bcdb ("convert: preserve highwater mark from v1 msgmap")
|
|
Relying on implicit "@_" for shift fails with
TestCommon::_run_sub iff GetOptions modifies @ARGV.
|
|
Looking at git history, they were never used.
|
|
If we're reusing the msgmap from a v1 inbox, we also need to
ensure the highwater mark doesn't get doubled in the v1->v2
conversion by internally triggering the equivalent of
"--reindex" on a fresh v2 inbox.
This was needed to convert an indexed v1 inbox which featured
messages with multiple Message-IDs in it. Fresh, unindexed
clones of v1 inboxes would not have been affected by this.
|
|
We already load PublicInbox::Import via
PublicInbox::InboxWritable, so it's not an extra module
to load. This can give us a slight speedup in tests.
|
|
This allows us to simplify version checking by avoiding
"//" or "||" operators sprinkled around.
|
|
Some users just want to run -mda, -watch, and/or -nntpd.
Let them run just those without forcing them to pull in a
bunch of dependencies.
|
|
There's a bunch of leftover "require" and "use" statements we no
longer need and can get rid of, along with some excessive
imports via "use".
IO::Handle usage isn't always obvious, so add comments
describing why a package loads it. Along the same lines,
document the tmpdir support as the reason we depend on
File::Temp 0.19, even though every Perl 5.10.1+ user has it.
While we're at it, favor "use" over "require", since it gives
us extra compile-time checking.
|
|
And update callers to use it, as it makes the code a bit cleaner.
Probably irrelevant, but it should be faster, too, as
"perl -I lib -w -MO=Deparse $FILE" shows REJECT() calls are
constant-folded.
|
|
We can use "use" to get the namespace into the "BEGIN" phase of
the interpreter. While we're at it, use \&coderef syntax
explicitly instead of globbing everything.
|
|
This is distributed with Perl 5.10.1 and onwards, so it should
not be an installation burden for any users. I'm planning to
move away from tempdir() entirely and use File::Temp->newdir to
remove dependencies on END{} blocks.
|
|
We've been using this in -edit, and will be using it in some
more scripts and tests to optimize for run_mode=2 with
run_script.
Keeping this in the *Writable modules since I don't see it being
useful for the WWW and NNTP read-only interfaces which use
PublicInbox::Inbox.
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script are eval'ed into a sub.
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script are eval'ed into a sub.
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script are eval'ed into a sub.
We also need to rely on ->DESTROY instead of END{}
to unlink the lock file on sub exit.
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script are eval'ed into a sub.
|
|
PublicInbox::Admin::config() just adds an extra layer of
indirection which we barely rely on. So get rid of this
global variable and make it easier to run tests in the
future without relying on global state.
|
|
Instead of relying on END{} blocks, rely on ->DESTROY
so the temporary files go out-of-scope and system
resources get released sooner.
|
|
Avoid 'Variable "%s" will not stay shared' warnings
when the contents of this script are eval'ed into a sub.
|
|
We only need to parse the command-line once.
|
|
InboxWritable caching the result of ->importer leads to a
circular reference when the returned (V2Writable|Import) object
holds onto the calling InboxWritable object.
With public-inbox-watch, this leads to a memory leak if a user
is reloading via SIGHUP after a message is imported (it would
only become noticeable with SIGHUPs after every message imported).
I would not expect anybody to notice this in real-world
usage. I only noticed this since I was making -xcpdb suitable
for long-lived process use (e.g. "mod_perl style") and a flock
remained unreleased on v1 inboxes after resharding.
WatchMaildir (used by -watch) already handles caching of the
importer object itself, and all of our other real-world uses of
->importer are short-lived or designed for batch scripts, so
there's no need to cache the importer result internally.
|
|
We need to check every print to a regular file for errors,
because storage devices inevitably fail.
|
|
exit($?) is never correct, since ($? >> 8) is needed to extract
the correct exit code, as other information (e.g. the signal)
is encoded in $? in addition to the exit code.
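A minimal sketch of the decoding described above, written in Python
rather than Perl. It assumes the traditional 16-bit wait status layout
that $? mirrors: exit code in the high byte, terminating signal in the
low 7 bits. The function name and the "128 + signal" mapping (a common
shell convention) are assumptions for illustration and are not taken
from this commit.

```python
def exit_code_from_wait_status(status: int) -> int:
    """Turn a traditional 16-bit wait status into a value safe to exit with.

    Hypothetical helper for illustration only.  Stopped-process statuses
    (which only appear when waiting with WUNTRACED) are not handled.
    """
    signal_number = status & 0x7F          # low 7 bits: terminating signal
    if signal_number:
        # Assumption: report death-by-signal the way shells do.
        return 128 + signal_number
    return (status >> 8) & 0xFF            # high byte: the real exit code


if __name__ == "__main__":
    # A child that called exit(3) yields status 3 << 8 == 768; passing
    # 768 straight to exit() would be truncated to 0 by the OS.
    print(exit_code_from_wait_status(3 << 8))
```

Passing the raw status to exit() loses information because only the low
8 bits of an exit value survive, and for a normal exit those bits are 0.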
|
|
While it's not RFC2919-conformant, mail software can
theoretically set multiple List-ID headers. Deliver to all
inboxes which match a given List-ID since that's likely the
intended behavior.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Link: https://public-inbox.org/meta/87pniltscf.fsf@x220.int.ebiederm.org/
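The delivery rule described above can be sketched as follows. This is
an illustration in Python using the standard "email" module, not the
-mda implementation; the function name, the list-id-to-inbox mapping,
and the case-insensitive comparison are assumptions made for the
example.

```python
import email
import re


def matching_inboxes(raw_message: str, inbox_by_list_id: dict) -> list:
    """Return every configured inbox matched by any List-Id header.

    Hypothetical helper: inbox_by_list_id maps a lowercase list
    identifier (the part inside <...>) to an inbox name.
    """
    msg = email.message_from_string(raw_message)
    matches = []
    # get_all() returns every occurrence of the header, in order.
    for value in msg.get_all("List-Id", []):
        found = re.search(r"<([^>]+)>", value)
        list_id = (found.group(1) if found else value).strip().lower()
        inbox = inbox_by_list_id.get(list_id)
        if inbox is not None and inbox not in matches:
            matches.append(inbox)
    return matches
```

A message carrying two List-Id headers is returned once for each
configured inbox it matches, and unknown list identifiers are ignored.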
|
|
Multiple List-ID headers will be supported in the next commit.
|
|
And use it for mda, since "0" could be a usable directory
if somebody insists on using relative paths...
|
|
We don't want to waste cycles parsing the message for MIME bits
if it's spam.
|
|
It makes it easier to document that the default -mda behavior is
stricter than normal, including "public-inbox-learn ham".
|
|
It's now possible to inject false-positive ham into an inbox
the same way -mda does via List-ID.
|
|
We'll be reusing it for List-ID processing in the next commit.
|
|
Users may be zeroes or blanks.
|
|
Use <foo|bar> since that seems to be the favored notation
for required command args (taking a hint from git(1) manpage).
While we're at it, remove the space after '<' for the redirect
to match git.git coding style.
|
|
It's assumed that "spam" can end up anywhere due to Bcc:, so we
need to scan every single inbox. However, "rm" is usually more
targeted and "ham" obviously only belongs in some inboxes.
|
|
It's possible to specify these headers multiple times, and
PublicInbox::MDA->precheck takes that into account, so
-learn should, too.
|
|
* origin/inboxdir:
config: remove redundant inboxdir check
config: support "inboxdir" in addition to "mainrepo"
examples/grok-pull.post_update_hook: use "inbox_dir"
|
|
"mainrepo" was a bad name and an artifact from the early days when I
intended for there to be a "spamrepo" (now just the
ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be
especially confusing, since v2 needs at least two git
repositories (epoch + all.git) to function and we shouldn't
confuse users by having them point to a git repository for v2.
Much of our documentation already references "INBOX_DIR" for
command-line arguments, so use "inboxdir" as the
git-config(1)-friendly variant for that.
"mainrepo" remains supported indefinitely for compatibility.
Users may need to revert to old versions, or may be referring
to old documentation and must not be forced to change config
files to account for this change.
So if you're using "mainrepo" today, I do NOT recommend changing
it right away because other bugs can lurk.
Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
|
|
Since -mda now supports List-ID to better support mirroring of
existing mailing lists, it probably makes sense to support
disabling the precheck function to provide more accurate (though
potentially spammier) mirrors of lists.
|
|
This also adds watchheader tests for -watch, which we never
had before :x
|
|
First, we use flock(2) to wait on parallel public-inbox-init(1)
invocations while we make multiple changes using git-config(1).
This flock allows -init processes to wait on each other if using
reasonable POSIX filesystems.
Then, we also need a git-config(1)-compatible lock to prevent
user-invoked git-config(1) processes from clobbering our
changes while we're holding the flock.
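The two-level locking described above can be sketched roughly as
follows, in Python rather than Perl. The helper name and the ".flock"
suffix are assumptions for illustration. The ".lock" suffix follows
git's lockfile convention of creating "<file>.lock" exclusively. How
the config changes are then written while both locks are held is left
out.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def two_level_lock(config_path):
    """Hold an flock for cooperating peers plus a git-style lock file.

    Hypothetical helper; the file names are assumptions for this sketch.
    """
    flock_path = config_path + ".flock"
    git_lock_path = config_path + ".lock"
    with open(flock_path, "a") as flock_fh:
        # Level 1: blocks until other cooperating processes are done.
        fcntl.flock(flock_fh, fcntl.LOCK_EX)
        # Level 2: exclusive creation fails if the lock file already exists.
        fd = os.open(git_lock_path,
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)
        try:
            yield
        finally:
            os.close(fd)
            os.unlink(git_lock_path)
        # Leaving the "with" block closes flock_fh and releases the flock.
```

The flock lets cooperating processes queue up politely, while the
exclusively-created lock file is what an unrelated process following
the lockfile convention would notice.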
|