about summary refs log tree commit homepage
path: root/lib/PublicInbox/Admin.pm
DateCommit message (Collapse)
2020-05-17index: v2: parallelize if --reindex or --jobs is specified
`--reindex' involves chomping down lots of mail, so it benefits from parallelization just like the initial indexing. It's also a bit surprising to specify `--jobs/-j' without parallel processes, so ensure we turn on parallelization there, too. We can simplify initialization here, as well, since neither `eval' nor `V2Writable->new' should be in this code.
2020-05-09replace most uses of PublicInbox::MIME with Eml
PublicInbox::Eml has enough functionality to replace the Email::MIME-based PublicInbox::MIME.
2020-04-21index: support --max-size / publicinbox.indexMaxSize
In normal mail paths, we can rely on MTAs being configured with reasonable limits in the -watch and -mda mail injection paths. However, the MTA is bypassed in a git-only delivery path, a BOFH could inject a large message and DoS users attempting to mirror a public-inbox. This doesn't protect unindexed WWW interfaces from Email::MIME memory explosions on v1 inboxes. Probably nobody cares about unindexed WWW interfaces anymore, especially now that Xapian is optional for indexing.
2020-04-20drop needless `eval {}' around Config->new
It hasn't been needed since commit 089cca37fa036411 ("config: ignore missing config files"). And we actually want to propagate errors when we can't start new processes or if git(1) is missing.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-01-27inbox: add ->version method
This allows us to simplify version checking by avoiding "//" or "||" operators sprinkled around.
2020-01-06admin: do not lazy-load Inbox or Config packages
No point in lazy-loading these, since they're always loaded anyways and would not have portability problems on systems with minimal dependencies.
2019-12-30spawn: support chdir via -C option
This simplifies our admin module a bit and allows solver to be used with v1 inboxes using git versions prior to v1.8.5 (but still >= git v1.8.0).
2019-12-30spawn: allow passing GLOB handles for redirects
We can save callers the trouble of {-hold} and {-dev_null} refs as well as the trouble of calling fileno().
2019-12-24search: support SWIG-generated Xapian.pm
Xapian upstream is slowly phasing out the XS-based Search::Xapian in favor of the SWIG-generated "Xapian" package. While Debian and both FreeBSD have Search::Xapian, OpenBSD only includes the "Xapian" binding. More information about the status of the "Xapian" Perl module here: https://trac.xapian.org/ticket/523
2019-12-12Date::Parse is now optional
-mda should not be dealing with broken Date: headers nowadays, and deprioritize it in our documentation and internal checks.
2019-11-16admin: get rid of singleton $CFG var
PublicInbox::Admin::config() just adds an extra layer of indirection which we barely rely on. So get rid of this global variable and make it easier to run tests in the future without relying on global state.
2019-11-14inboxwritable: drop {-importer} cyclic reference
InboxWritable caching the result of ->importer leads to a circular references with returned (V2Writable|Import) object holds onto the calling InboxWritable object. With public-inbox-watch, this leads to a memory leak if a user is reloading via SIGHUP after a message is imported (it would only become noticeable with SIGHUPs after every message imported). I would not expect anybody to to notice this in real-world usage. I only noticed this since I was making -xcpdb suitable for long-lived process use (e.g. "mod_perl style") and a flock remained unreleased on v1 inboxes after resharding. WatchMaildir (used by -watch) already handles caching of the importer object itself, and all of our other real-world uses of ->importer are short-lived or designed for batch scripts, so there's no need to cache the importer result internally.
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-16admin: show failing directory
Since public-inbox-index may be run against a large list of (intended) inboxes from the command-line, it's helpful to show which directory fails the resolution.
2019-09-14admin: warn and ignore inaccessible inboxes
For whatever reason, inbox directories can go missing temporarily or permanently. Tell the admin about them and continue on our way.
2019-06-16xcpdb: don't warn on --jobs != --reshard
It's slightly confusing since we dedicate one job to dealing with fast-import + SQLite indexing; and it's not worth complaining about when it happens.
2019-06-14v2writable: rename {partitions} field to {shards}
Our internal data structure should be consistent with Xapian terminology.
2019-06-14admin|xapcmd: user-facing messages say "shard"
We're slowly getting rid of the word "partition" when it comes to remain consistent with Xapian docs.
2019-06-09admin: expose ->config
No point in forcing admin programs to reparse the config themselves; and we won't support multiple instances of it; unlike the WWW code.
2019-06-09admin: beef up resolve_inboxes to handle purge options
We'll be using this in -edit, and maybe other admin-oriented tools for UI-consistency.
2019-06-09admin: remove warning arg for unconfigured inboxes
We no longer make -index warn on it, no other code uses it; and working on unconfigured inboxes is totally reasonable for admins who are setting things up.
2019-05-29searchidx: store indexlevel=medium as metadata
And use it from Admin. It's easy to tell what indexlevel=basic is from unconfigured inboxes, but distinguishing between 'medium' and 'full' would require stat()-ing position.* files which is fragile and Xapian-implementation-dependent. So use the metadata facility of Xapian and store it in the main partition so Admin tools can deal better with unconfigured inboxes copied using generic tools like cp(1) or rsync(1).
2019-05-29index: support --verbose option
It doesn't implement progress of batches, yet, but it wires up the parsing of the command-line while preserving output compatibility. This output is NOT meant to be stable.
2019-05-23xcpdb: use fine-grained locking
Copying an entire Xapian DB takes a long time, so update our reindexing code to support partial reindexing, snapshot the pre-copydatabase git revisions, perform the lengthy copy, and do a partial reindex when the copy + renames are done.
2019-05-23admin: move index_inbox over
We will be reindexing after copydatabase
2019-05-23admin: hoist out resolve_inboxes for -compact and -index
Both of these index-affecting commands should work similarly on the command-line. public-inbox-index no longer complains about unconfigured ~/.public-inbox/config; but often I found myself being annoyed by that, anyways...
2019-05-15admin: improve warnings and errors for missing modules
Since we lazy-load Xapian now, some errors may become more cryptic or buried. Try to improve that by making Admin show better errors.
2019-01-11hoist out resolve_repo_dir from -index
We'll be using it in future admin tools, and making this easier-to-test.