about summary refs log tree commit homepage
path: root/Documentation/public-inbox-config.pod
DateCommit message (Collapse)
2020-08-27doc: move watch config docs to -watch manpage
The -config manpage is a bit long and the -watch stuff is isolated from the rest of it while we start documenting NNTP and IMAP support. I'm not entirely happy with the way IMAP and NNTP are configured, it's still good enough for small setups. This also fixes a long-standing misplaced comment about `publicinboxwatch.spamcheck' affecting all configured inboxes, that comment was actually for `publicinboxwatch.watchspam'. We'll omit documenting NNTP for `watchspam', for now, given the lack of \Seen flags in NNTP and I'm not sure if it's even useful. There may not be any newsgroups for sharing confirmed spam, either...
2020-08-20init: support --newsgroup option
We can reduce the need to edit the config file for NNTP group names this way.
2020-08-07index: v2: --sequential-shard option
This gives better page cache utilization for Xapian indexing on slow storage by improving locality for random I/O activity on the Xapian DB. Instead of doing a single-pass to index both SQLite and Xapian; this indexes them separately. The first pass is identical to indexlevel=basic: it indexes both over.sqlite3 and msgmap.sqlite3. Subsequent passes only operate on a single Xapian shard for documents belonging to that shard. Given enough shards, each individual shard can be made small enough to fit into the kernel page cache and avoid HDD seeks for read activity. Doing rough tests with a busy system with a 7200 RPM HDD with ext4, full indexing of LKML (9 epochs) goes from ~80 hours (-j0) to ~30 hours (-j8) with 16GB RAM with 7 shards configured and fsync(2) disabled (--no-sync) and `--batch-size=10m'.
2020-04-25doc: note some changes for 1.5
As an established project (:P), it's important to document when new features appear in manpages. Users may be reading new documentation online which doesn't reflect an older version they have installed.
2020-04-21index: support --max-size / publicinbox.indexMaxSize
In normal mail paths, we can rely on MTAs being configured with reasonable limits in the -watch and -mda mail injection paths. However, the MTA is bypassed in a git-only delivery path, a BOFH could inject a large message and DoS users attempting to mirror a public-inbox. This doesn't protect unindexed WWW interfaces from Email::MIME memory explosions on v1 inboxes. Probably nobody cares about unindexed WWW interfaces anymore, especially now that Xapian is optional for indexing.
2020-04-20watchmaildir: support multiple watchheader values
The watchheader key supports only a single value. Supporting multiple watchheader values was mentioned in discussion [1] of 8d3e3bd8 (doc: explain publicinbox.<name>.watchheader, 2019-10-09), and it wasn't clear if there was a need. One scenario in which matching multiple headers would be convenient is when someone wants to set up public-inbox archives for some small projects but does _not_ want to run mailing lists for them, instead allowing others to follow the project by any of the pull mechanisms. Using a common underlying address, an address alias for each project is configured via a third-party email provider, with messages for each alias being exposed as a separate public-inbox archive. In this setup, messages for an inbox cannot be selected by a List-ID header but can be identified by the inbox's address in either the To or Cc header. To support such a use case, update the watchheader handling to consider multiple values, accepting a message if it matches any value. While selecting a message based on matching _any_ rather than _all_ values is motivated by the above scenario, it's worth noting that the "any" behavior is consistent with how multiple listid config values are handled. [1] https://public-inbox.org/meta/20191010085118.r3amey4cayazfycb@dcvr/
2020-04-12doc: escape internal ">" in listid code snippet
A code snippet in the listid description is incorrectly rendered as "publicinbox.$NAME.watchheader=List-Id:<foo.example.com"> Escape the closing bracket around the List-Id value to avoid this. Also escape the opening bracket for symmetry/readability.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-02-04doc: spellling fixes for manpages
The wording for publicinbox.nntpserver was awkward, too, and I took this as opportunity to hopefully clarify it and favor "hostname" for Internet addresses, because we already use "address" to mean "email address" in the config.
2020-01-25s/news.gmane.org/news.gmane.io/
gmane still has a NNTP server, so update links to point to it. cf. https://lars.ingebrigtsen.no/2020/01/06/whatever-happened-to-news-gmane-org/
2020-01-02doc: fix a few spelling errors in user-facing docs
Found by codespell, there's a few more in comments and some debatable ones, but user-facing stuff is more important.
2019-11-09doc: drop a repeated word
2019-11-08doc: actually document publicinboxwatch.watchspam
Instead of copy-pasting the documentation for `spamcheck'.
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-15mda, watch: wire up List-ID header support
This also adds watchheader tests for -watch, which we never had before :x
2019-10-10doc: explain publicinbox.<name>.watchheader
It wasn't clear to me exactly what this does -- in particular, what happens if it isn't specified? Does it support multiple values? A very brief explanation can answer both of these questions without making somebody look at the code.
2019-09-15doc: update config manpage for "publicinbox.grokmanifest"
It's a bit of an esoteric option, but maybe somebody out there can find it useful.
2019-09-09run update-copyrights from gnulib for 2019
2019-09-09doc config: document indexlevel directive
It was never documented, before.
2019-06-09edit: new tool to perform edits
This wrapper around V2Writable->replace provides a user-interface for editing messages as single-message mboxes (or the raw text via $EDITOR).
2019-05-08doc: use bullet list for wwwlisting options
Otherwise, pod2man complains about "=item 404" not starting with a letter and thinking it's part of a numbered list.
2019-05-05Merge remote-tracking branch 'origin/wwwlisting'
* origin/wwwlisting: www: support listing of inboxes start depending on Perl 5.10.1+
2019-04-25cgit: improve handling of cgit data path
Document `publicinbox.cgitdata' config directive, but allow it to be unspecified and/or missing for installations which do not wish to serve static data at all. For users installing cgit from source to their home directory, we can usually infer the cgit data path based on the cgit.cgi binary path, even.
2019-04-19www: support listing of inboxes
We will still return a 404 by default to '/' for compatibility with users of Plack::App::Cascade or similar. Inboxes are sorted by modification times to help users detect activity (similar to the /$INBOX/ topic view). New configuration options: * publicinbox.wwwlisting - configure the listing type * publicinbox.<name>.hide - hide a particular inbox from the listing See changes to public-inbox-config.pod for full descriptions of the new options. Requested-by: Leah Neukirchen <leah@vuxu.org> https://public-inbox.org/meta/871sdfzy80.fsf@gmail.com/
2019-04-18doc: config: fix braino/typo :x
2019-04-15doc/config: update cgit.cgi scan location
We account for the upstream default location as well as the Debian-installed one.
2019-04-04cgit: use a dedicated named limiter
I mainly need this to enforce RLIMIT_CPU (and RLIMIT_CORE) when requests come which generate giant, unrealistic diffs. Per-coderepo limiters may be added in the future. But for now, I need to prevent cgit from monopolizing resources on my dinky server.
2019-04-04qspawn: wire up RLIMIT_* handling to limiters
This allows users to configure RLIMIT_{CORE,CPU,DATA} using our "limiter" config directive when spawning external processes.
2019-04-04www: wire up cgit as a 404 handler if cgitrc is configured
Requests intended for cgit are unlikely to conflict with requests to inboxes. So we can safely hand those requests off to cgit.cgi.
2019-04-04support publicinbox.cgitrc directive
We can save admins the trouble of declaring [coderepo "..."] sections in the public-inbox config by parsing the cgitrc directly. Macro expansion (e.g. $HTTP_HOST) expansion is not supported, yet; but may be in the future.
2019-01-31doc/config: document "replyto" configuration knob
I hate it, but it's necessary to support some mirrors.
2019-01-31doc/config: user documentation for limiters
I've relied on this feature to keep the VPS behind https://public-inbox.org/git/ from OOM-ing since 2016, so document it to ensure others can make use of low-end servers like I do. More limiters may become configurable for viewvcs and solver functionality (or we continue using the default one).
2019-01-30doc/config: document coderepo and css bits
New features ought to be documented
2018-07-29mda: allow configuring globally without spamc support
This reuses some of the configuration from -watch, but remains independent since some configurations will use -watch for some inboxes and -mda for others. The default remains "spamc" for -mda users so nothing changes without explicit configuration. Per-inbox configurations may also be supported in the future.
2018-03-29public-inbox-convert: tool for converting old to new inboxes
This should make it easier to let users perform comparisons and migrate to v2 if needed.
2018-02-07update copyrights for 2018
Using update-copyrights from gnulib While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2017-01-02watch: watchspam affects all configured inboxes
If a message is spam in one mailbox, it is spam in all others a particular user/group will care about.
2016-12-17feed: support publicinbox.<name>.feedmax
This allows users to customize by using smaller or larger Atom feeds than the default value of 25 entries.
2016-09-07doc: new docs for user-level commands
Hopefully more folks can download and run public-inbox, nowadays.