path: root/Documentation/public-inbox-config.pod
Date Commit message
2024-03-16 Fix some typos and language nits in docs and comments
2024-02-14 doc: fix formatting for CLI switch aliases
`=item' elements in Pod need to be surrounded by empty lines. It's an unfortunate waste of vertical space, but Pod is still better than *roff and usually available out-of-the-box.
2024-02-14 doc: config: cgit=rewrite isn't implemented, yet
It'll probably be done for another release; I doubt most cgit users are willing to completely replace it with our coderepo viewer just yet...
2023-12-01 doc: config: fix grammar for nameIsUrl
Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > +Treat the name of the public inbox as it's unqualified URL when
>
> s/it's/its/

Thanks, will push this fix out:
-------8<------
Subject: [PATCH] doc: config: fix grammar for nameIsUrl

Reported-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87bkbazp5g.fsf@kyleam.com/
2023-11-30 www_listing: support publicInbox.nameIsUrl
This is a convenient (and slightly memory-saving) alternative to specifying a `publicinbox.*.url' entry for every single inbox when using publicinbox.wwwListing.
2023-11-14 config: avoid eidx_key and newsgroup conflicts
Start lowercasing newsgroup names automatically since uppercase names are incompatible with IMAP and POP3 and also cause problems with both -extindex and -cindex. We'll also warn on eidx_key and newsgroup conflicts to avoid sometimes subtle breakage when using -extindex and -cindex.
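A minimal sketch of the idea above, written in Python rather than the project's Perl. The inbox dictionaries, the `name` field, and the exact definition of a conflict (two different inboxes claiming the same identifier) are assumptions for illustration only; the real checks live in the public-inbox config code and may differ.

```python
def normalize_newsgroups(inboxes):
    """Lowercase newsgroup names in place and report identifier clashes.

    `inboxes` is a list of dicts with a hypothetical 'name' key plus
    optional 'eidx_key' and 'newsgroup' keys.
    """
    problems = []
    claimed = {}  # identifier -> name of the inbox that claimed it first
    for ibx in inboxes:
        if ibx.get("newsgroup"):
            ibx["newsgroup"] = ibx["newsgroup"].lower()
        for field in ("eidx_key", "newsgroup"):
            key = ibx.get(field)
            if not key:
                continue
            owner = claimed.setdefault(key, ibx["name"])
            if owner != ibx["name"]:
                problems.append(
                    f"{field} {key!r} of {ibx['name']} conflicts with {owner}"
                )
    return problems


inboxes = [
    {"name": "a", "eidx_key": "/srv/a", "newsgroup": "Inbox.Comp.Test"},
    {"name": "b", "eidx_key": "inbox.comp.test"},
]
problems = normalize_newsgroups(inboxes)
```

Lowercasing before the comparison matters here: without it, the mixed-case newsgroup of the first inbox would not be detected as clashing with the second inbox's key.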
2023-11-11 mda|learn|watch: support dropUniqueUnsubscribe config
List-Unsubscribe headers with unique identifiers (such as those generated by our examples/unsubscribe.milter) should not end up in public archives. Add a new config knob to strip List-Unsubscribe headers if they have the `List-Unsubscribe-Post: List-Unsubscribe=One-Click' header. Unfortunately, this breaks DKIM signatures if the signature covers either of these List-Unsubscribe* headers. However, breaking DKIM is the lesser evil compared to any archive reader being able to stop archival by an independent archivist. As much as I would like this to be the default, it probably affects few users at the moment since very few mailing lists use unique identifiers in List-Unsubscribe (but that number has grown, recently).
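As a rough illustration of the header stripping described above, here is a Python sketch using the standard `email` package. It is not the project's Perl implementation; the function name is hypothetical, and removing `List-Unsubscribe-Post` together with `List-Unsubscribe` is an assumption.

```python
from email import message_from_string
from email.message import Message


def drop_unique_unsubscribe(msg: Message) -> bool:
    """Remove List-Unsubscribe* headers if one-click unsubscribe is advertised.

    Returns True when headers were removed.
    """
    post_values = msg.get_all("List-Unsubscribe-Post", [])
    one_click = any(
        v.replace(" ", "").lower() == "list-unsubscribe=one-click"
        for v in post_values
    )
    if not one_click:
        return False
    # Deleting a header removes every occurrence; missing headers are ignored.
    del msg["List-Unsubscribe"]
    del msg["List-Unsubscribe-Post"]  # assumption: dropped along with the URL
    return True


raw = (
    "From: list@example.com\n"
    "List-Unsubscribe: <https://example.com/u/abc123>\n"
    "List-Unsubscribe-Post: List-Unsubscribe=One-Click\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
changed = drop_unique_unsubscribe(msg)
```

Messages without the `List-Unsubscribe-Post` header are left untouched, so DKIM signatures are only affected when a one-click link is actually present.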
2023-09-11 treewide: favor Xapian (SWIG binding) over Search::Xapian
The Xapian SWIG bindings are favored by Xapian upstream for ease-of-maintenance compared to the XS version. While Debian lags on this front, the SWIG bindings are widely available on all *BSDs.
2023-08-28 Fix some typos/grammar/errors in docs and comments
2022-10-09 www_coderepo: wire up snapshots from summary
This also ensures we won't waste CPU cycles on snapshots which aren't configured if somebody attempts them by guessing URLs.
2022-10-07 www: support publicinbox.cgit knob
For backwards-compatibility, this defaults to `first'. When set to `fallback', PublicInbox::WwwCoderepo is favored and cgit is only used as a fallback. Eventually, `rewrite' will also be supported to rewrite cgit URLs to WwwCoderepo ones. Of course, WwwCoderepo is still missing search and other key features, but that's being worked on...
2022-07-30 doc|www: flesh out POP3 documentation for servers and users
Hopefully it makes sense to new users deploying or using POP3...
2022-07-20 public-inbox-pop3d - a mostly read-only POP3 server
Old account expiry has not been implemented, but it seems to work well with both mpop(1) and getmail(1). The strictness of mpop was particularly helpful in ironing out bugs in our implementation of (dreaded) message sequence numbers. "EXPIRE 0" (RFC 2449) can theoretically save numerous "DELE" commands, but that's untested by real-world clients. mpop supports PIPELINING which is effective in hiding latency, and the core networking functionality is already well-tested from our NNTP and IMAP implementations. Configuration requires "publicinbox.pop3state" to point to a directory writable by the otherwise read-only daemon. See public-inbox-pop3d(1) manpage for more usage details.
2021-11-03 doc: extindex: document current behavior + knobs
I'm not really sure if extindex writing to the config file is a good idea (since -index doesn't, as -init exists). Just document what it does and let the user handle it, since the config file shouldn't be daunting to new users.
2021-09-16 www: support publicinbox.imapserver
This allows PublicInbox::WWW hosts to advertise the existence of IMAP servers in addition to NNTP servers.
2021-07-22 extsearch: support publicinbox.*.boost parameter
This behaves identically to the lei external "boost" parameter in prioritizing raw messages for extindex. Relying exclusively on the config file order doesn't work well for mirrors since it's impossible to guarantee config file ordering via grokmirror hooks. Config file ordering remains the default if boost is unconfigured, or in case of ties. Note: I chose the name "boost" rather than "priority" or "rank" since I always get confused by whether higher or lower numbers take precedence when it comes to kernel scheduling. "weight" is also a part of Xapian API terminology, which we currently do not expose to configuration (but may in the future).
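The ordering rule above can be sketched in a few lines of Python. This is an illustration only: the dictionary layout is hypothetical, and treating a larger boost as higher priority and an unset boost as 0 are assumptions that the commit message does not spell out.

```python
def prioritize(inboxes):
    """Order inboxes by boost, keeping config-file order for ties.

    `inboxes` is a list of dicts in config-file order, each with a 'name'
    and an optional integer 'boost'.
    """
    # sorted() is stable, so inboxes with equal (or missing) boost values
    # stay in the order they appeared in the config file.
    return sorted(inboxes, key=lambda ibx: -ibx.get("boost", 0))


config_order = [
    {"name": "a"},
    {"name": "b", "boost": 10},
    {"name": "c"},
    {"name": "d", "boost": 10},
]
ordered_names = [ibx["name"] for ibx in prioritize(config_order)]
```

Relying on a stable sort keeps the tie-breaking rule implicit: nothing beyond the boost value needs to be compared.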
2021-05-04 treewide: update to v3 Tor onions
v2 onions are insecure, deprecated and going away. v3 names are unfortunately longer and more difficult to remember, but should be more resistant to attack than v2 ones.
2021-04-18 doc config: mention obfuscation-related options
Obfuscation has been available since v1.0.0. Help those that want to use the feature figure out how.
2021-03-29 doc config: don't render a to-do comment
In the public-inbox-config manpage, the match=domain item under publicinbox.wwwlisting has a to-do comment that gets rendered as "support showing cgit listing". That's potentially confusing to readers, especially given that the "TODO" is dropped. Change the markup so that the comment isn't rendered.
2021-02-01 doc: note optional BSD::Resource use
We've actually been capable of using this since 2019(*) in our spawn code for PSGI limiters. And it's been used since 2016 in our tests. It's a dependency of SpamAssassin, and Danga::Socket used it, too. (*) commit 721368cd04bfbd03c0d9173fff633ae34f16409a ("spawn: support RLIMIT_CPU, RLIMIT_DATA and RLIMIT_CORE")
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-08-27 doc: move watch config docs to -watch manpage
The -config manpage is a bit long and the -watch stuff is isolated from the rest of it while we start documenting NNTP and IMAP support. I'm not entirely happy with the way IMAP and NNTP are configured, but it's still good enough for small setups. This also fixes a long-standing misplaced comment about `publicinboxwatch.spamcheck' affecting all configured inboxes; that comment was actually for `publicinboxwatch.watchspam'. We'll omit documenting NNTP for `watchspam', for now, given the lack of \Seen flags in NNTP and I'm not sure if it's even useful. There may not be any newsgroups for sharing confirmed spam, either...
2020-08-20 init: support --newsgroup option
We can reduce the need to edit the config file for NNTP group names this way.
2020-08-07 index: v2: --sequential-shard option
This gives better page cache utilization for Xapian indexing on slow storage by improving locality for random I/O activity on the Xapian DB. Instead of doing a single pass to index both SQLite and Xapian, this indexes them separately. The first pass is identical to indexlevel=basic: it indexes both over.sqlite3 and msgmap.sqlite3. Subsequent passes only operate on a single Xapian shard for documents belonging to that shard. Given enough shards, each individual shard can be made small enough to fit into the kernel page cache and avoid HDD seeks for read activity. Doing rough tests with a busy system with a 7200 RPM HDD with ext4, full indexing of LKML (9 epochs) goes from ~80 hours (-j0) to ~30 hours (-j8) with 16GB RAM with 7 shards configured and fsync(2) disabled (--no-sync) and `--batch-size=10m'.
2020-04-25 doc: note some changes for 1.5
As an established project (:P), it's important to document when new features appear in manpages. Users may be reading new documentation online which doesn't reflect an older version they have installed.
2020-04-21 index: support --max-size / publicinbox.indexMaxSize
In normal mail paths, we can rely on MTAs being configured with reasonable limits in the -watch and -mda mail injection paths. However, the MTA is bypassed in a git-only delivery path, so a BOFH could inject a large message and DoS users attempting to mirror a public-inbox. This doesn't protect unindexed WWW interfaces from Email::MIME memory explosions on v1 inboxes. Probably nobody cares about unindexed WWW interfaces anymore, especially now that Xapian is optional for indexing.
2020-04-20 watchmaildir: support multiple watchheader values
The watchheader key supports only a single value. Supporting multiple watchheader values was mentioned in discussion [1] of 8d3e3bd8 (doc: explain publicinbox.<name>.watchheader, 2019-10-09), and it wasn't clear if there was a need. One scenario in which matching multiple headers would be convenient is when someone wants to set up public-inbox archives for some small projects but does _not_ want to run mailing lists for them, instead allowing others to follow the project by any of the pull mechanisms. Using a common underlying address, an address alias for each project is configured via a third-party email provider, with messages for each alias being exposed as a separate public-inbox archive. In this setup, messages for an inbox cannot be selected by a List-ID header but can be identified by the inbox's address in either the To or Cc header. To support such a use case, update the watchheader handling to consider multiple values, accepting a message if it matches any value. While selecting a message based on matching _any_ rather than _all_ values is motivated by the above scenario, it's worth noting that the "any" behavior is consistent with how multiple listid config values are handled. [1] https://public-inbox.org/meta/20191010085118.r3amey4cayazfycb@dcvr/
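The "any value matches" rule can be sketched as follows in Python. This is not the -watch implementation: the function name, the `Header:value` parsing via `partition`, and the case-insensitive substring comparison are all assumptions chosen to keep the example short.

```python
from email import message_from_string


def matches_watchheader(msg, watchheaders):
    """Return True if the message matches ANY configured watchheader value.

    Each entry in `watchheaders` is a 'Header:value' string, mirroring the
    config syntax mentioned in the docs (e.g. 'List-Id:<foo.example.com>').
    """
    for spec in watchheaders:
        name, _, want = spec.partition(":")
        want = want.strip().lower()
        for got in msg.get_all(name.strip(), []):
            if want in got.lower():  # assumption: substring, case-insensitive
                return True
    return False


raw = (
    "From: someone@example.org\n"
    "To: alice@example.com\n"
    "Cc: proj-a@example.com\n"
    "Subject: patch\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
accepted = matches_watchheader(
    msg, ["To:proj-a@example.com", "Cc:proj-a@example.com"]
)
```

Here the alias appears only in Cc, so the To entry does not match but the Cc entry does, which is enough for the message to be accepted.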
2020-04-12 doc: escape internal ">" in listid code snippet
A code snippet in the listid description is incorrectly rendered as "publicinbox.$NAME.watchheader=List-Id:<foo.example.com"> Escape the closing bracket around the List-Id value to avoid this. Also escape the opening bracket for symmetry/readability.
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2020-02-04 doc: spellling fixes for manpages
The wording for publicinbox.nntpserver was awkward, too, and I took this as an opportunity to hopefully clarify it and favor "hostname" for Internet addresses, because we already use "address" to mean "email address" in the config.
2020-01-25 s/news.gmane.org/news.gmane.io/
gmane still has an NNTP server, so update links to point to it. cf. https://lars.ingebrigtsen.no/2020/01/06/whatever-happened-to-news-gmane-org/
2020-01-02 doc: fix a few spelling errors in user-facing docs
Found by codespell, there's a few more in comments and some debatable ones, but user-facing stuff is more important.
2019-11-09 doc: drop a repeated word
2019-11-08 doc: actually document publicinboxwatch.watchspam
Instead of copy-pasting the documentation for `spamcheck'.
2019-10-16 config: support "inboxdir" in addition to "mainrepo"
"mainrepo" was a bad name and an artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-15 mda, watch: wire up List-ID header support
This also adds watchheader tests for -watch, which we never had before :x
2019-10-10 doc: explain publicinbox.<name>.watchheader
It wasn't clear to me exactly what this does -- in particular, what happens if it isn't specified? Does it support multiple values? A very brief explanation can answer both of these questions without making somebody look at the code.
2019-09-15 doc: update config manpage for "publicinbox.grokmanifest"
It's a bit of an esoteric option, but maybe somebody out there can find it useful.
2019-09-09 run update-copyrights from gnulib for 2019
2019-09-09 doc config: document indexlevel directive
It was never documented, before.
2019-06-09 edit: new tool to perform edits
This wrapper around V2Writable->replace provides a user-interface for editing messages as single-message mboxes (or the raw text via $EDITOR).
2019-05-08 doc: use bullet list for wwwlisting options
Otherwise, pod2man complains about "=item 404" not starting with a letter and thinking it's part of a numbered list.
2019-05-05 Merge remote-tracking branch 'origin/wwwlisting'
* origin/wwwlisting:
  www: support listing of inboxes
  start depending on Perl 5.10.1+
2019-04-25 cgit: improve handling of cgit data path
Document `publicinbox.cgitdata' config directive, but allow it to be unspecified and/or missing for installations which do not wish to serve static data at all. For users installing cgit from source to their home directory, we can usually infer the cgit data path based on the cgit.cgi binary path, even.
2019-04-19 www: support listing of inboxes
We will still return a 404 by default to '/' for compatibility with users of Plack::App::Cascade or similar. Inboxes are sorted by modification times to help users detect activity (similar to the /$INBOX/ topic view).

New configuration options:

* publicinbox.wwwlisting - configure the listing type
* publicinbox.<name>.hide - hide a particular inbox from the listing

See changes to public-inbox-config.pod for full descriptions of the new options.

Requested-by: Leah Neukirchen <leah@vuxu.org>
https://public-inbox.org/meta/871sdfzy80.fsf@gmail.com/
2019-04-18 doc: config: fix braino/typo :x
2019-04-15 doc/config: update cgit.cgi scan location
We account for the upstream default location as well as the Debian-installed one.
2019-04-04 cgit: use a dedicated named limiter
I mainly need this to enforce RLIMIT_CPU (and RLIMIT_CORE) when requests come which generate giant, unrealistic diffs. Per-coderepo limiters may be added in the future. But for now, I need to prevent cgit from monopolizing resources on my dinky server.
2019-04-04 qspawn: wire up RLIMIT_* handling to limiters
This allows users to configure RLIMIT_{CORE,CPU,DATA} using our "limiter" config directive when spawning external processes.
2019-04-04 www: wire up cgit as a 404 handler if cgitrc is configured
Requests intended for cgit are unlikely to conflict with requests to inboxes. So we can safely hand those requests off to cgit.cgi.