about summary refs log tree commit homepage
path: root/Documentation/public-inbox-init.pod
DateCommit message (Collapse)
2021-07-22init: allow arbitrary key-values via -c KEY=VALUE
This won't blindly append identical key=values, but allows specifying multiple, different key=value pairs as long as the values are different.
2021-05-17doc: split option variants into separate items
e226f18934eb7291 modified the lei-q manpage so that each variant of an option gets a dedicated =item to make L</--xyz> look nicer and to follow the Perl core documentation. Do the same for the other manpages. Note that this still leaves the variants of an option grouped in one scenario: when a list of options without descriptions is presented as a pointer to another location. Splitting the variants in that case would make it harder for the reader to tell what the distinct options are.
2021-05-04treewide: update to v3 Tor onions
v2 onions are insecure, deprecated and going away. v3 names are unfortunately longer and more difficult to remember, but should be more resistant to attack than v2 ones.
2021-01-01update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-09-20doc: post-1.6 updates, start 1.7
I should've dropped "PENDING" notes before the 1.6 release; they're dropped now, and a note is added to remind my future self to drop them before 1.7.
2020-08-26doc: add some more tuning notes
I've learned a thing or three about btrfs in the past few weeks and remembered some old HDD things, too. The Xapian MultiDatabase problem will need to be addressed for 1.7...
2020-08-20init+index: support --skip-docdata for Xapian
Since we no longer read document data from Xapian, allow users to opt-out of storing it. This breaks compatibility with previous releases of public-inbox, but gives us a ~1.5% space savings on Xapian storage (and associated I/O and page cache pressure reduction).
2020-08-20init: drop -N alias for --skip-artnum
It may be too easily confused for --newsgroup or --ng. This is too rarely used and never made it into a release, so it should be fine.
2020-08-20init: support --newsgroup option
We can reduce the need to edit the config file for NNTP group names this way.
2020-08-10index: cleanup internal variables
Move away from hard-to-read alllowercase naming and favor snake_case or separated-by-dashes. We'll keep `--indexlevel' as-is for now, since it's been around for several releases; but we'll support `--index-level' in the CLI and update our documentation in a few months. We'll also clarify that publicInbox.indexMaxSize is only intended for -index, and not -watch or -mda.
2020-07-14doc: release notes and version info updates
Update release notes with some features in the 1.6 timeline. We'll note the version availability of some command-line options, it may help users who are reading the latest documentation online but running older versions.
2020-06-23init: add --skip-artnum parameter
For archivists with only newer mail archives, this option allows reserving reserve NNTP article numbers for yet-to-be-archived old messages. Indexers will need to be updated to support this feature in future commits. -V1 inboxes will now be initialized with SQLite and Xapian support if this option is used, or if --indexlevel= is specified.
2020-06-23init: add -j / --jobs parameter
On a powerful (by my standards) machine with 16GB RAM and an 7200 RPM HDD marketed for "enterprise" use, indexing a 8.1G (in git) LKML snapshot from Sep 2019 did not finish after 7 days with the default number (3) of Xapian shards (`--jobs=4') and `--batch-size=10m'. Indexing starts off fast, but progressively get slower as contents of the inbox (including Xapian + SQLite DBs) could no longer be cached by the kernel. Once the on-disk size increased, HDD seek contention between the Xapian shard workers slowed the process down to a crawl. With a single shard, it still took around 3.5 days to index on the HDD. That's not good, but it's far better than not finishing after 7 days. So allow unfortunate HDD users to easily specify a single shard on public-inbox-init. For reference, a freshly TRIM-ed low-end TLC SSD on the SATA II bus on the same machine indexes that same snapshot of LKML in ~7 hours with 3 shards and the same 10m batch size. In the past, a higher-end consumer grade MLC SSDs on similar hardware indexed a similarly sized-data set in ~4 hours.
2020-02-06treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2019-10-16config: support "inboxdir" in addition to "mainrepo"
"mainrepo" ws a bad name and artifact from the early days when I intended for there to be a "spamrepo" (now just the ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be especially confusing, since v2 needs at least two git repositories (epoch + all.git) to function and we shouldn't confuse users by having them point to a git repository for v2. Much of our documentation already references "INBOX_DIR" for command-line arguments, so use "inboxdir" as the git-config(1)-friendly variant for that. "mainrepo" remains supported indefinitely for compatibility. Users may need to revert to old versions, or may be referring to old documentation and must not be forced to change config files to account for this change. So if you're using "mainrepo" today, I do NOT recommend changing it right away because other bugs can lurk. Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
2019-10-05doc: add manpage for public-inbox-init(1)
This old command was lacking a manpage, so (finally) create one.