path: root/script/public-inbox-compact
Date        Commit message
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2019-05-23 doc: various updates to reflect current state
-index documentation avoids redundant v1 information and refers readers to the appropriate v1/v2 manpages. Search::Xapian can also be optional, now, as only the PSGI search interface uses it. Favor "INBOX_DIR" where appropriate, since "REPO_DIR" can be confused for code repos which we also support. XAPIAN_FLUSH_THRESHOLD is documented for all relevant bulk commands.
2019-05-23 xcpdb|compact: support some xapian-compact switches
Allow users to specify the --blocksize <B>, --no-full, --fuller options for xapian-compact(1) for fine-tuning compact behavior for low-traffic/inactive inboxes. We won't support --multipass, since it doesn't seem compatible with our requirement to use --no-renumber. We also won't support --single-file, since it only seems intended for totally dead inboxes; and it doesn't seem worth the support overhead when "totally dead" turns out to be a misdiagnosis.
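The switch handling described above can be sketched as follows. This is only an illustration, not the project's actual (Perl) code: the function name `build_compact_argv` and its parameters are hypothetical. It assumes xapian-compact(1) is invoked with source and destination database paths as its final arguments, and that --no-renumber is always passed, as the commit message requires.

```python
from typing import List, Optional


def build_compact_argv(src: str, dst: str,
                       blocksize: Optional[str] = None,
                       no_full: bool = False,
                       fuller: bool = False) -> List[str]:
    """Build an argument vector for xapian-compact(1) (hypothetical helper).

    Only the switches named in the commit message are exposed;
    --multipass and --single-file are deliberately not offered.
    """
    argv = ["xapian-compact", "--no-renumber"]  # always keep docids stable
    if blocksize is not None:
        argv.append(f"--blocksize={blocksize}")
    if no_full:
        argv.append("--no-full")
    if fuller:
        argv.append("--fuller")
    argv.extend([src, dst])
    return argv
```

For example, `build_compact_argv("xap.old", "xap.new", blocksize="4K", fuller=True)` yields a list that could be handed to `subprocess.run`; the sketch itself never executes anything.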
2019-05-23 compact: reuse infrastructure from xcpdb
Since -xcpdb is a superset of -compact, we can reuse much of that code used for driving compact. For compact (only), this is slightly less memory efficient since it requires an extra process per-partition, but we get to prefix the output with the partition name for more readable output.
2019-05-23 admin: hoist out resolve_inboxes for -compact and -index
Both of these index-affecting commands should work similarly on the command-line. public-inbox-index no longer complains about unconfigured ~/.public-inbox/config; but often I found myself being annoyed by that, anyways...
2019-05-23 xapcmd: new module for wrapping Xapian commands
Port public-inbox-compact(1) over to using it, and we will need to wrap copydatabase(1) to ease glass migrations, too.
2019-05-23 doc: document the reason for --no-renumber
We're going to need copydatabase, too
2018-05-11 convert+compact: fix when running without ~/.public-inbox/config
Some users may not have any public-inboxes configured, especially in tests.
2018-04-18 compact: do not merge v2 repos by default
--no-renumber does not allow merging, and merging is not ideal for reindexing, either.
2018-04-07 store less data in the Xapian document
Since we only query the SQLite over DB for OVER/XOVER, we do not need to waste space storing the To/Cc/:bytes/:lines fields or the XNUM term. We only use From/Subject/References/Message-ID/:blob in various places of the PSGI code. For reindexing, we will take advantage of docid stability in "xapian-compact --no-renumber" to ensure duplicates do not show up in search results. Since the PSGI interface is the only consumer of Xapian at the moment, it has no need to search based on NNTP article number.
2018-04-06 ensure Xapian and SQLite are still optional for v1 tests
Xapian is size-intensive and SQLite is not strictly necessary for v1.
2018-04-06 over: use only supported and safe SQLite APIs
Some of this jankiness was from early performance problems and they turned out to be unnecessary measures.
2018-04-05 compact: better handling of over.sqlite3* files
Let's not scare users when they encounter files that are supposed to be there. Then, preserve the journal and pipe.lock, even if they're supposedly unused due to us holding the inbox-wide lock.
2018-04-02 replace Xapian skeleton with SQLite overview DB
This ought to provide better performance and scalability which is less dependent on inbox size. Xapian does not seem optimized for some queries used by the WWW homepage, Atom feeds, XOVER and NEWNEWS NNTP commands. This can actually make Xapian optional for NNTP usage, and allow more functionality to work without Xapian installed. Indexing performance was extremely bad at first, but DBI::Profile helped me optimize away problematic queries.
2018-03-30 v2: respect core.sharedRepository in git configs
Ensure -convert and -compact do not make repositories unreadable on live servers.
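As a rough illustration of what respecting core.sharedRepository involves, the sketch below maps the values described in git-config(1) to file permission bits. It is an approximation under stated assumptions, not the project's actual (Perl) code: the function name `shared_repository_mode` and the constants are hypothetical, and the 0660/0664 modes are the commonly documented meanings of "group" and "all".

```python
from typing import Optional

PERM_GROUP = 0o660      # assumed meaning of "group" / "true"
PERM_EVERYBODY = 0o664  # assumed meaning of "all" / "world" / "everybody"


def shared_repository_mode(value: Optional[str]) -> Optional[int]:
    """Map a core.sharedRepository value to file permission bits.

    Returns None when the process umask should be used unchanged.
    Hypothetical helper for illustration only.
    """
    if value is None:
        return None
    v = value.strip().lower()
    if v in ("", "umask", "false", "0"):
        return None
    if v in ("group", "true", "1"):
        return PERM_GROUP
    if v in ("all", "world", "everybody", "2"):
        return PERM_EVERYBODY
    try:
        mode = int(v, 8)  # explicit octal such as "0640"
    except ValueError:
        raise ValueError(f"unrecognized core.sharedRepository value: {value!r}")
    return mode & 0o666  # keep only read/write bits for plain files
```

A tool rewriting index files on a live server could apply the returned mode to each new file (for instance with `os.chmod`) so that readers in the shared group keep access after the old files are replaced.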
2018-03-29 public-inbox-compact: new tool for driving xapian-compact
Having multiple Xapian partitions is mostly pointless after the initial import. We can compact all the partitions into one while keeping the skeleton separate.