path: root/lib/PublicInbox/Xapcmd.pm
Date Commit message
2019-05-23 xapcmd: cleanup on interrupted xcpdb "--compact"
We should not have leftover junk on interrupted invocations.
2019-05-23 xcpdb|compact: support some xapian-compact switches
Allow users to specify the --blocksize <B>, --no-full, and --fuller options of xapian-compact(1) to fine-tune compaction for low-traffic/inactive inboxes. We won't support --multipass, since it doesn't seem compatible with our requirement to use --no-renumber. We also won't support --single-file, since it only seems intended for totally dead inboxes, and it doesn't seem worth the support overhead when "totally dead" turns out to be a misdiagnosis.
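The switch handling described above can be sketched as a small argv builder. This is only an illustration in Python, not the module's Perl code: the function name `compact_argv` and its parameters are hypothetical, while the flags themselves (--blocksize, --no-full, --fuller, --no-renumber) are the ones named in the commit message.

```python
def compact_argv(src, dst, blocksize=None, no_full=False, fuller=False):
    """Build a xapian-compact(1) command line (hypothetical helper).

    --no-renumber is always passed, per the requirement stated in the
    commit message; --multipass and --single-file are deliberately
    not offered.
    """
    argv = ["xapian-compact", "--no-renumber"]
    if blocksize is not None:
        argv.append(f"--blocksize={blocksize}")
    if no_full:
        argv.append("--no-full")
    if fuller:
        argv.append("--fuller")
    # Source database first, destination last.
    argv += [src, dst]
    return argv
```

Keeping --no-renumber unconditional makes it impossible for a caller to drop it by accident.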
2019-05-23 compact: reuse infrastructure from xcpdb
Since -xcpdb is a superset of -compact, we can reuse much of its code for driving compact. For compact (only), this is slightly less memory efficient since it requires an extra process per partition, but we get to prefix the output with the partition name for more readable output.
2019-05-23 xcpdb: remove temporary directories on aborts
Clean up temporary directories on common termination signals (INT, HUP, PIPE, TERM), but only if they are not in the process of being committed via a rename() sequence.
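A minimal Python sketch of this pattern, assuming a POSIX system. The class `TmpdirGuard` and its `committing` flag are hypothetical names for illustration; the actual module is written in Perl and may be structured differently.

```python
import os
import shutil
import signal
import sys
import tempfile

class TmpdirGuard:
    """Track temp dirs and remove them on abort, unless a commit is under way."""

    def __init__(self):
        self.tmpdirs = []
        self.committing = False  # set True around the rename() sequence

    def new_tmpdir(self, **kwargs):
        path = tempfile.mkdtemp(**kwargs)
        self.tmpdirs.append(path)
        return path

    def cleanup(self):
        # Mid-commit, old and new data may be interleaved: leave it alone.
        if self.committing:
            return False
        for path in self.tmpdirs:
            if os.path.isdir(path):
                shutil.rmtree(path, ignore_errors=True)
        self.tmpdirs.clear()
        return True

    def install(self):
        """Install handlers for the signals named in the commit message."""
        def handler(signum, frame):
            self.cleanup()
            sys.exit(128 + signum)
        for sig in (signal.SIGINT, signal.SIGHUP, signal.SIGPIPE, signal.SIGTERM):
            signal.signal(sig, handler)
```

A caller would set `committing = True` just before the first rename() and clear it only once the sequence has finished.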
2019-05-23 xcpdb: show re-indexing progress
Emit information about reindexing git revision ranges when used with xcpdb. Additionally, distinguish Xapian copy output from v2 git epoch counting by increasing directory context info. For now, v1 batches are emitted. v2 indexing is still missing progress reporting for batches, as the data structures for reindexing would benefit from a refactoring first. This does not currently affect the use of public-inbox-index, but may in the future.
2019-05-23 xapcmd: use "print STDERR" for progress reporting
`warn' is reserved for actual warnings, as it respects $SIG{__WARN__} and we rely on that override to print message context information when we are indexing.
2019-05-23 xapcmd: avoid EXDEV when finalizing changes
By creating temporary directories as deep as possible, we can allow v2 repositories to have `xap$SCHEMA_VERSION' (e.g. `xap15') reside on a separate FS. We also check st_dev ahead-of-time to avoid doing work which will fail with EXDEV. Of course, another process may still move/change things around.
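The st_dev check can be illustrated with a short Python sketch. The helper names `same_filesystem` and `make_tmpdir_beside` and the `xapcmd-` prefix are hypothetical; the sketch only shows the general idea of placing the temporary directory next to its destination and comparing device numbers before doing any work.

```python
import os
import tempfile

def same_filesystem(path_a, path_b):
    """True if both paths report the same st_dev, so rename() cannot hit EXDEV."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

def make_tmpdir_beside(target_dir):
    """Create a temp dir in target_dir's parent, i.e. as deep as possible.

    Raises OSError up front if the temp dir and the target turn out to be
    on different filesystems (for example, if target_dir is a mount point).
    """
    parent = os.path.dirname(os.path.abspath(target_dir))
    tmp = tempfile.mkdtemp(prefix="xapcmd-", dir=parent)
    if not same_filesystem(tmp, target_dir):
        os.rmdir(tmp)
        raise OSError("tempdir and target are on different filesystems")
    return tmp
```

As the commit message notes, such a check is only advisory: another process may still move things around between the check and the rename().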
2019-05-23 xcpdb: cleanup error handling and diagnosis
Running a full "public-inbox-index --reindex" in parallel with "public-inbox-xcpdb" on the same inbox can still cause problems, though.
2019-05-23 xcpdb: implement progress reporting
Copying an entire Xapian DB is horribly slow whether it's done via Perl or copydatabase(1). So displaying some progress indication is good for user experience. While we're at it, prefix xapian-compact output, too; since parallel processes end up clobbering each other.
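Prefixing child output so that parallel processes remain distinguishable might look roughly like the following Python sketch. The function `run_prefixed` is a hypothetical illustration, not the module's actual Perl implementation.

```python
import subprocess
import sys

def run_prefixed(cmd, prefix, out=None):
    """Run cmd and copy its output line by line, tagging each line with prefix."""
    if out is None:
        out = sys.stderr  # progress goes to stderr, not stdout
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into one pipe
        text=True,
    )
    for line in proc.stdout:
        out.write(f"{prefix}: {line}")
    return proc.wait()
```

With one such call per partition, each emitted line carries the partition name even when several children write at once.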
2019-05-23 xcpdb: use fine-grained locking
Copying an entire Xapian DB takes a long time, so update our reindexing code to support partial reindexing, snapshot the pre-copydatabase git revisions, perform the lengthy copy, and do a partial reindex when the copy + renames are done.
2019-05-23 xapcmd: xcpdb supports compaction
To minimize the delay on active inboxes, it's actually ideal to run xapian-compact at the end of the per-partition cpdb process, since the new DB isn't accessible yet and so we don't have to deal with lock contention with -mda or -watch processes. The downside is the temporary file overhead required (3x instead of 2x).
2019-05-23 xcpdb: implement using Perl bindings
By avoiding copydatabase(1) entirely, we can make further changes to avoid locking the entire inbox for a long operation and switch to fine-grained locking.
2019-05-23 xapcmd: do not cleanup on errors
We move the old directory into the new directory, so avoid the situation where a bug or error could cause the tempdir cleanup to run and destroy both our old and new directories.
2019-05-23 xapcmd: support spawn options
copydatabase(1) is exceptionally noisy and its output is confusing when run in parallel. Support redirects at least, and env while we're at it to give us future options. We can also stuff a -jobs parameter into the options to limit parallelism since it can be useful for low-priority upgrade jobs.
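As a rough illustration of spawn options with redirects, extra environment variables, and a jobs limit, here is a Python sketch. The function `run_all` and its parameter names are hypothetical, and it uses a thread pool to drive the child processes, which is an assumption of this sketch rather than how the Perl module works.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_all(cmds, jobs=None, env=None, stdout=None):
    """Run each command in cmds, at most `jobs` at a time.

    env:    extra variables layered over the current environment
    stdout: redirect target, e.g. subprocess.DEVNULL to silence noisy tools
    Returns the exit codes in the same order as cmds.
    """
    child_env = dict(os.environ, **(env or {}))

    def run_one(cmd):
        return subprocess.run(cmd, env=child_env, stdout=stdout).returncode

    # max_workers bounds how many children run concurrently.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, cmds))
```

Passing `jobs=1` serializes the work, which suits the low-priority upgrade jobs mentioned above.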
2019-05-23 xapcmd: new module for wrapping Xapian commands
Port public-inbox-compact(1) over to using it, and we will need to wrap copydatabase(1) to ease glass migrations, too.