From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 00/26] xcpdb: ease Xapian DB format migrations
Date: Thu, 23 May 2019 09:36:38 +0000
Message-Id: <20190523093704.18367-1-e@80x24.org>

I've noticed performance problems in Xapian's old chert backend
which seem to be alleviated by the new glass backend, particularly
for phrase searches.

Unfortunately, the tool distributed with Xapian for updating DB
formats, copydatabase(1), is extremely slow, and blocking updates
for hours at a time to perform the migration is not acceptable.
(That's right, "copydatabase" is NOT a Postgres command!)

So, I've written "public-inbox-xcpdb" and gotten it to perform the
bulk copy operation without holding inbox.lock, and to deal
gracefully with Xapian DB modifications made while the copy runs.

xcpdb is still slow, but I've (finally!) implemented partial
reindexing, which lets it minimize the lock time and avoid stalling
-mda or -watch processes while it is working.

There are a bunch of cleanups along the way, too; and this should
make future changes to repartition the Xapian DB on existing v2
inboxes easier.
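The "bulk copy without the lock, then a short locked catch-up" idea described above can be sketched roughly as follows. This is an illustrative Python toy, not the Perl code in this series: `Store`, `migrate`, and `on_bulk_copied` are hypothetical names, the `threading.Lock` merely stands in for inbox.lock, and the model assumes an append-only store (a real Xapian DB also has to cope with deletions and other modifications).

```python
import threading


class Store:
    """Toy stand-in for an index: an append-only list of documents."""

    def __init__(self, docs=None):
        self.docs = list(docs or [])
        self.lock = threading.Lock()  # plays the role of inbox.lock

    def append(self, doc):
        # Writers (think -mda or -watch) take the lock only briefly.
        with self.lock:
            self.docs.append(doc)


def migrate(src, on_bulk_copied=None):
    """Copy src into a new Store, locking src only for the catch-up step.

    Returns the new store and the number of documents that had to be
    picked up during the locked catch-up.
    """
    # Phase 1: slow bulk copy of a snapshot, with no lock held, so
    # writers are free to keep appending to src in the meantime.
    snapshot_len = len(src.docs)
    dst = Store(src.docs[:snapshot_len])

    # Hypothetical hook: lets a demo simulate a writer that adds
    # documents while the bulk copy is in progress.
    if on_bulk_copied is not None:
        on_bulk_copied()

    # Phase 2: short critical section; index only what arrived after
    # the snapshot instead of redoing the whole copy.
    with src.lock:
        missed = src.docs[snapshot_len:]
        dst.docs.extend(missed)
    return dst, len(missed)


if __name__ == "__main__":
    src = Store(["msg1", "msg2", "msg3"])
    dst, caught_up = migrate(src, on_bulk_copied=lambda: src.append("msg4"))
    print(dst.docs, caught_up)
```

The point of the split is that the time spent under the lock scales with the number of changes made during the copy, not with the size of the whole DB.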

Eric Wong (26):
  t/convert-compact: skip on missing xapian-compact(1)
  v1writable: retire in favor of InboxWritable
  doc: document the reason for --no-renumber
  search: reenable phrase search on non-chert Xapian
  xapcmd: new module for wrapping Xapian commands
  admin: hoist out resolve_inboxes for -compact and -index
  xapcmd: support spawn options
  xcpdb: new tool which wraps Xapian's copydatabase(1)
  xapcmd: do not cleanup on errors
  admin: move index_inbox over
  xcpdb: implement using Perl bindings
  xapcmd: xcpdb supports compaction
  v2writable: hoist out log_range sub for readability
  xcpdb: use fine-grained locking
  xcpdb: implement progress reporting
  xcpdb: cleanup error handling and diagnosis
  xapcmd: avoid EXDEV when finalizing changes
  doc: xcpdb: update to reflect the current state
  xapcmd: use "print STDERR" for progress reporting
  xcpdb: show re-indexing progress
  xcpdb: remove temporary directories on aborts
  compact: reuse infrastructure from xcpdb
  xcpdb|compact: support some xapian-compact switches
  xapcmd: cleanup on interrupted xcpdb "--compact"
  xcpdb|compact: support --jobs/-j flag like gmake(1)
  xapcmd: do not reset %SIG until last Xtmpdir is done

 Documentation/include.mk                 |   6 +-
 Documentation/public-inbox-v1-format.pod |   4 +
 Documentation/public-inbox-v2-format.pod |   4 +
 Documentation/public-inbox-xcpdb.pod     |  57 ++++
 MANIFEST                                 |   4 +-
 lib/PublicInbox/Admin.pm                 |  66 ++++
 lib/PublicInbox/InboxWritable.pm         |  35 ++-
 lib/PublicInbox/Search.pm                |  48 +--
 lib/PublicInbox/SearchIdx.pm             |  34 ++-
 lib/PublicInbox/V1Writable.pm            |  34 ---
 lib/PublicInbox/V2Writable.pm            | 109 ++++---
 lib/PublicInbox/Xapcmd.pm                | 370 +++++++++++++++++++++++
 script/public-inbox-compact              | 102 +------
 script/public-inbox-index                | 102 +------
 script/public-inbox-init                 |  13 +-
 script/public-inbox-xcpdb                |  19 ++
 t/cgi.t                                  |   4 +-
 t/convert-compact.t                      |   4 +
 t/indexlevels-mirror.t                   |  27 +-
 t/init.t                                 |   4 +-
 t/nntpd.t                                |  15 +-
 t/search.t                               |   1 +
 t/v2mirror.t                             |   1 +
 23 files changed, 740 insertions(+), 323 deletions(-)
 create mode 100644 Documentation/public-inbox-xcpdb.pod
 delete mode 100644 lib/PublicInbox/V1Writable.pm
 create mode 100644 lib/PublicInbox/Xapcmd.pm
 create mode 100755 script/public-inbox-xcpdb

-- 
EW