user/dev discussion of public-inbox itself
* [PATCH 00/26] xcpdb: ease Xapian DB format migrations
@ 2019-05-23  9:36 Eric Wong
  2019-05-23  9:36 ` [PATCH 01/26] t/convert-compact: skip on missing xapian-compact(1) Eric Wong
                   ` (26 more replies)
  0 siblings, 27 replies; 28+ messages in thread
From: Eric Wong @ 2019-05-23  9:36 UTC
  To: meta

I've noticed performance problems in Xapian's old chert
backend, particularly with phrase searches, which seem to be
alleviated by the new glass backend.

Unfortunately, copydatabase(1), the tool distributed with Xapian
for updating DB formats, is extremely slow, and blocking updates
for hours at a time to perform the migration is not acceptable.
(That's right, "copydatabase" is NOT a Postgres command!)

So, I've written "public-inbox-xcpdb", which performs the bulk
copy operation without holding inbox.lock and deals gracefully
with Xapian DB modifications.  xcpdb is still slow, but I've
(finally!) implemented partial reindexing so it can minimize the
lock time and not stall -mda or -watch processes while it is
working.
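
To illustrate the general idea (this is only a toy sketch, not the
code in this series): the bulk copy runs without the lock, and only
the changes that arrived during the copy are replayed while the lock
is held.  ToyIndex, its changelog, and copy_with_short_lock are
hypothetical names made up for this example; how xcpdb actually
tracks what needs reindexing may differ.

```python
import threading


class ToyIndex:
    """Hypothetical stand-in for a Xapian DB: maps docid -> text."""

    def __init__(self):
        self.docs = {}
        self.lock = threading.Lock()  # plays the role of inbox.lock
        self.changelog = []           # ordered (op, docid, text) records

    def write(self, docid, text):
        with self.lock:
            self.docs[docid] = text
            self.changelog.append(("put", docid, text))

    def delete(self, docid):
        with self.lock:
            self.docs.pop(docid, None)
            self.changelog.append(("del", docid, None))


def copy_with_short_lock(src, on_bulk_done=None):
    """Return a copy of src.docs, holding src.lock only for the catch-up."""
    # Remember the log position BEFORE copying, so that anything which
    # changes during the copy is replayed afterwards.
    mark = len(src.changelog)

    # Phase 1: slow bulk copy, done without holding the lock.
    new = dict(src.docs)

    # Test hook (assumption): simulates writers that run during the copy.
    if on_bulk_done is not None:
        on_bulk_done()

    # Phase 2: short critical section, replaying only the newer changes.
    # Replaying is idempotent, so a change seen by both phases is harmless.
    with src.lock:
        for op, docid, text in src.changelog[mark:]:
            if op == "put":
                new[docid] = text
            else:
                new.pop(docid, None)
    return new
```

The time spent under the lock is proportional to the number of
changes made during the copy rather than to the size of the whole
DB, which is why writers are not stalled for long.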

There are a bunch of cleanups along the way, too, and the series
should make future changes to repartition the Xapian DB on
existing v2 inboxes easier.

Eric Wong (26):
  t/convert-compact: skip on missing xapian-compact(1)
  v1writable: retire in favor of InboxWritable
  doc: document the reason for --no-renumber
  search: reenable phrase search on non-chert Xapian
  xapcmd: new module for wrapping Xapian commands
  admin: hoist out resolve_inboxes for -compact and -index
  xapcmd: support spawn options
  xcpdb: new tool which wraps Xapian's copydatabase(1)
  xapcmd: do not cleanup on errors
  admin: move index_inbox over
  xcpdb: implement using Perl bindings
  xapcmd: xcpdb supports compaction
  v2writable: hoist out log_range sub for readability
  xcpdb: use fine-grained locking
  xcpdb: implement progress reporting
  xcpdb: cleanup error handling and diagnosis
  xapcmd: avoid EXDEV when finalizing changes
  doc: xcpdb: update to reflect the current state
  xapcmd: use "print STDERR" for progress reporting
  xcpdb: show re-indexing progress
  xcpdb: remove temporary directories on aborts
  compact: reuse infrastructure from xcpdb
  xcpdb|compact: support some xapian-compact switches
  xapcmd: cleanup on interrupted xcpdb "--compact"
  xcpdb|compact: support --jobs/-j flag like gmake(1)
  xapcmd: do not reset %SIG until last Xtmpdir is done

 Documentation/include.mk                 |   6 +-
 Documentation/public-inbox-v1-format.pod |   4 +
 Documentation/public-inbox-v2-format.pod |   4 +
 Documentation/public-inbox-xcpdb.pod     |  57 ++++
 MANIFEST                                 |   4 +-
 lib/PublicInbox/Admin.pm                 |  66 ++++
 lib/PublicInbox/InboxWritable.pm         |  35 ++-
 lib/PublicInbox/Search.pm                |  48 +--
 lib/PublicInbox/SearchIdx.pm             |  34 ++-
 lib/PublicInbox/V1Writable.pm            |  34 ---
 lib/PublicInbox/V2Writable.pm            | 109 ++++---
 lib/PublicInbox/Xapcmd.pm                | 370 +++++++++++++++++++++++
 script/public-inbox-compact              | 102 +------
 script/public-inbox-index                | 102 +------
 script/public-inbox-init                 |  13 +-
 script/public-inbox-xcpdb                |  19 ++
 t/cgi.t                                  |   4 +-
 t/convert-compact.t                      |   4 +
 t/indexlevels-mirror.t                   |  27 +-
 t/init.t                                 |   4 +-
 t/nntpd.t                                |  15 +-
 t/search.t                               |   1 +
 t/v2mirror.t                             |   1 +
 23 files changed, 740 insertions(+), 323 deletions(-)
 create mode 100644 Documentation/public-inbox-xcpdb.pod
 delete mode 100644 lib/PublicInbox/V1Writable.pm
 create mode 100644 lib/PublicInbox/Xapcmd.pm
 create mode 100755 script/public-inbox-xcpdb

-- 
EW

Thread overview: 28+ messages
2019-05-23  9:36 [PATCH 00/26] xcpdb: ease Xapian DB format migrations Eric Wong
2019-05-23  9:36 ` [PATCH 01/26] t/convert-compact: skip on missing xapian-compact(1) Eric Wong
2019-05-23  9:36 ` [PATCH 02/26] v1writable: retire in favor of InboxWritable Eric Wong
2019-05-23  9:36 ` [PATCH 03/26] doc: document the reason for --no-renumber Eric Wong
2019-05-23  9:36 ` [PATCH 04/26] search: reenable phrase search on non-chert Xapian Eric Wong
2019-05-23  9:36 ` [PATCH 05/26] xapcmd: new module for wrapping Xapian commands Eric Wong
2019-05-23  9:36 ` [PATCH 06/26] admin: hoist out resolve_inboxes for -compact and -index Eric Wong
2019-05-23  9:36 ` [PATCH 07/26] xapcmd: support spawn options Eric Wong
2019-05-23  9:36 ` [PATCH 08/26] xcpdb: new tool which wraps Xapian's copydatabase(1) Eric Wong
2019-05-23  9:36 ` [PATCH 09/26] xapcmd: do not cleanup on errors Eric Wong
2019-05-23  9:36 ` [PATCH 10/26] admin: move index_inbox over Eric Wong
2019-05-23  9:36 ` [PATCH 11/26] xcpdb: implement using Perl bindings Eric Wong
2019-05-23  9:36 ` [PATCH 12/26] xapcmd: xcpdb supports compaction Eric Wong
2019-05-23  9:36 ` [PATCH 13/26] v2writable: hoist out log_range sub for readability Eric Wong
2019-05-23  9:36 ` [PATCH 14/26] xcpdb: use fine-grained locking Eric Wong
2019-05-23  9:36 ` [PATCH 15/26] xcpdb: implement progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 16/26] xcpdb: cleanup error handling and diagnosis Eric Wong
2019-05-23  9:36 ` [PATCH 17/26] xapcmd: avoid EXDEV when finalizing changes Eric Wong
2019-05-23  9:36 ` [PATCH 18/26] doc: xcpdb: update to reflect the current state Eric Wong
2019-05-23  9:36 ` [PATCH 19/26] xapcmd: use "print STDERR" for progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 20/26] xcpdb: show re-indexing progress Eric Wong
2019-05-23  9:36 ` [PATCH 21/26] xcpdb: remove temporary directories on aborts Eric Wong
2019-05-23  9:37 ` [PATCH 22/26] compact: reuse infrastructure from xcpdb Eric Wong
2019-05-23  9:37 ` [PATCH 23/26] xcpdb|compact: support some xapian-compact switches Eric Wong
2019-05-23  9:37 ` [PATCH 24/26] xapcmd: cleanup on interrupted xcpdb "--compact" Eric Wong
2019-05-23  9:37 ` [PATCH 25/26] xcpdb|compact: support --jobs/-j flag like gmake(1) Eric Wong
2019-05-23  9:37 ` [PATCH 26/26] xapcmd: do not reset %SIG until last Xtmpdir is done Eric Wong
2019-05-23 10:37 ` [PATCH 27/26] doc: various updates to reflect current state Eric Wong

Archives are clonable:
	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git