user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 08/26] xcpdb: new tool which wraps Xapian's copydatabase(1)
Date: Thu, 23 May 2019 09:36:46 +0000
Message-ID: <20190523093704.18367-9-e@80x24.org>
In-Reply-To: <20190523093704.18367-1-e@80x24.org>

copydatabase(1) is an existing Xapian tool which is the
recommended way to upgrade existing DBs to the latest Xapian
database format (currently "glass" for stable/released
versions).  Our use of Xapian relies on preserving document IDs,
so we'll wrap it like we do xapian-compact(1) and use the
"--no-renumber" switch.

I could not name the tool "public-inbox-copydatabase" since it
would be ambiguous as to which DB it's actually copying.  So, I
abbreviated the suffix to "xcpdb" (Xapian CoPy DataBase), which
I hope is acceptable and unambiguous.
---
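Editorial note: for readers unfamiliar with copydatabase(1), the
following is a minimal standalone sketch of the kind of invocation
the new script delegates to PublicInbox::Xapcmd::run.  It is not the
code from this series: copydb_argv and run_quiet are hypothetical
helper names, the source and destination paths are placeholders
supplied by the caller, and the locking and temporary-directory
handling done by Xapcmd is omitted.  It only illustrates passing
"--no-renumber" and discarding stdout, as the $rdr hash in the
script does.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not part of this patch): build the argv for
# copydatabase(1).  --no-renumber preserves document IDs in the copy.
sub copydb_argv {
	my ($src, $dst) = @_;
	return ('copydatabase', '--no-renumber', $src, $dst);
}

# Hypothetical helper: run a command with stdout sent to /dev/null
# and return its exit code (death by signal is not distinguished).
sub run_quiet {
	my (@argv) = @_;
	my $pid = fork;
	defined $pid or die "fork failed: $!\n";
	if ($pid == 0) {
		open STDOUT, '>', '/dev/null' or
			die "failed to open /dev/null: $!\n";
		exec { $argv[0] } @argv or die "exec $argv[0] failed: $!\n";
	}
	waitpid($pid, 0);
	return $? >> 8;
}

# Only copy when run as a script with SRC_DB and DST_DB arguments
if (!caller && @ARGV == 2) {
	exit run_quiet(copydb_argv(@ARGV));
}
```

Keeping argv construction separate from spawning makes the
"--no-renumber" requirement easy to check without having Xapian
installed.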
 Documentation/include.mk             |  6 ++--
 Documentation/public-inbox-xcpdb.pod | 51 ++++++++++++++++++++++++++++
 MANIFEST                             |  2 ++
 script/public-inbox-xcpdb            | 18 ++++++++++
 t/indexlevels-mirror.t               | 22 ++++++++++++
 5 files changed, 97 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/public-inbox-xcpdb.pod
 create mode 100755 script/public-inbox-xcpdb

diff --git a/Documentation/include.mk b/Documentation/include.mk
index 6415338..27d6ea6 100644
--- a/Documentation/include.mk
+++ b/Documentation/include.mk
@@ -26,11 +26,13 @@ podtext = $(PODTEXT) $(PODTEXT_OPTS)
 
 # MakeMaker only seems to support manpage sections 1 and 3...
 m1 =
-m1 += public-inbox-mda
+m1 += public-inbox-compact
 m1 += public-inbox-httpd
+m1 += public-inbox-index
+m1 += public-inbox-mda
 m1 += public-inbox-nntpd
 m1 += public-inbox-watch
-m1 += public-inbox-index
+m1 += public-inbox-xcpdb
 m5 =
 m5 += public-inbox-config
 m5 += public-inbox-v1-format
diff --git a/Documentation/public-inbox-xcpdb.pod b/Documentation/public-inbox-xcpdb.pod
new file mode 100644
index 0000000..4ff5186
--- /dev/null
+++ b/Documentation/public-inbox-xcpdb.pod
@@ -0,0 +1,51 @@
+=head1 NAME
+
+public-inbox-xcpdb - copy Xapian DBs (for format upgrades)
+
+=head1 SYNOPSIS
+
+	public-inbox-xcpdb INBOX_DIR
+
+=head1 DESCRIPTION
+
+public-inbox-xcpdb is a wrapper for L<copydatabase(1)> for
+upgrading to the latest database format supported by Xapian
+(e.g. "glass" or "honey").
+
+It locks the inbox and prevents other processes such as
+L<public-inbox-watch(1)> and L<public-inbox-mda(1)> from
+writing while it operates.
+
+This is intended for upgrading the database format used by
+Xapian.  It DOES NOT upgrade the schema used by the
+public-inbox search interface (see L<public-inbox-index(1)>).
+
+=head1 ENVIRONMENT
+
+=over 8
+
+=item PI_CONFIG
+
+The default config file, normally "~/.public-inbox/config".
+See L<public-inbox-config(5)>
+
+=back
+
+=head1 UPGRADING
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2019 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<copydatabase(1)>, L<public-inbox-index(1)>
diff --git a/MANIFEST b/MANIFEST
index dfc1f66..efd5658 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -20,6 +20,7 @@ Documentation/public-inbox-overview.pod
 Documentation/public-inbox-v1-format.pod
 Documentation/public-inbox-v2-format.pod
 Documentation/public-inbox-watch.pod
+Documentation/public-inbox-xcpdb.pod
 Documentation/standards.perl
 Documentation/txt2pre
 HACKING
@@ -154,6 +155,7 @@ script/public-inbox-mda
 script/public-inbox-nntpd
 script/public-inbox-purge
 script/public-inbox-watch
+script/public-inbox-xcpdb
 script/public-inbox.cgi
 scripts/dc-dlvr
 scripts/dc-dlvr.pre
diff --git a/script/public-inbox-xcpdb b/script/public-inbox-xcpdb
new file mode 100755
index 0000000..cbf9f55
--- /dev/null
+++ b/script/public-inbox-xcpdb
@@ -0,0 +1,18 @@
+#!/usr/bin/perl -w
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# xcpdb: Xapian copy database, a wrapper around Xapian's copydatabase(1)
+use PublicInbox::InboxWritable;
+use PublicInbox::Xapcmd;
+use PublicInbox::Admin;
+PublicInbox::Admin::require_or_die('-search');
+my $usage = "Usage: public-inbox-xcpdb INBOX_DIR\n";
+my @ibxs = PublicInbox::Admin::resolve_inboxes(\@ARGV) or die $usage;
+my $cmd = [qw(copydatabase --no-renumber)];
+open my $null, '>', '/dev/null' or die "failed to open /dev/null: $!\n";
+my $rdr = { 1 => fileno($null) };
+foreach (@ibxs) {
+	my $ibx = PublicInbox::InboxWritable->new($_);
+	# we rely on --no-renumber to keep docids synched to NNTP
+	PublicInbox::Xapcmd::run($ibx, $cmd, undef, $rdr);
+}
diff --git a/t/indexlevels-mirror.t b/t/indexlevels-mirror.t
index d124c75..61053b6 100644
--- a/t/indexlevels-mirror.t
+++ b/t/indexlevels-mirror.t
@@ -18,6 +18,7 @@ foreach my $mod (qw(DBD::SQLite)) {
 
 my $path = 'blib/script';
 my $index = "$path/public-inbox-index";
+my $xcpdb = "$path/public-inbox-xcpdb";
 
 my $mime = PublicInbox::MIME->create(
 	header => [
@@ -108,6 +109,13 @@ sub import_index_incremental {
 	ok($im->remove($mime), '2nd message removed');
 	$im->done;
 
+	if ($level ne 'basic') {
+		is(system($xcpdb, $mirror), 0, "v$v xcpdb OK");
+		delete $ro_mirror->{$_} for (qw(over search));
+		($nr, $msgs) = $ro_mirror->search->query('m:m@2');
+		is($nr, 1, "v$v found m\@2 via Xapian on $level");
+	}
+
 	# sync the mirror
 	is(system('git', "--git-dir=$fetch_dir", qw(fetch -q)), 0, 'fetch OK');
 	is(system($index, $mirror), 0, "v$v index mirror again OK");
@@ -120,6 +128,10 @@ sub import_index_incremental {
 		is_deeply([glob("$ibx->{mainrepo}/xap*/?/")], [],
 			 'no Xapian partition directories for v2 basic');
 	}
+	if ($level ne 'basic') {
+		($nr, $msgs) = $ro_mirror->search->reopen->query('m:m@2');
+		is($nr, 0, "v$v m\@2 gone from Xapian in mirror on $level");
+	}
 }
 
 # we can probably cull some other tests and put full/medium tests, here
@@ -131,4 +143,14 @@ for my $level (qw(basic)) {
 	}
 }
 
+SKIP: {
+	require PublicInbox::Search;
+	PublicInbox::Search::load_xapian() or skip 'Search::Xapian missing', 2;
+	for my $v (1..2) {
+		subtest("v$v indexlevel=medium" => sub {
+			import_index_incremental($v, 'medium');
+		})
+	}
+}
+
 done_testing();
-- 
EW



Thread overview: 28+ messages
2019-05-23  9:36 [PATCH 00/26] xcpdb: ease Xapian DB format migrations Eric Wong
2019-05-23  9:36 ` [PATCH 01/26] t/convert-compact: skip on missing xapian-compact(1) Eric Wong
2019-05-23  9:36 ` [PATCH 02/26] v1writable: retire in favor of InboxWritable Eric Wong
2019-05-23  9:36 ` [PATCH 03/26] doc: document the reason for --no-renumber Eric Wong
2019-05-23  9:36 ` [PATCH 04/26] search: reenable phrase search on non-chert Xapian Eric Wong
2019-05-23  9:36 ` [PATCH 05/26] xapcmd: new module for wrapping Xapian commands Eric Wong
2019-05-23  9:36 ` [PATCH 06/26] admin: hoist out resolve_inboxes for -compact and -index Eric Wong
2019-05-23  9:36 ` [PATCH 07/26] xapcmd: support spawn options Eric Wong
2019-05-23  9:36 ` [PATCH 08/26] xcpdb: new tool which wraps Xapian's copydatabase(1) Eric Wong [this message]
2019-05-23  9:36 ` [PATCH 09/26] xapcmd: do not cleanup on errors Eric Wong
2019-05-23  9:36 ` [PATCH 10/26] admin: move index_inbox over Eric Wong
2019-05-23  9:36 ` [PATCH 11/26] xcpdb: implement using Perl bindings Eric Wong
2019-05-23  9:36 ` [PATCH 12/26] xapcmd: xcpdb supports compaction Eric Wong
2019-05-23  9:36 ` [PATCH 13/26] v2writable: hoist out log_range sub for readability Eric Wong
2019-05-23  9:36 ` [PATCH 14/26] xcpdb: use fine-grained locking Eric Wong
2019-05-23  9:36 ` [PATCH 15/26] xcpdb: implement progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 16/26] xcpdb: cleanup error handling and diagnosis Eric Wong
2019-05-23  9:36 ` [PATCH 17/26] xapcmd: avoid EXDEV when finalizing changes Eric Wong
2019-05-23  9:36 ` [PATCH 18/26] doc: xcpdb: update to reflect the current state Eric Wong
2019-05-23  9:36 ` [PATCH 19/26] xapcmd: use "print STDERR" for progress reporting Eric Wong
2019-05-23  9:36 ` [PATCH 20/26] xcpdb: show re-indexing progress Eric Wong
2019-05-23  9:36 ` [PATCH 21/26] xcpdb: remove temporary directories on aborts Eric Wong
2019-05-23  9:37 ` [PATCH 22/26] compact: reuse infrastructure from xcpdb Eric Wong
2019-05-23  9:37 ` [PATCH 23/26] xcpdb|compact: support some xapian-compact switches Eric Wong
2019-05-23  9:37 ` [PATCH 24/26] xapcmd: cleanup on interrupted xcpdb "--compact" Eric Wong
2019-05-23  9:37 ` [PATCH 25/26] xcpdb|compact: support --jobs/-j flag like gmake(1) Eric Wong
2019-05-23  9:37 ` [PATCH 26/26] xapcmd: do not reset %SIG until last Xtmpdir is done Eric Wong
2019-05-23 10:37 ` [PATCH 27/26] doc: various updates to reflect current state Eric Wong
