user/dev discussion of public-inbox itself
* [PATCH 2/4] xapcmd: preserve indexlevel based on the destination
From: Eric Wong @ 2019-06-14  3:03 UTC (permalink / raw)
  To: meta

To support M:N resharding, we need to ensure we store the
indexlevel in the destination shard, rather than the
originating one.
---
 lib/PublicInbox/Xapcmd.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Xapcmd.pm b/lib/PublicInbox/Xapcmd.pm
index dad080c..7204a91 100644
--- a/lib/PublicInbox/Xapcmd.pm
+++ b/lib/PublicInbox/Xapcmd.pm
@@ -145,7 +145,8 @@ sub run {
 	if ($v == 1) {
 		my $old_parent = dirname($old);
 		same_fs_or_die($old_parent, $old);
-		$tmp->{$old} = tempdir('xapcmd-XXXXXXXX', DIR => $old_parent);
+		my $v = PublicInbox::Search::SCHEMA_VERSION();
+		$tmp->{$old} = tempdir("xapian$v-XXXXXXXX", DIR => $old_parent);
 		push @q, [ $old, $tmp->{$old} ];
 	} else {
 		opendir my $dh, $old or die "Failed to opendir $old: $!\n";
@@ -276,7 +277,7 @@ sub cpdb ($$) {
 			$dst->set_metadata('last_commit', $lc) if $lc;
 
 			# only the first xapian partition (0) gets 'indexlevel'
-			if ($old =~ m!(?:xapian[0-9]+|xap[0-9]+/0)\z!) {
+			if ($new =~ m!(?:xapian[0-9]+|xap[0-9]+/0)\b!) {
 				my $l = $src->get_metadata('indexlevel');
 				if ($l eq 'medium') {
 					$dst->set_metadata('indexlevel', $l);
-- 
EW
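
The second hunk tests the destination path ($new) and ends the
pattern with \b instead of \z. A minimal sketch of how the two
anchors behave follows; both patterns are copied from the diff, while
the paths and the is_first_shard() helper are hypothetical and exist
only for illustration. It assumes the destination may carry a
tempdir()-style suffix after the directory name, as the
"xapian$v-XXXXXXXX" template in the first hunk suggests.

```perl
use strict;
use warnings;

# Both patterns are copied from the diff; only the trailing anchor differs.
my $end_anchored = qr!(?:xapian[0-9]+|xap[0-9]+/0)\z!; # before the patch
my $word_bounded = qr!(?:xapian[0-9]+|xap[0-9]+/0)\b!; # after the patch

# Hypothetical helper: 1 if $path matches $re, else 0.
sub is_first_shard {
	my ($path, $re) = @_;
	return $path =~ $re ? 1 : 0;
}

# Hypothetical destination paths, for illustration only.
my @paths = (
	'/inbox/public-inbox/xapian15-aB3dE9xZ', # v1 temporary directory
	'/inbox/xap15/0',                        # v2, shard 0
	'/inbox/xap15/1',                        # v2, shard 1
);

for my $path (@paths) {
	printf "%-40s \\z=%d \\b=%d\n", $path,
		is_first_shard($path, $end_anchored),
		is_first_shard($path, $word_bounded);
}
```

With \z, a path that ends in a random suffix does not match, so a
temporary destination would not be treated as the first shard; \b
only requires a word boundary after the directory name.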



* [PATCH 0/4] xcpdb: support resharding Xapian DBs
From: Eric Wong @ 2019-06-14  3:03 UTC (permalink / raw)
  To: meta

Defaulting the number of Xapian shards to the number of CPUs
can be detrimental to performance on common storage systems,
which are slow; NVMe speeds are not yet common.

To help public-inbox users recover from this inefficiency while
allowing continuous email archival, we can support arbitrary
resharding to have fewer shards (or more, if doing HW upgrades).

Note: I'm also going to move the documentation towards using the
word "shard" (instead of "partition") to be consistent with
current Xapian documentation (1.4+, and "master").

Xapian 1.2 did not use the word "shard" at all, but IME from my
interactions with non-Xapian search engine folks, the word
"shard" is pretty common.

Eric Wong (4):
  v2writable: use a smaller default for Xapian partitions
  xapcmd: preserve indexlevel based on the destination
  xcpdb: use destination shard as progress prefix
  xcpdb: support resharding v2 repos

 Documentation/public-inbox-xcpdb.pod |  11 ++
 MANIFEST                             |   1 +
 lib/PublicInbox/V2Writable.pm        |  18 ++-
 lib/PublicInbox/Xapcmd.pm            | 222 +++++++++++++++++++++------
 script/public-inbox-xcpdb            |   4 +-
 t/xcpdb-reshard.t                    |  83 ++++++++++
 6 files changed, 286 insertions(+), 53 deletions(-)
 create mode 100644 t/xcpdb-reshard.t

-- 
EW



