From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
  shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
  by dcvr.yhbt.net (Postfix) with ESMTP id 7873E1F609
  for ; Sat, 15 Jun 2019 08:47:16 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 01/20] doc: rename our Xapian "partitions" to "shards"
Date: Sat, 15 Jun 2019 08:46:57 +0000
Message-Id: <20190615084716.3075-2-e@80x24.org>
In-Reply-To: <20190615084716.3075-1-e@80x24.org>
References: <20190615084716.3075-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

For consistency with Xapian documentation (in the "master" branch).
---
 Documentation/public-inbox-v2-format.pod | 10 +++++-----
 Documentation/public-inbox-xcpdb.pod     | 11 +++++------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/Documentation/public-inbox-v2-format.pod b/Documentation/public-inbox-v2-format.pod
index bdfe7ab..28d3550 100644
--- a/Documentation/public-inbox-v2-format.pod
+++ b/Documentation/public-inbox-v2-format.pod
@@ -16,7 +16,7 @@ Message-IDs.
 The key change in v2 is the inbox is no longer a bare git
 repository, but a directory with two or more git repositories.
 v2 divides git repositories by time "epochs" and Xapian
-databases for parallelism by "partitions".
+databases for parallelism by "shards".

 =head2 INBOX OVERVIEW AND DEFINITIONS

@@ -28,7 +28,7 @@ foo/ # assuming "foo" is the name of the list
 - inbox.lock # lock file (flock) to protect global state
 - git/$EPOCH.git # normal git repositories
 - all.git # empty git repo, alternates to git/$EPOCH.git
-- xap$SCHEMA_VERSION/$PART # per-partition Xapian DB
+- xap$SCHEMA_VERSION/$SHARD # per-shard Xapian DB
 - xap$SCHEMA_VERSION/over.sqlite3 # OVER-view DB for NNTP and threading
 - msgmap.sqlite3 # same the v1 msgmap

@@ -95,16 +95,16 @@ are documented at:

 L

-=head2 XAPIAN PARTITIONS
+=head2 XAPIAN SHARDS

 Another second scalability problem in v1 was the inability to
 utilize multiple CPU cores for Xapian indexing. This is
-addressed by using partitions in Xapian to perform import
+addressed by using shards in Xapian to perform import
 indexing in parallel.

 As with git alternates, Xapian natively supports a read-only
 interface which transparently abstracts away the knowledge of
-multiple partitions. This allows us to simplify our read-only
+multiple shards. This allows us to simplify our read-only
 code paths.

 The performance of the storage device is now the bottleneck on
diff --git a/Documentation/public-inbox-xcpdb.pod b/Documentation/public-inbox-xcpdb.pod
index fd8770a..a13c4ef 100644
--- a/Documentation/public-inbox-xcpdb.pod
+++ b/Documentation/public-inbox-xcpdb.pod
@@ -21,7 +21,7 @@ L or L.
 =item --compact

 In addition to performing the copy operation, run L
-on each Xapian partition after copying but before finalizing it.
+on each Xapian shard after copying but before finalizing it.
 Compared to the cost of copying a Xapian database, compacting
 a Xapian database takes only around 5% of the time required
 to copy.
@@ -32,14 +32,13 @@ the compaction to take hours at-a-time.

 =item --reshard=N / -R N

-Repartition the Xapian database on a L
-inbox to C partitions. Since L is not suitable
-for merging, users can rely on this switch to repartition the
+Reshard the Xapian database on a L
+inbox to C shards. Since L is not suitable
+for merging, users can rely on this switch to reshard the
 existing Xapian database(s) to any positive value of C.

 This is useful in case the Xapian DB was created with too few or
-too many partitions given the capabilities of the current
-hardware.
+too many shards given the capabilities of the current hardware.

 =item --blocksize / --no-full / --fuller

--
EW