author: Eric Wong <e@yhbt.net> 2020-05-17 19:37:21 +0000
committer: Eric Wong <e@yhbt.net> 2020-05-18 02:38:03 +0000
commit: c43813b9138398ed2de06c3616a5932725090ae3
tree: 7c64bf483be47ecf6fa54759670458b1d272fb72 /Documentation
parent: f3482d4a19a8de47199fa18beb258deb699bf703
On powerful systems, having this option is preferable to
XAPIAN_FLUSH_THRESHOLD due to lock granularity and contention
with other processes (-learn, -mda, -watch).

Setting XAPIAN_FLUSH_THRESHOLD can cause -learn, -mda, and
-watch to get stuck until an epoch is completely processed.
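
As a rough illustration of the shard arithmetic documented in the patch below (this is not code from public-inbox itself), the following Python sketch computes how many bytes would be indexed between flushes. The function names `parse_size` and `effective_flush_bytes` are hypothetical, and reading the `k`/`m`/`g` suffixes as powers of 1024 is an assumption.

```python
def parse_size(text: str) -> int:
    """Parse a size such as '1m' or '512k' into bytes.

    Assumption: suffixes are 1024-based (k, m, g), as in git-style
    configuration values. A bare integer is taken as a byte count.
    """
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def effective_flush_bytes(batch_size: str, shards: int = 1) -> int:
    """Bytes indexed between flushes.

    Per the documentation in the patch, v2 inboxes multiply the batch
    size by the number of Xapian shards; pass shards=1 otherwise.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    return parse_size(batch_size) * shards


if __name__ == "__main__":
    # Default batch size of 1m on a typical v2 inbox with 3 shards.
    print(effective_flush_bytes("1m", shards=3))
```

Under these assumptions, the default `1m` on a 3-shard v2 inbox works out to a flush every 3 megabytes, matching the figure given in the documentation. Raising the batch size scales the interval linearly, and with it both memory use and the time other writers may wait for the lock.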
Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/public-inbox-index.pod | 42
1 files changed, 38 insertions, 4 deletions
diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod
index 8a37580c..5be3c897 100644
--- a/Documentation/public-inbox-index.pod
+++ b/Documentation/public-inbox-index.pod
@@ -56,6 +56,10 @@ Xapian database.  Using this with C<--compact> or running
 L<public-inbox-compact(1)> afterwards is recommended to
 release free space.
 
+public-inbox protects writes to various indices with L<flock(2)>,
+so it is safe to reindex while L<public-inbox-watch(1)>,
+L<public-inbox-mda(1)> or L<public-inbox-learn(1)> run.
+
 This does not touch the NNTP article number database or
 affect threading.
 
@@ -72,6 +76,12 @@ Sets or overrides L</publicinbox.indexMaxSize> on a
 per-invocation basis.  See L</publicinbox.indexMaxSize>
 below.
 
+=item --batch-size SIZE
+
+Sets or overrides L</publicinbox.indexBatchSize> on a
+per-invocation basis.  See L</publicinbox.indexBatchSize>
+below.
+
 =back
 
 =head1 FILES
@@ -98,6 +108,27 @@ This option is only available in public-inbox 1.5 or later.
 
 Default: none
 
+=item publicinbox.indexBatchSize
+
+Flushes changes to the filesystem and releases locks after
+indexing the given number of bytes.  The default value of C<1m>
+(one megabyte) is low to minimize memory use and reduce
+contention with parallel invocations of L<public-inbox-mda(1)>,
+L<public-inbox-learn(1)>, and L<public-inbox-watch(1)>.
+
+Increase this value on powerful systems to improve throughput at
+the expense of memory use.  The reduction of lock granularity
+may not be noticeable on fast systems.
+
+This option is available in public-inbox 1.6 or later.
+public-inbox 1.5 and earlier used the current default, C<1m>.
+
+For L<public-inbox-v2-format(5)> inboxes, this value is
+multiplied by the number of Xapian shards.  Thus a typical v2
+inbox with 3 shards will flush every 3 megabytes by default.
+
+Default: 1m (one megabyte)
+
 =back
 
 =head1 ENVIRONMENT
@@ -114,10 +145,13 @@ The number of documents to update before committing changes to
 disk.  This environment is handled directly by Xapian, refer to
 Xapian API documentation for more details.
 
-Default: our indexing code flushes every megabyte of mail seen
-to keep memory usage low.  Setting this environment variable to
-any positive value will switch to a document count-based
-threshold in Xapian.
+For public-inbox 1.6 and later, use C<publicinbox.indexBatchSize>
+instead.  Setting C<XAPIAN_FLUSH_THRESHOLD> for a large C<--reindex>
+may cause L<public-inbox-mda(1)>, L<public-inbox-learn(1)> and
+L<public-inbox-watch(1)> tasks to wait long periods of time
+during C<--reindex>.
+
+Default: none, uses C<publicinbox.indexBatchSize>
 
 =back