From c43813b9138398ed2de06c3616a5932725090ae3 Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Sun, 17 May 2020 19:37:21 +0000
Subject: index: add --batch-size=SIZE option

On powerful systems, having this option is preferable to
XAPIAN_FLUSH_THRESHOLD due to lock granularity and contention
with other processes (-learn, -mda, -watch).

Setting XAPIAN_FLUSH_THRESHOLD can cause -learn, -mda, and
-watch to get stuck until an epoch is completely processed.
---
 Documentation/public-inbox-index.pod | 42 ++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

(limited to 'Documentation/public-inbox-index.pod')

diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod
index 8a37580c..5be3c897 100644
--- a/Documentation/public-inbox-index.pod
+++ b/Documentation/public-inbox-index.pod
@@ -56,6 +56,10 @@ Xapian database. Using this with C<--compact> or running
 L afterwards is recommended to
 release free space.

+public-inbox protects writes to various indices with L,
+so it is safe to reindex while L,
+L or L run.
+
 This does not touch the NNTP article number database or affect
 threading.

@@ -72,6 +76,12 @@ Sets or overrides L on a
 per-invocation basis. See L
 below.

+=item --batch-size SIZE
+
+Sets or overrides L on a
+per-invocation basis. See L
+below.
+
 =back

 =head1 FILES
@@ -98,6 +108,27 @@ This option is only available in public-inbox 1.5 or later.

 Default: none

+=item publicinbox.indexBatchSize
+
+Flushes changes to the filesystem and releases locks after
+indexing the given number of bytes. The default value of C<1m>
+(one megabyte) is low to minimize memory use and reduce
+contention with parallel invocations of L,
+L, and L.
+
+Increase this value on powerful systems to improve throughput at
+the expense of memory use. The reduction of lock granularity
+may not be noticeable on fast systems.
+
+This option is available in public-inbox 1.6 or later.
+public-inbox 1.5 and earlier used the current default, C<1m>.
+
+For L inboxes, this value is
+multiplied by the number of Xapian shards. Thus a typical v2
+inbox with 3 shards will flush every 3 megabytes by default.
+
+Default: 1m (one megabyte)
+
 =back

 =head1 ENVIRONMENT
@@ -114,10 +145,13 @@ The number of documents to update before committing changes to
 disk. This environment is handled directly by Xapian, refer to
 Xapian API documentation for more details.

-Default: our indexing code flushes every megabyte of mail seen
-to keep memory usage low. Setting this environment variable to
-any positive value will switch to a document count-based
-threshold in Xapian.
+For public-inbox 1.6 and later, use C
+instead. Setting C for a large C<--reindex>
+may cause L, L and
+L tasks to wait long periods of time
+during C<--reindex>.
+
+Default: none, uses C

 =back
-- 
cgit v1.2.3-24-ge0c7
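
Reviewer note: the shard arithmetic documented for publicinbox.indexBatchSize in the patch above can be checked with a small sketch. This is an illustration only, not code from public-inbox: the helper names parse_size and effective_flush_bytes are hypothetical, and the sketch assumes that size suffixes k, m and g are 1024-based, which the patch does not state (it only says "one megabyte").

```python
# Illustrative sketch only; these helpers are hypothetical and are not part
# of public-inbox. Assumption: k/m/g suffixes are 1024-based multipliers.
_SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as '1m' or '512k' to a byte count."""
    text = value.strip().lower()
    if not text:
        raise ValueError("empty size")
    if text[-1] in _SUFFIXES:
        # int() raises ValueError if the numeric part is missing or invalid.
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)


def effective_flush_bytes(batch_size: str, shards: int) -> int:
    """Bytes indexed between flushes when the batch size is multiplied
    by the number of Xapian shards, as the patch describes for v2 inboxes."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    return parse_size(batch_size) * shards


if __name__ == "__main__":
    # Default of 1m with the "typical" 3 shards mentioned in the patch.
    print(effective_flush_bytes("1m", 3))
```

Under these assumptions, the default 1m with 3 shards gives a flush interval of 3 megabytes, matching the figure in the documentation; raising --batch-size scales that interval by the same shard multiplier.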