From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id D964F1F462;
	Wed, 12 Jun 2019 16:50:31 +0000 (UTC)
Date: Wed, 12 Jun 2019 16:50:31 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: [RFC] v2writable: use a smaller default for Xapian partitions
Message-ID: <20190612165031.t2x3rnwoyr6cnf7m@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
List-Id:

Apparently 16 CPUs (probably HT) and SATA storage are common
these days. Having excessive Xapian partitions leads to
contention and excessive FD/space use. So set a smaller
default but continue allowing user-specified values to
bump this up.
---
  I noticed korg had lots of partitions, which seems like
  overkill and wastes FDs, at least. Repartitioning will
  be a different step.

 lib/PublicInbox/V2Writable.pm | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index a8c33ef..c504651 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -23,7 +23,14 @@ use IO::Handle;
 # an estimate of the post-packed size to the raw uncompressed size
 my $PACKING_FACTOR = 0.4;
 
-# assume 2 cores if GNU nproc(1) is not available
+# SATA storage lags behind what CPUs are capable of, so relying on
+# nproc(1) can be misleading and having extra Xapian partitions is a
+# waste of FDs and space. It can also lead to excessive IO latency
+# and slow things down. Users on NVMe or other fast storage can
+# use the NPROC env or switches in our script/public-inbox-* programs
+# to increase Xapian partitions.
+our $NPROC_MAX_DEFAULT = 4;
+
 sub nproc_parts ($) {
 	my ($creat_opt) = @_;
 	if (ref($creat_opt) eq 'HASH') {
@@ -32,7 +39,14 @@ sub nproc_parts ($) {
 		}
 	}
 
-	my $n = int($ENV{NPROC} || `nproc 2>/dev/null` || 2);
+	my $n = $ENV{NPROC};
+	if (!$n) {
+		chomp($n = `nproc 2>/dev/null`);
+		# assume 2 cores if GNU nproc(1) is not available
+		$n = 2 if !$n;
+		$n = $NPROC_MAX_DEFAULT if $n > $NPROC_MAX_DEFAULT;
+	}
+
 	# subtract for the main process and git-fast-import
 	$n -= 1;
 	$n < 1 ? 1 : $n;
-- 
EW
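
For anybody wanting to check the intended arithmetic outside of
V2Writable.pm, here is a rough standalone sketch. It assumes the cap
is meant to apply to the detected CPU count only when NPROC is unset.
parts_for() and its arguments are hypothetical names for illustration
and are not part of public-inbox.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical standalone helper, not part of V2Writable.pm.
# $env_nproc: value of the NPROC env (may be undef or empty)
# $detected:  chomped output of nproc(1) (may be undef or empty)
# $max:       cap applied only when NPROC is unset (assumption)
sub parts_for {
	my ($env_nproc, $detected, $max) = @_;
	my $n = $env_nproc;
	if (!$n) {
		# assume 2 cores if nproc(1) gave us nothing
		$n = $detected || 2;
		$n = $max if $n > $max;
	}
	# subtract one, as nproc_parts does
	$n -= 1;
	$n < 1 ? 1 : $n;
}

printf("16 CPUs, NPROC unset: %d\n", parts_for(undef, 16, 4));
printf("16 CPUs, NPROC=16:    %d\n", parts_for(16, 16, 4));
printf("no nproc(1):          %d\n", parts_for(undef, '', 4));
```

With a cap of 4, a 16-CPU box with NPROC unset ends up at 3, while
an explicit NPROC=16 is left uncapped and gives 15.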