PUBLIC-INBOX-TUNING(7) public-inbox user manual PUBLIC-INBOX-TUNING(7)
NAME
public-inbox-tuning - tuning public-inbox
DESCRIPTION
public-inbox intends to support a wide variety of hardware. While we
strive to provide the best out-of-the-box performance possible, tuning
knobs are an unfortunate necessity in some cases.
1. New inboxes: public-inbox-init -V2
2. Optional Inline::C use
3. Performance on rotational hard disk drives
4. Btrfs (and possibly other copy-on-write filesystems)
5. Performance on solid state drives
6. Read-only daemons
7. Other OS tuning knobs
8. Scalability to many inboxes
9. public-inbox-cindex --join performance
10. public-inbox-clone with shared object stores
New inboxes: public-inbox-init -V2
If you're starting a new inbox (and not mirroring an existing one), the
"-V2" format requires DBD::SQLite, but is orders of magnitude more
scalable than the original "-V1" format.
Optional Inline::C use
Our optional use of Inline::C speeds up subprocess spawning from large
daemon processes.
To enable Inline::C, either set the "PERL_INLINE_DIRECTORY" environment
variable to point to a writable directory, or create
"~/.cache/public-inbox/inline-c" for any user(s) running public-inbox
processes.
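A minimal sketch of the second option, assuming a POSIX shell run as
the user that starts the public-inbox processes. The "inline_dir"
variable name is only for illustration.

```shell
# Default Inline::C cache location named above; run this as each user
# that starts public-inbox processes.
inline_dir="$HOME/.cache/public-inbox/inline-c"
mkdir -p "$inline_dir"

# Inline::C must be able to write compiled objects here.
if [ -w "$inline_dir" ]; then
    echo "Inline::C cache ready: $inline_dir"
else
    echo "cannot write to $inline_dir" >&2
fi
```

Setting "PERL_INLINE_DIRECTORY" to another writable directory in the
environment of the process is the alternative when the home directory
is not suitable.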
If libgit2 development files are installed and Inline::C is enabled
(described above), per-inbox "git cat-file --batch" processes are
replaced with a single perl(1) process running
"PublicInbox::Lg2::gcf2_loop" in read-only daemons. libgit2 use is
available in public-inbox 1.7.0+.
More (optional) Inline::C use will be introduced in the future to lower
memory use and improve scalability.
Note: Inline::C is required for lei(1), but not for public-inbox-*
commands.
Performance on rotational hard disk drives
Random I/O performance is poor on rotational HDDs. Xapian indexing
performance degrades significantly as DBs grow larger than available
RAM. Attempts to parallelize random I/O on HDDs lead to pathological
slowdowns as inboxes grow.
While "-V2" introduced Xapian shards as a parallelization mechanism for
SSDs, enabling "publicInbox.indexSequentialShard" repurposes sharding
as a mechanism to reduce the kernel page cache footprint when indexing
on HDDs.
Initializing a mirror with a high "--jobs" count to create more shards
(in "-V2" inboxes) will keep each shard smaller and reduce its kernel
page cache footprint. Keep in mind excessive sharding imposes a
performance penalty for read-only queries.
Users with large amounts of RAM are advised to set a large value for
"publicInbox.indexBatchSize" as documented in public-inbox-index(1).
"dm-crypt" users on Linux 4.0+ are advised to try the
"--perf-same_cpu_crypt" "--perf-submit_from_crypt_cpus" switches of
cryptsetup(8) to reduce I/O contention from kernel workqueue threads.
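The two "publicInbox" settings named in this section are ordinary
git-config(1) keys. The following is a sketch only, assuming git(1) is
installed and that the public-inbox config file is at the default
"~/.public-inbox/config" unless "PI_CONFIG" points elsewhere. The
"pi_config" variable name and the 1 GiB batch size are illustrative,
not recommendations.

```shell
# Config file used by public-inbox; PI_CONFIG overrides the default path.
pi_config="${PI_CONFIG:-$HOME/.public-inbox/config}"
mkdir -p "$(dirname "$pi_config")"

# Index shards one at a time to limit page cache use on HDDs.
git config --file "$pi_config" publicInbox.indexSequentialShard true

# Illustrative batch size of 1 GiB, in bytes; size this to available RAM.
git config --file "$pi_config" publicInbox.indexBatchSize 1073741824

# Show what was written (git prints key names in lowercase).
git config --file "$pi_config" --get-regexp '^publicinbox\.index'
```

Both keys can be removed again with "git config --file ... --unset" if
indexing moves to an SSD.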
Btrfs (and possibly other copy-on-write filesystems)
btrfs(5) performance degrades from fragmentation when using large
databases and random writes. The Xapian + SQLite indices used by
public-inbox are no exception to that.
public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite
indices on btrfs to achieve acceptable performance (even on SSD).
Disabling copy-on-write also disables checksumming, thus "raid1" (or
higher) configurations may be corrupt after unsafe shutdowns.
Fortunately, these SQLite and Xapian indices are designed to be
recoverable from git if missing.
Disabling CoW does not prevent all fragmentation. Large values of
"publicInbox.indexBatchSize" also limit fragmentation during the
initial index.
Avoid snapshotting subvolumes containing Xapian and/or SQLite indices.
Snapshots use CoW despite our efforts to disable it, resulting in
fragmentation.
filefrag(8) can be used to monitor fragmentation, and "btrfs filesystem
defragment -fr $INBOX_DIR" may be necessary.
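To check fragmentation before deciding whether to defragment, the index
files can be listed and passed to filefrag(8). The sketch below assumes
the SQLite files end in ".sqlite3" and that Xapian uses its glass
backend (files ending in ".glass"); both suffixes and the
"list_index_files" helper are assumptions for illustration.

```shell
# Print the SQLite and Xapian (glass backend) files under a directory.
# The "*.glass" suffix is an assumption about the Xapian backend in use.
list_index_files() {
    find "$1" -type f \( -name '*.sqlite3' -o -name '*.glass' \)
}

# Report extent counts when INBOX_DIR is set and filefrag(8) is installed.
if [ -n "${INBOX_DIR:-}" ] && command -v filefrag >/dev/null 2>&1; then
    list_index_files "$INBOX_DIR" | while IFS= read -r f; do
        filefrag "$f"
    done
fi
```

A steadily growing extent count for the same file between runs suggests
that defragmentation may be worthwhile.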
Large filesystems benefit significantly from the "space_cache=v2" mount
option documented in btrfs(5).
Older, non-CoW filesystems generally work well out of the box for our
Xapian and SQLite indices.
Performance on solid state drives
While SSD read performance is generally good, SSD write performance
degrades as the drive ages and/or gets full. Issuing "TRIM" commands
via fstrim(8) or similar is required to sustain write performance.
Users of the Flash-Friendly File System (F2FS)
<https://en.wikipedia.org/wiki/F2FS> may benefit from optimizations
found in SQLite 3.21.0+. Benchmarks are greatly appreciated.
Read-only daemons
public-inbox-httpd(1), public-inbox-imapd(1), and public-inbox-nntpd(1)
are all designed for C10K (or higher) levels of concurrency from a
single process. SMP systems may use "--worker-processes=NUM" as
documented in public-inbox-daemon(8) for parallelism.
The open file descriptor limit ("RLIMIT_NOFILE", "ulimit -n" in sh(1),
"LimitNOFILE=" in systemd.exec(5)) may need to be raised to accommodate
many concurrent clients.
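A quick way to see the limits a daemon would inherit from the current
shell is sketched below. The target of 65536 descriptors and the
variable names are hypothetical; pick a target from the expected number
of clients.

```shell
# Hypothetical target; size it to the number of expected clients.
want_nofile=65536

soft_nofile=$(ulimit -Sn)
hard_nofile=$(ulimit -Hn)
echo "RLIMIT_NOFILE soft=$soft_nofile hard=$hard_nofile"

# "unlimited" is not a number, so test for it before comparing.
if [ "$soft_nofile" != unlimited ] && [ "$soft_nofile" -lt "$want_nofile" ]; then
    echo "soft limit is below $want_nofile; raise it before starting daemons" >&2
fi
```

Under systemd, the shell limits do not apply; "LimitNOFILE=" in the
unit file is what the daemon sees.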
Transport Layer Security (HTTPS, IMAPS, NNTPS, or via STARTTLS)
significantly increases memory use of client sockets; be sure to
account for that in capacity planning. Switching
<https://public-inbox.org/> to a 64-bit userspace reduced libcrypto.so
(OpenSSL) CPU use by ~70%.
Bursts of small object allocations late in process life contribute to
fragmentation of the heap due to arenas (slabs) used internally by
Perl. glibc malloc users should use "MALLOC_MMAP_THRESHOLD_=131072" to
reduce fragmentation from the sliding mmap window. On 64-bit systems,
jemalloc (tested as an LD_PRELOAD on GNU/Linux) reduces fragmentation
at the expense of VM space. 32-bit systems may be better off sticking
with glibc and MALLOC_MMAP_THRESHOLD_.
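The glibc setting is read from the environment at process start, so it
must be exported before the daemon is executed. A minimal sketch,
assuming the daemon is started from a shell wrapper:

```shell
# glibc reads this at process start; note the trailing underscore.
export MALLOC_MMAP_THRESHOLD_=131072

# Child processes inherit the exported value.
sh -c 'echo "child sees MALLOC_MMAP_THRESHOLD_=$MALLOC_MMAP_THRESHOLD_"'

# A daemon started from this point on inherits it as well, e.g.:
#   exec public-inbox-httpd --worker-processes=NUM
```

Changing the variable in an already running process has no effect; the
daemon needs to be restarted.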
Other OS tuning knobs
Linux users: the "vm.max_map_count" sysctl may need to be increased
if handling thousands of inboxes (with public-inbox-extindex(1)) to
avoid out-of-memory errors from git.
Other OSes may have similar tuning knobs (patches appreciated).
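On Linux, the current limit can be read from
"/proc/sys/vm/max_map_count", which corresponds to the
"vm.max_map_count" key of sysctl(8). The sketch below only reads the
value; the number in the trailing comment is an illustration, not a
recommendation.

```shell
map_count_file=/proc/sys/vm/max_map_count
if [ -r "$map_count_file" ]; then
    max_map_count=$(cat "$map_count_file")
    echo "vm.max_map_count is currently $max_map_count"
else
    echo "$map_count_file not readable; probably not Linux" >&2
fi

# Raising it requires root; the value below is only an illustration:
#   sysctl -w vm.max_map_count=262144
```

A value set with "sysctl -w" does not survive a reboot; a file under
"/etc/sysctl.d/" makes it persistent on most distributions.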
Scalability to many inboxes
public-inbox-extindex(1) allows any number of public-inboxes to share
the same Xapian indices.
git 2.33+ startup time is orders of magnitude faster and uses less
memory when dealing with thousands of alternates required for thousands
of inboxes with public-inbox-extindex(1).
Frequent packing (via git-gc(1)) both improves performance and reduces
the need to increase "vm.max_map_count".
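Since this manual names minimum git versions (2.33 here, and 2.41 for
public-inbox-clone with shared object stores), a deployment script may
want to check the installed git. The "version_ge" helper below is
hypothetical and compares only the MAJOR.MINOR part of the version.

```shell
# Succeeds when MAJOR.MINOR of $1 is at least MAJOR.MINOR of $2.
version_ge() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

# "git --version" prints "git version X.Y.Z"; the third field is X.Y.Z.
if command -v git >/dev/null 2>&1; then
    git_version=$(git --version | awk '{print $3}')
    for want in 2.33 2.41; do
        if version_ge "$git_version" "$want"; then
            echo "git $git_version satisfies $want+"
        else
            echo "git $git_version is older than $want" >&2
        fi
    done
fi
```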
public-inbox-cindex --join performance
A C++ compiler and the Xapian development files make "--join" or
"--join=aggressive" orders of magnitude faster in
public-inbox-cindex(1). On Debian-based systems this is
"libxapian-dev". RPM-based distros have these in "xapian-core-devel"
or "xapian14-core-libs". *BSDs typically package development files
together with runtime libraries, so the "xapian" or "xapian-core"
package will already have the development files.
public-inbox-clone with shared object stores
When mirroring manifests with many forks using the same objstore, git
2.41+ is highly recommended for performance as we automatically use the
"fetch.hideRefs" feature to speed up negotiation.
CONTACT
Feedback encouraged via plain-text mail to
<mailto:meta@public-inbox.org>
Information for *BSDs and non-traditional filesystems especially
welcome.
Our archives are hosted at <https://public-inbox.org/meta/>,
<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>,
and other places.
COPYRIGHT
Copyright all contributors <mailto:meta@public-inbox.org>
License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
public-inbox.git 1993-10-02 PUBLIC-INBOX-TUNING(7)