From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 55FC01F55B;
	Tue, 12 May 2020 08:37:34 +0000 (UTC)
Date: Tue, 12 May 2020 08:37:34 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: how's memory use? May 2020 edition
Message-ID: <20200512083734.GA13688@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
List-Id:

Hey all, if possible, I'd like to know the memory use of your
daemons (particularly -httpd), relevant pmap(1) (or equivalent)
output, and the version of public-inbox in use.

I'm primarily a GNU/Linux user, so much of the following is
glibc-specific.  If you use another malloc (or libc) I'd also be
interested to know.  I expect my changes to be sympathetic to
the way all reasonable malloc implementations work.

Over the past few months, my processes have been using less RAM.
With 1.5.0, I don't see -httpd stay at or above ~80 MB RSS on
64-bit systems, or ~50 MB on 32-bit systems.

It might spike for giant messages, but the change to preload
encodings[1] seems to let malloc trim the top of the heap more
consistently.

The biggest message in an inbox is still a factor, and I use
this to find the largest blob in a git repo:

	git cat-file --batch-check --batch-all-objects --unordered | \
	  awk '$2 == "blob" && $3 > max { max = $3; oid = $1 } END {print oid, max}'

It's usually spam, which won't get served if
"public-inbox-learn spam"-ed away.
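For anyone reporting RSS numbers like the ones above, here is a minimal sketch for Linux, assuming /proc is mounted.  `rss_kb` is a hypothetical helper name and not part of public-inbox; it only extracts the VmRSS field from /proc/PID/status-style text.

```shell
# Print the resident set size (in kB) from /proc/PID/status-style
# text read on stdin.  rss_kb is a hypothetical helper name.
rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }'
}

# Demonstrate on the current shell; substitute a daemon's PID for $$.
if [ -r "/proc/$$/status" ]; then
    printf 'PID %s: %s kB RSS\n' "$$" "$(rss_kb < "/proc/$$/status")"
fi
```

To check a running daemon, feed it that process's status file, e.g. `rss_kb < /proc/$PID/status`.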

Linux-based systems with `procps' installed can use pmap to show
anonymous mappings (not sure about other OSes):

	pmap $PID | grep -w anon

On a "beefy" 64-bit workstation running -httpd, there's only one
giant anonymous region (and several smaller ones probably not
used by malloc):

	00055df38140000 63540K rw--- [ anon ]

Above is for the process which hosts http://czquwvybam4bgbro.onion/

On a lesser VM (still 64-bit) which hosts
http://hjrcffqmbrq6wope.onion/, the heap is split since the lack
of space caused sbrk(2) to fail and forced malloc to use mmap(2)
to create a new (sub) heap:

	00005575d3d4a000 30616K rw--- [ anon ]
	00005575d5b30000 13852K rw--- [ anon ]

glibc malloc defaults to a sliding window for mmap, so messages
which are beyond that window won't risk fragmenting the main
heap for their in-memory representation.

For the curious, the Linux mallopt(3) manpage also documents
environment variables which can be used to set a fixed mmap
window, trim threshold, and several other malloc knobs.

However, one of my goals is to get things working as well as
possible out-of-the-box so users won't need to fiddle with
knobs :>

-nntpd uses significantly less memory than -httpd since it:

1) doesn't split MIME parts
2) doesn't decode quoted-printable or base64
3) doesn't do character set conversions

STARTTLS or NNTPS for OpenSSL requires a significant amount of
per-socket memory, though I'm not sure how many NNTP readers
there are and if they use TLS.

[1] - https://public-inbox.org/meta/20200508015901.GA27432@dcvr/
      ("www: preload: load all encodings at startup")
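
When comparing pmap output between machines, it can help to total the anonymous regions instead of reading them one by one.  The following is a minimal sketch, assuming the procps pmap format shown earlier (size in the second column with a trailing "K").  `sum_anon_kb` is a hypothetical helper name; the sample input is the two lines quoted above from the smaller VM.

```shell
# Sum the sizes of "[ anon ]" mappings from pmap(1) output on stdin.
# sum_anon_kb is a hypothetical helper name, not a procps tool.
sum_anon_kb() {
    awk '/\[ anon \]/ {
        sub(/K$/, "", $2)   # strip the trailing "K" from the size column
        total += $2
        n++
    }
    END { printf "%d regions, %d KB\n", n, total }'
}

# Sample input: the two heap regions quoted from the smaller VM.
sum_anon_kb <<'EOF'
00005575d3d4a000 30616K rw--- [ anon ]
00005575d5b30000 13852K rw--- [ anon ]
EOF
# prints: 2 regions, 44468 KB
```

Against a live process this would be run as `pmap "$PID" | sum_anon_kb`.  The total includes every anonymous mapping, not only those created by malloc.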