user/dev discussion of public-inbox itself
* how's memory use? May 2020 edition
@ 2020-05-12  8:37  5% Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-05-12  8:37 UTC (permalink / raw)
  To: meta

Hey all, if possible, I'd like to know the memory use of your
daemons (particularly -httpd), relevant pmap(1) (or equivalent)
output, and version of public-inbox in use.
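
For anyone unsure how to collect the RSS figure on Linux, a minimal
sketch follows. It assumes a /proc filesystem with the usual "VmRSS:"
line in /proc/$PID/status; `vmrss_kb` is a hypothetical helper name,
not something shipped with public-inbox.

```shell
# Print the VmRSS value (in kB) from a /proc/<pid>/status file on stdin.
# vmrss_kb is a hypothetical helper name used only for illustration.
vmrss_kb() {
	awk '$1 == "VmRSS:" { print $2 }'
}
```

Usage would look like `vmrss_kb < /proc/$PID/status`, with $PID being
the PID of the daemon in question.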

I'm primarily a GNU/Linux user, so much of the following is
glibc-specific.  If you use another malloc (or libc) I'd also be
interested to know.  I expect my changes to be sympathetic to
the way all reasonable malloc implementations work.

Over the past few months, my processes have been using less RAM.
With 1.5.0 on 64-bit systems, I don't see -httpd stay at or
above ~80 MB RSS; on 32-bit systems, ~50 MB.  It might spike for
giant messages, but the change to preload encodings[1] seems
to let malloc trim the top of the heap more consistently.

The biggest message in an inbox is still a factor, and I use
this to find the largest blob in a git repo:

    git cat-file --batch-check --batch-all-objects --unordered | \
      awk '$2 == "blob" && $3 > max { max = $3; oid = $1 } END {print oid, max}'

It's usually spam, which won't get served if
"public-inbox-learn spam"-ed away.

Linux-based systems with `procps' installed can use pmap to show
anonymous mappings (not sure about other OSes):

    pmap $PID | grep -w anon

On a "beefy" 64-bit workstation running -httpd, there's only
one giant anonymous region (and several smaller ones probably
not used by malloc):

    00055df38140000  63540K rw---   [ anon ]

Above is for the process which hosts http://czquwvybam4bgbro.onion/

On a lesser VM (still 64-bit) which hosts http://hjrcffqmbrq6wope.onion/,
the heap is split since the lack of space caused sbrk(2) to fail
and forced malloc to use mmap(2) to create a new (sub) heap:

    00005575d3d4a000  30616K rw---   [ anon ]
    00005575d5b30000  13852K rw---   [ anon ]
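
To total the anonymous mappings from pmap output like the above, a
small awk filter may help.  This is only a sketch: `anon_total_kb` is
a hypothetical helper name, it assumes the procps column layout shown
here (size in the second column with a "K" suffix), and it counts
every anonymous mapping, including ones not used by malloc.

```shell
# Sum the sizes (in KiB) of all "[ anon ]" lines in pmap output on stdin.
# anon_total_kb is a hypothetical helper name used only for illustration.
anon_total_kb() {
	awk '/\[ anon \]/ { sub(/K$/, "", $2); total += $2 }
	     END { print total + 0 }'
}
```

Usage would look like `pmap $PID | anon_total_kb`.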

glibc malloc defaults to a sliding (dynamically adjusted) mmap
threshold, so allocations for messages larger than that threshold
are served by mmap(2) directly and won't risk fragmenting the main
heap for their in-memory representation.

For the curious, the Linux mallopt(3) manpage also documents
environment variables which can be used to set a fixed mmap
window, trim threshold, and several other malloc knobs.
However, one of my goals is to get things working as well as
possible out-of-the-box so users won't need to fiddle with
knobs :>
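
For those who do want to experiment, here is a minimal sketch of
setting two of the environment variables documented in mallopt(3).
The values are arbitrary placeholders, not recommendations from this
message, and `with_fixed_malloc` is a hypothetical wrapper name.  Per
the manpage, setting either threshold explicitly also disables
glibc's dynamic adjustment of the mmap threshold.

```shell
# Run a command with a fixed mmap threshold and trim threshold.
# The 131072 (128 KiB) values are illustrative assumptions only;
# with_fixed_malloc is a hypothetical wrapper name.
with_fixed_malloc() {
	env MALLOC_MMAP_THRESHOLD_=131072 \
	    MALLOC_TRIM_THRESHOLD_=131072 \
	    "$@"
}
```

Usage would look like `with_fixed_malloc public-inbox-httpd`, followed
by whatever arguments the daemon is normally started with.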

-nntpd uses significantly less memory than -httpd since it:

1) doesn't split MIME parts
2) doesn't decode quoted-printable or base64
3) doesn't do character set conversions

STARTTLS or NNTPS via OpenSSL requires a significant amount of
per-socket memory, though I'm not sure how many NNTP readers
there are and if they use TLS.


[1] - https://public-inbox.org/meta/20200508015901.GA27432@dcvr/
      ("www: preload: load all encodings at startup")


* [PATCH] www: preload: load all encodings at startup
  @ 2020-05-08  1:59 14%   ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-05-08  1:59 UTC (permalink / raw)
  To: meta

Eric Wong <e@yhbt.net> wrote:
> Eric Wong <e@yhbt.net> wrote:
> > For long-lived daemons, perform immortal allocations as early as
> > possible to reduce the likelihood of heap fragmentation due to
> > mixed-lifetime allocations happening once the process is fully
> > loaded and serving requests, since per-request allocations
> > should all be short-lived.
> 
> Encode also loads lazily...

-----------8<---------
Subject: [PATCH] www: preload: load all encodings at startup

Encode lazy-loads encodings on an as-needed basis.  This is
great for short-lived programs, but leads to fragmentation in
long-lived daemons where immortal allocations can get
interleaved with short-lived, per-request allocations.

Since we have no idea which encodings will be needed when
there's a constant flow of incoming mail, just preload
everything available at startup.
---
 lib/PublicInbox/WWW.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 275e509f..3a428218 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -141,6 +141,12 @@ sub call {
 # fragmentation since common allocators favor a large contiguous heap.
 sub preload {
 	my ($self) = @_;
+
+	# populate caches used by Encode internally, since emails
+	# may show up with any encoding.
+	require Encode;
+	Encode::find_encoding($_) for Encode->encodings(':all');
+
 	require PublicInbox::ExtMsg;
 	require PublicInbox::Feed;
 	require PublicInbox::View;


2020-03-19  8:32     [PATCH 0/6] daemon: reduce fragmentation via preload Eric Wong
2020-04-21  8:52     ` Encode preloading Eric Wong
2020-05-08  1:59 14%   ` [PATCH] www: preload: load all encodings at startup Eric Wong
2020-05-12  8:37  5% how's memory use? May 2020 edition Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
