user/dev discussion of public-inbox itself
 help / color / Atom feed
From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: Re: how's memory usage on public-inbox-httpd?
Date: Thu, 6 Jun 2019 22:10:09 +0000
Message-ID: <20190606221009.y4fe2e2rervvq3z4@dcvr> (raw)
In-Reply-To: <20190606214509.GA4087@chatter.i7.local>

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Thu, Jun 06, 2019 at 08:37:52PM +0000, Eric Wong wrote:
> > Do you have commit 7d02b9e64455831d3bda20cd2e64e0c15dc07df5?
> > ("view: stop storing all MIME objects on large threads")
> > That was most significant.
> 
> Yes. We're running 743ac758 with a few cherry-picked patches on top of that
> (like epoch roll-over fix).
> 
> > Otherwise it's probably a combination of several things...
> > httpd and nntpd both supports streaming, arbitrarily  large
> > endpoints (all.mbox.gz, and /T/, /t/, /t.mbox.gz threads with
> > thousands of messages, giant NNTP BODY/ARTICLE ranges).
> > 
> > All those endpoints should detect backpressure from a slow
> > client (varnish/nginx in your case) using the ->getline method.
> 
> Wouldn't that spike up and down? The size I'm seeing stays pretty constant
> without any significant changes across requests.

Nope.  That's the thing with glibc malloc not wanting to trim
the heap for good benchmarks.

You could also try starting with MALLOC_MMAP_THRESHOLD_=131072
in env (or some smaller/larger number in bytes) to force it to
use mmap in more cases instead of sbrk.

> > Also, are you only using the default of -W/--worker-process=1
> > on a 16-core machine?  Just checked public-inbox-httpd(8), the
> > -W switch is documented :)  You can use SIGTTIN/TTOU to
> > increase, decrease workers w/o restarting, too.
> 
> D'oh, yes... though it's not been a problem yet. :) I'm not sure I want to
> bump that up, though, if that means we're going to have multiple 19GB-sized
> processes instead of one. :)

You'd probably end up with several smaller processes totalling
up to 19GB.

In any case, killing individual workers with QUIT/INT/TERM is
graceful and won't drop connections if memory use on one goes
awry.

> > Do you have any stats on the number of simultaneous connections
> > public-inbox-httpd/nginx/varnish handles (and logging of that
> > info at peek)?  (perhaps running "ss -tan" periodically)(*)
> 
> We don't collect that info, but I'm not sure it's the number of concurrent
> connections that's the culprit, as there is no fluctuation in RSS size based
> on the number of responses.

Without concurrent connections; I can't see that happening
unless there's a single message which is gigabytes in size.  I'm
already irked that Email::MIME requires slurping entire emails
into memory; but it should not be using more than one
Email::MIME object in memory-at-a-time for a single client.

Anything from varnish/nginx logs can't keep up for some reason?

Come to think of it, nginx proxy buffering might be redundant
and even harmful if varnish is already doing it.

Perhaps "proxy_buffering off" in nginx is worth trying...
I use yahns instead of nginx, which does lazy buffering (but
scary Ruby experimental server warning applies :x).

Last I checked: nginx is either buffer-in-full-before-first-byte
or no buffering at all (which is probably fine with varnish).

> To answer the questions in your follow-up:
> 
> It would appear to be all in anon memory. Mem_usage [1] reports:
> 
> # ./Mem_usage 18275
> Backed by file:
>  Executable                r-x  16668
>  Write/Exec (jump tables)  rwx  0
>  RO data                   r--  106908
>  Data                      rw-  232
>  Unreadable                ---  94072
>  Unknown                        0
> Anonymous:
>  Writable code (stack)     rwx  0
>  Data (malloc, mmap)       rw-  19988892
>  RO data                   r--  0
>  Unreadable                ---  0
>  Unknown                        12
> 
> I've been looking at lsof -p of that process and I see sqlite and xapian
> showing up and disappearing. The lkml ones are being accessed almost all the
> time, but even there I see them showing up with different FD entries, so
> they are being closed and reopened properly.

Yep, that's expected.  It's to better detect DB changes in case
of compact/copydatabase/xcpdb for Xapian.

Might not be necessary strictly necessary for SQLite, but maybe
somebody could be running VACUUM offline; then flock-ing
inbox.lock and rename-ing it into place or something (and
retrying/restarting the VACUUM if out-of-date, seq_lock style).

  reply index

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-01 19:44 Eric Wong
2019-06-06 19:04 ` Konstantin Ryabitsev
2019-06-06 20:37   ` Eric Wong
2019-06-06 21:45     ` Konstantin Ryabitsev
2019-06-06 22:10       ` Eric Wong [this message]
2019-06-06 22:19         ` Konstantin Ryabitsev
2019-06-06 22:29           ` Eric Wong
2019-06-10 10:09             ` [RFC] optionally support glibc malloc_info via SIGCONT Eric Wong
2019-06-09  8:39         ` how's memory usage on public-inbox-httpd? Eric Wong
2019-06-12 17:08           ` Eric Wong
2019-06-06 20:54   ` Eric Wong
2019-10-16 22:10   ` Eric Wong
2019-10-18 19:23     ` Konstantin Ryabitsev
2019-10-19  0:11       ` Eric Wong
2019-10-22 17:28         ` Konstantin Ryabitsev
2019-10-22 19:11           ` Eric Wong
2019-10-28 23:24         ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190606221009.y4fe2e2rervvq3z4@dcvr \
    --to=e@80x24.org \
    --cc=konstantin@linuxfoundation.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git