From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id AF5A71F462;
	Thu, 6 Jun 2019 22:29:41 +0000 (UTC)
Date: Thu, 6 Jun 2019 22:29:38 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: Re: how's memory usage on public-inbox-httpd?
Message-ID: <20190606222938.y7nt3uankntkktly@dcvr>
References: <20181201194429.d5aldesjkb56il5c@dcvr>
	<20190606190455.GA17362@chatter.i7.local>
	<20190606203752.7wpdla5ynemjlshs@dcvr>
	<20190606214509.GA4087@chatter.i7.local>
	<20190606221009.y4fe2e2rervvq3z4@dcvr>
	<20190606221904.GB4087@chatter.i7.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20190606221904.GB4087@chatter.i7.local>
List-Id:

Konstantin Ryabitsev wrote:
> On Thu, Jun 06, 2019 at 10:10:09PM +0000, Eric Wong wrote:
> > > > All those endpoints should detect backpressure from a slow
> > > > client (varnish/nginx in your case) using the ->getline method.
> > >
> > > Wouldn't that spike up and down? The size I'm seeing stays pretty constant
> > > without any significant changes across requests.
> >
> > Nope. That's the thing with glibc malloc not wanting to trim
> > the heap for good benchmarks.
> >
> > You could also try starting with MALLOC_MMAP_THRESHOLD_=131072
> > in env (or some smaller/larger number in bytes) to force it to
> > use mmap in more cases instead of sbrk.
>
> I've restarted the process and I'm running pmap -x $PID | tail -1 on it once
> a minute. I'll try to collect this data for a while and see if I can notice
> significant increases and correlate that with access logs.
> From the first few minutes of running I see:
>
> Thu Jun 6 22:06:03 UTC 2019
> total kB 298160 102744 96836
> Thu Jun 6 22:07:03 UTC 2019
> total kB 355884 154968 147664
> Thu Jun 6 22:08:03 UTC 2019
> total kB 355884 154980 147664
> Thu Jun 6 22:09:03 UTC 2019
> total kB 359976 156788 148336
> Thu Jun 6 22:10:03 UTC 2019
> total kB 359976 156788 148336
> Thu Jun 6 22:11:03 UTC 2019
> total kB 359976 156788 148336
> Thu Jun 6 22:12:03 UTC 2019
> total kB 365464 166612 158160
> Thu Jun 6 22:13:03 UTC 2019
> total kB 366884 167908 159456
> Thu Jun 6 22:14:03 UTC 2019
> total kB 366884 167908 159456
> Thu Jun 6 22:15:03 UTC 2019
> total kB 366884 167908 159456

Would also be good to correlate that to open sockets, too.
(168M is probably normal for 64-bit, I'm still on 32-bit and
it's <100M). I'm not happy with that memory use, even; but it's
better than gigabytes.

> > Without concurrent connections; I can't see that happening
> > unless there's a single message which is gigabytes in size. I'm
> > already irked that Email::MIME requires slurping entire emails
> > into memory; but it should not be using more than one
> > Email::MIME object in memory-at-a-time for a single client.
> >
> > Anything from varnish/nginx logs can't keep up for some reason?
>
> Speaking of logs, I did notice that even though we're passing -1
> /var/log/public-inbox/httpd.out.log, that file stays empty. There's
> nntpd.out.log there, which is non-empty, so that's curious:
>
> # ls -ahl
> total 2.6M
> drwx------.  2 publicinbox publicinbox  177 Jun  6 22:05 .
> drwxr-xr-x. 21 root        root        4.0K Jun  2 03:12 ..
> -rw-r--r--.  1 publicinbox publicinbox    0 Jun  6 22:05 httpd.out.log
> -rw-r--r--.  1 publicinbox publicinbox 422K Jun  6 22:04 nntpd.out.log
> -rw-r--r--.  1 publicinbox publicinbox 771K May 12 01:02 nntpd.out.log-20190512.gz
> -rw-r--r--.  1 publicinbox publicinbox 271K May 19 03:45 nntpd.out.log-20190519.gz
> -rw-r--r--.  1 publicinbox publicinbox  86K May 25 22:23 nntpd.out.log-20190526.gz
> -rw-r--r--.  1 publicinbox publicinbox 1.1M Jun  2 00:52 nntpd.out.log-20190602
>
> Could it be that stdout is not being written out and is just perpetually
> buffered? That could explain the ever-growing size.

There's no HTTP access logging by default.
AccessLog::Timed is commented out in examples/public-inbox.psgi;
and the example uses syswrite, even.

Also, PublicInbox::Daemon definitely enables autoflush on STDOUT.
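
The once-a-minute sampling discussed above, extended to record the
open-socket count next to the memory totals, might look roughly like the
Python sketch below. It is only an illustration under assumptions: a
Linux host with /proc, `pmap` from procps on PATH, and permission to
read the target's /proc/PID/fd. The function names (parse_pmap_total,
count_open_sockets, sample, monitor) are hypothetical helpers and not
part of public-inbox.

```python
import os
import subprocess
import time


def parse_pmap_total(line):
    """Parse the last line of `pmap -x PID`, e.g. 'total kB 298160 102744 96836'.

    Returns a (size_kb, rss_kb, dirty_kb) tuple of ints.
    """
    fields = line.split()
    if len(fields) != 5 or fields[0] != "total" or fields[1] != "kB":
        raise ValueError("not a pmap -x total line: %r" % (line,))
    size_kb, rss_kb, dirty_kb = (int(f) for f in fields[2:])
    return size_kb, rss_kb, dirty_kb


def count_open_sockets(pid):
    """Count the file descriptors of `pid` that are sockets (Linux /proc only)."""
    fd_dir = "/proc/%d/fd" % pid
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        if target.startswith("socket:"):
            count += 1
    return count


def sample(pid):
    """Return (rss_kb, open_sockets) for `pid` at this instant."""
    out = subprocess.run(
        ["pmap", "-x", str(pid)],
        check=True, capture_output=True, text=True,
    ).stdout
    _size_kb, rss_kb, _dirty_kb = parse_pmap_total(out.strip().splitlines()[-1])
    return rss_kb, count_open_sockets(pid)


def monitor(pid, interval=60):
    """Print one timestamped sample per `interval` seconds until interrupted."""
    while True:
        rss_kb, sockets = sample(pid)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
        print("%s rss_kb=%d sockets=%d" % (stamp, rss_kb, sockets), flush=True)
        time.sleep(interval)
```

Calling monitor(pid) with the PID of the daemon and redirecting the
output to a file gives one line per minute that can be lined up against
the varnish/nginx access logs by timestamp.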