Date: Fri, 13 Sep 2019 03:12:12 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: Re: httpd 502s [was: trying to figure out 100% CPU usage in nntpd...]
Message-ID: <20190913031212.dkbtvj3ij6qqm6im@dcvr>
References: <20190910083820.GA8018@pure.paranoia.local>
 <20190910181224.urhyoo6av7mhjs67@dcvr>
 <20190911022215.GA309@dcvr>
 <20190911102436.GA21959@pure.paranoia.local>
 <20190911171250.vqqpaeb7sn34hv3s@dcvr>
 <20190911173628.GA14147@pure.paranoia.local>
 <20190912000541.gikjoimbdeahh7lx@whir>
 <20190912024942.7dejzf47swpzkdux@dcvr>
 <20190912083503.GA10657@dcvr>
 <20190912113758.GB29277@pure.paranoia.local>
In-Reply-To: <20190912113758.GB29277@pure.paranoia.local>

Konstantin Ryabitsev wrote:
> On Thu, Sep 12, 2019 at 08:35:03AM +0000, Eric Wong wrote:
> > Eric Wong wrote:
> > > One more thing, are you running any extra middlewares in the
> > > .psgi file?  Thanks.
>
> No, it's just vanilla what comes with the source.

OK, and Perl 5.16.3 from CentOS 7?  (4:5.16.3-294.el7_6 RPM)

> > That's probably not it, I suspected the non-fileno path was
> > being hit, but I just tested a debug change on top of
> > b7cfd5aeff4b6b316b61b327af9c144776d77225 (branch: "unlink")
> > ("tmpfile: support O_APPEND and use it in DS::tmpio")
> > to fake the presence of a middleware wrapping psgi.input.
>
> I sent you a dump of lsof -p of all 4 processes after about 20 minutes
> of running. For another data point, the daemon was running in
> SELinux-permissive mode, to make sure unlinks aren't failing because of
> any permission errors.

It looks like there's a Perl reference leak (cycle) of some
sort holding on to FDs, since you have lots of input files and
pipes, yet only one established IPv4 connection.  And the inodes
encoded into the filenames don't point to the connected socket,
even....

However, I'm not able to reproduce it on my CentOS 7 VM which
has nginx 1.12.2.  I don't think nginx is a factor in this
since public-inbox-httpd is clearly not holding TCP sockets
open, even.

Not at all familiar with SELinux, but I'm just using the
defaults CentOS comes with and running both nginx +
public-inbox-httpd as a regular user.

That "if (0..." GitHTTPBackend patch definitely isn't needed
for testing anymore and only makes FD exhaustion happen sooner.

> Let me know if you would like any further info.

If there's a reference leak somewhere, this could also be part
of the high memory use you showed us a few months ago.  Dunno
if you had many FDs back then.

I could see about adding explicit close() calls in a few
places, but that would make a corresponding memory leak
harder-to-notice, even.

I pushed out two patches to the "unlink" branch which may be
able to reproduce the issue on your end (I see nothing out of
the ordinary on my slow CentOS 7 VM or Debian machines/VMs)

* [PATCH] t/httpd-corner: check for leaking FDs and pipes
* [RFC] t/git-http-backend: add MANY_CLONE test

	# no pipes should be present in -httpd with -W0
	prove -lv t/httpd-corner.t

	# unrelated note: there's 4 pipes from -W1 (the default),
	# but I think 2 can be closed, actually...

	GIANT_GIT_DIR=/path/to/git.git MANY_CLONE=1 prove -lv t/git-http-backend.t

If those updated test cases can't reproduce the problem,
can you reproduce this on erol or any other machines?
perhaps with a different Perl?

Thanks.
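
For watching the pipe and input-file counts over time without a full
lsof dump, something like the following may be enough.  It is only a
rough sketch, not part of public-inbox: fd_kind and count_fds are
hypothetical helper names, and it assumes a Linux /proc where
unlinked-but-open files show up with a " (deleted)" suffix.

```shell
#!/bin/sh
# Hypothetical helpers (not from public-inbox); Linux /proc assumed.

# Classify one readlink(1) result taken from /proc/PID/fd.
fd_kind() {
	case $1 in
	pipe:*) echo pipe ;;
	socket:*) echo socket ;;
	*' (deleted)') echo deleted ;;	# unlinked file still held open
	*) echo other ;;
	esac
}

# Print a one-line tally of the descriptors held by PID ($1).
count_fds() {
	pipes=0 sockets=0 deleted=0 other=0
	for fd in /proc/"$1"/fd/*; do
		# readlink fails if the FD vanished while we were looping
		target=$(readlink "$fd") || continue
		case $(fd_kind "$target") in
		pipe) pipes=$((pipes + 1)) ;;
		socket) sockets=$((sockets + 1)) ;;
		deleted) deleted=$((deleted + 1)) ;;
		*) other=$((other + 1)) ;;
		esac
	done
	echo "pipes=$pipes sockets=$sockets deleted=$deleted other=$other"
}
```

Running count_fds against each -httpd worker PID every few minutes
should show whether "pipes" and "deleted" keep growing while "sockets"
stays flat, which is the pattern described above.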