From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 0F5421F461;
	Mon, 9 Sep 2019 17:53:41 +0000 (UTC)
Date: Mon, 9 Sep 2019 17:53:41 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: Re: trying to figure out 100% CPU usage in nntpd...
Message-ID: <20190909175340.u5aq4ztfzukko7zb@dcvr>
References: <20190908104518.11919-1-e@80x24.org>
	<20190908105243.GA15983@dcvr>
	<20190909100500.GA9452@pure.paranoia.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20190909100500.GA9452@pure.paranoia.local>
List-Id:

Konstantin Ryabitsev wrote:
> There also was a weird problem a couple of days ago where one of the
> httpd daemons started returning "Internal Server Error" to all requests.
> Restarting public-inbox-httpd fixed the problem, but I am not sure how I
> would troubleshoot the causes if it happens next time -- is there a way
> to enable error logging for the httpd daemon?

That's a new one... I haven't seen any problems from -httpd
myself in ages. So -httpd could not handle requests at all?

The daemons already spit errors to stderr, which typically ends
up in syslog via systemd. So, that's the first place to look
(also "systemctl status $SERVICE"); anything in there?

I can usually figure everything out from strace/lsof on a worker
process and hitting it with some requests (SIGTTOU to decrement
workers down to one).

That said, out-of-FD/memory conditions might not always be
logged correctly to stderr and we need to fix that.
Also, right now the code considers git-cat-file to be reliable,
but I guess that wouldn't be the case during disk failures, and
perhaps timeouts will be necessary.

Maybe nginx/varnish logs would have something, too; but more
likely syslog.

Also, I'd be curious how memory usage improves for you with some
of the new changes. I don't think I've exceeded 100MB/worker
this year, but Email::MIME can be a pig with giant messages.
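
Since out-of-FD/memory conditions may not reach the logs, checking a
worker directly can help alongside strace/lsof. The following is only
a rough sketch, assuming a Linux /proc and permission to inspect the
worker; worker_stats is a made-up helper name and not part of
public-inbox.

```python
import os

def worker_stats(pid):
    """Return open FD count, soft FD limit, and resident memory for one PID.

    Hypothetical helper: reads Linux /proc only, and needs permission to
    inspect the target process.
    """
    stats = {
        # When inspecting our own PID, this count includes the directory
        # handle that listdir itself opens.
        "open_fds": len(os.listdir(f"/proc/{pid}/fd")),
        "fd_soft_limit": None,  # None means unlimited or not found
        "rss_kb": None,         # None if the kernel reports no VmRSS
    }
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Fields: "Max" "open" "files" <soft> <hard> "files"
                soft = line.split()[3]
                if soft != "unlimited":
                    stats["fd_soft_limit"] = int(soft)
                break
    with open(f"/proc/{pid}/status") as fh:
        for line in fh:
            if line.startswith("VmRSS:"):
                # Value is reported in kB
                stats["rss_kb"] = int(line.split()[1])
                break
    return stats

if __name__ == "__main__":
    # Inspects this interpreter itself; pass a worker PID in practice.
    print(worker_stats(os.getpid()))
```

An open_fds value close to fd_soft_limit would point at FD exhaustion,
and rss_kb gives a per-worker number to compare against the
100MB/worker figure above.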