From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id BFEBB1F461; Sun, 8 Sep 2019 10:52:43 +0000 (UTC) Date: Sun, 8 Sep 2019 10:52:43 +0000 From: Eric Wong To: meta@public-inbox.org Subject: trying to figure out 100% CPU usage in nntpd... Message-ID: <20190908105243.GA15983@dcvr> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190908104518.11919-1-e@80x24.org> List-Id: I've been having trouble reproducing this bug (but maybe summer weather has been taking its toll, certainly has on my HW). And it won't pin the process to any client, so other clients continue to be served while it happens. It'll just end up being a bunch of epoll_wait( ... timeout=0) calls (and no other syscalls) which can drive up power bills :< So this doesn't show up with .onion instances since those only see connections from tor running on localhost; either. The problem only happens on news.public-inbox.org, so it could have something to do with less-reliable clients triggering our long_response code path. While debugging this, I've isolated it to the last ->requeue call in NNTP.pm event_step by using write(2) syscalls to /dev/null so I can strace things as-needed. Showing "git diff -w" output so the entire old event_step sub is shown. The problem is somehow, $$rbuf has "\n" in it, so there's no do_read call happening. Yet the while loop regexp is never consuming $$rbuf... So I'm pretty sure the old regexp is bogus, and the patch: "nntp: regexp always consumes rbuf if "\n" exists" will fix things. diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 7c3f68a..ef050b9 100644 --- a/lib/PublicInbox/NNTP.pm +++ b/lib/PublicInbox/NNTP.pm @@ -976,44 +976,54 @@ sub err ($$;@) { sub out ($$;@) { my ($self, $fmt, @args) = @_; printf { $self->{nntpd}->{out} } $fmt."\n", @args; } +use Data::Dumper; +my $null; +BEGIN { + open $null, '>>', '/dev/null' or die; + $null->autoflush(1); +}; + # callback used by PublicInbox::DS for any (e)poll (in/out/hup/err) sub event_step { my ($self) = @_; return unless $self->flush_write && $self->{sock}; update_idle_time($self); # only read more requests if we've drained the write buffer, # otherwise we can be buffering infinitely w/o backpressure my $rbuf = $self->{rbuf} // (\(my $x = '')); my $r = 1; if (index($$rbuf, "\n") < 0) { my $off = bytes::length($$rbuf); $r = $self->do_read($rbuf, LINE_MAX, $off) or return; } while ($r > 0 && $$rbuf =~ s/\A[ \t\r\n]*([^\r\n]*)\r?\n//) { my $line = $1; return $self->close if $line =~ /[[:cntrl:]]/s; my $t0 = now(); my $fd = fileno($self->{sock}); $r = eval { process_line($self, $line) }; my $pending = $self->{wbuf} ? ' pending' : ''; out($self, "[$fd] %s - %0.6f$pending", $line, now() - $t0); } return $self->close if $r < 0; my $len = bytes::length($$rbuf); return $self->close if ($len >= LINE_MAX); $self->rbuf_idle($rbuf); update_idle_time($self); # maybe there's more pipelined data, or we'll have # to register it for socket-readiness notifications - $self->requeue unless $self->{wbuf}; + unless ($self->{wbuf}) { + print $null Dumper({r=>$r, rbuf=>$rbuf}); + $self->requeue; + } } sub not_idle_long ($$) {