Date: Wed, 11 Sep 2019 05:44:35 -0400
From: Konstantin Ryabitsev
To: Eric Wong
Cc: meta@public-inbox.org
Subject: Re: trying to figure out 100% CPU usage in nntpd...
Message-ID: <20190911094435.GB3548@pure.paranoia.local>
References: <20190908104518.11919-1-e@80x24.org>
 <20190908105243.GA15983@dcvr>
 <20190909100500.GA9452@pure.paranoia.local>
 <20190909175340.u5aq4ztfzukko7zb@dcvr>
 <20190910083820.GA8018@pure.paranoia.local>
 <20190910181224.urhyoo6av7mhjs67@dcvr>
In-Reply-To: <20190910181224.urhyoo6av7mhjs67@dcvr>

On Tue, Sep 10, 2019 at 06:12:24PM +0000, Eric Wong wrote:
> > It does seem like there's perhaps a leak somewhere?
>
> Probably. Not seeing any of that on my (smaller) instances;
> but those -httpd haven't been restarted in weeks/months.
>
> The "PerlIO_" prefix is created from open(..., '+>', undef), so
> it either has to be for:
>
> 1. POST bodies for git http-backend
>
> 2. git cat-file --batch-check stderr
>
> 3. experimental ViewVCS|SolverGit which isn't configured on lore :)
>
> = git-http-backend
>
> Looking at the size (984), PerlIO_* prefix and proximity of FD
> numbers, I think those lines above are git-http-backend POST body.

Pretty sure that's the culprit.
This is how we replicate from lore.kernel.org to erol.kernel.org:

- once a minute, two nodes that are behind erol.kernel.org grab the
  newest manifest.js.gz
- if there are changes, each updated repository is pulled from
  lore.kernel.org, so if there were 5 repository updates, there would
  be 10 "git pull" requests

I switched the replication nodes to pull once every 5 minutes instead of
once every minute, and I see a direct correlation between when those
processes run and the number of broken pipes and "/tmp/PerlIO_* (deleted)"
entries showing up and hanging around. Not every run produces these, but
the spikes come at roughly 5-minute intervals.

On the first run after a public-inbox-httpd restart, the correlation is
direct.

This is from one of the mirroring nodes:

[82807] 2019-09-11 09:30:02,044 - INFO - Updating 18 repos from https://lore.kernel.org

This is on lore.kernel.org after the run has completed:

# ls -al /proc/{16212,16213,16214,16215}/fd | grep deleted | wc -l
36

> Any git-http-backend stuck from people fetching/cloning?

No, all git processes seem to exit cleanly on both ends.

> This is -httpd writing to varnish, still, right?

We bypass varnish for git requests, since caching them is not generally
useful. Nginx goes straight to public-inbox-httpd for those.

I did run some updates on lore.kernel.org on Thursday, including kernel
(3.10.0-957.27.2), nginx (1.16.1) and public-inbox updates. For the
latter, it went from f4f0a3be to what was the latest master at the time
(d327141c).

Hope this helps, and thanks for your help!

-K
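
For anyone wanting to repeat the descriptor check above per process rather than as one total, here is a minimal Python sketch. It is not part of public-inbox; the function names are hypothetical, and it assumes a Linux /proc layout and that the leaked temp files live under /tmp with the "PerlIO_" prefix, as in the output discussed above. Unlike the plain "grep deleted", it counts only the PerlIO temp files.

```python
import os
import re
import sys
from typing import Dict, Iterable, List

# Link target the kernel reports for an unlinked temp file,
# e.g. "/tmp/PerlIO_abc123 (deleted)". The /tmp location is an assumption.
DELETED_PERLIO = re.compile(r"^/tmp/PerlIO_\S+ \(deleted\)$")


def count_deleted_perlio(targets: Iterable[str]) -> int:
    """Count link targets that look like deleted PerlIO temp files."""
    return sum(1 for t in targets if DELETED_PERLIO.match(t))


def fd_targets(pid: int, proc_root: str = "/proc") -> List[str]:
    """Return the symlink targets of every open fd of a process."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return []  # process gone, or not permitted to inspect it
    targets = []
    for name in names:
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
    return targets


def report(pids: Iterable[int]) -> Dict[int, int]:
    """Map each pid to its number of deleted PerlIO temp files."""
    return {pid: count_deleted_perlio(fd_targets(pid)) for pid in pids}


if __name__ == "__main__":
    counts = report(int(arg) for arg in sys.argv[1:])
    for pid, n in counts.items():
        print(pid, n)
    print("total", sum(counts.values()))
```

Run as root with the worker PIDs as arguments, e.g. "python3 count_perlio.py 16212 16213 16214 16215" (the script name is made up). With two mirroring nodes each pulling 18 repos, a total of 36 would be consistent with one lingering descriptor per pull, though that is only an inference from the numbers above.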