From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 00/38] www: reduce memory usage
Date: Sat, 10 Sep 2022 08:16:51 +0000
Message-Id: <20220910081729.2011934-1-e@80x24.org>

I'm over the moon with this series since this drops dozens of
megabytes of scratchpad use while providing tiny speedups along
the way.  For me, that's a 10-15% reduction in memory use under
public-inbox-netd w/ mwrap-perl[1] overhead.

This scratchpad use has been bothering me for a long time (since
I fixed all the other leaks, including one in the core Encode
module).  There's more coming, of course, but this series is big
enough and has shown good results on https://yhbt.net/lore/

It also provides a good pattern/guidance going forward on how to
efficiently implement future features.

I actually started out in this series trying to buffer everything
using gzip to avoid space-wasting uncompressed strings living in
memory.  Unfortunately, Compress::Raw::Zlib::deflate proved too
expensive to call frequently for short strings.
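To make the per-call cost concrete, here is a minimal sketch (not
code from this series) that gzips the same data two ways: one
->deflate call per short string, and one ->deflate call on the
joined string.  The helper name gzip_chunks and the sample data are
made up for illustration; only the Compress::Raw::Zlib and
IO::Uncompress::Gunzip calls are real.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib qw(Z_OK Z_FINISH);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# 15 + 16 selects gzip framing; output is appended to the out buffer
my %OPT = (-WindowBits => 15 + 16, -AppendOutput => 1);

# hypothetical helper: gzip @$chunks with one ->deflate call per element,
# returns the compressed bytes and the number of ->deflate calls made
sub gzip_chunks {
    my ($chunks) = @_;
    my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(%OPT);
    $err == Z_OK or die "Deflate->new failed: $err";
    my $out = '';
    my $calls = 0;
    for my $c (@$chunks) {
        $err = $gz->deflate($c, $out);
        $err == Z_OK or die "deflate failed: $err";
        ++$calls;
    }
    $err = $gz->flush($out, Z_FINISH);
    $err == Z_OK or die "flush failed: $err";
    ($out, $calls);
}

my @short = map { "<b>line $_</b>\n" } 1..1000;
my $joined = join('', @short);
my ($many, $n_many) = gzip_chunks(\@short);    # many small inputs
my ($once, $n_once) = gzip_chunks([ $joined ]); # one large input

# both outputs must decompress to the same text
for my $z ($many, $once) {
    gunzip(\$z => \my $plain) or die "gunzip failed: $GunzipError";
    $plain eq $joined or die 'round-trip mismatch';
}
print "deflate calls: $n_many vs $n_once\n";
```

Wrapping each variant in the core Benchmark module would give
timings; the sketch only counts calls so it makes no claims about
the actual numbers on any given machine.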

Going back to `.=' ops via a ->zadd method brought back some of
the speed while consolidating the scratchpad to a single place;
but I didn't like the performance regression.  I kept those detours
in the history presented here since I figure it's worth showing.

Finally, relying on PerlIO::scalar with print|say ops proved to be
the fastest since OO ->method dispatch overhead can be avoided and
there's no scratchpad use at all from these, either.  As before, we
still call C:R:Z:deflate after every full message and flush to the
socket periodically.

I may even consider using PerlIO::gzip in the future, but that's a
non-standard module.  However, I definitely took inspiration from
it since I saw that it would buffer uncompressed data into memory
before compressing it.

There are also a few small simplifications and speedups I noticed
along the way, and several other bugfixes I posted independently
while working on this series.

[1] I used https://80x24.org/mwrap-perl.git to check malloc use

Eric Wong (38):
  xt: fold perf-obfuscate into perf-msgview, future-proof
  www: gzip_filter: implicitly flush {obuf} on zmore/zflush
  view: rework single message page to compress earlier
  www_atom_stream: require 200 response
  www_stream: aresponse assumes 200, too
  www_text: reduce parameter passing for response header
  viewvcs: use shorter and simpler ctx->html_done
  www_listing: consolidate some ->zmore dispatches
  www_listing: avoid unnecessary work for common cases
  www: viewdiff: use return value for diff_hunk
  view: simplify _parent_headers
  view: eml_entry: reduce manipulation of ctx->{obuf}
  gzip_filter: ->translate can reuse zmore/zflush
  view: remove multipart_text_as_html
  view: reduce subroutine calls for submsg_hdr
  view: attach_link: reduce obuf manipulation
  viewdiff: reuse existing string in diff_before_or_after
  view: _th_index_lite: avoid one s///, improve symmetry
  view: _th_index_lite: use `//' defined-or op
  view: reduce ascii_html calls and {obuf} use
  view: html_footer: golf out a few lines
  view: html_footer: remove obuf dependency
  view: html_footer: avoid escaping " in a few places
  viewdiff: diff_hunk: shorten conditionals, slightly
  view: switch a few things to ctx->zmore
  www: drop {obuf} use entirely, for now
  www: switch to zadd for the majority of buffering
  www: use PerlIO::scalar (zfh) for buffering
  viewdiff: diff_before_or_after: avoid extra capture
  viewdiff: diff_header: shorten function, slightly
  www_static: switch to `print $zfh', and optimize
  httpd/async: describe which ->write subs it can call
  translate: support multiple buffer args
  gzip_filter: write: use multi-arg translate
  feed: new_html_i: switch from zmore to `print $zfh'
  mbox*: use multi-arg ->translate and ->write
  www_listing: switch to `print $zfh'
  viewvcs: switch to `print $zfh'

 Documentation/mknews.perl        |   3 +-
 MANIFEST                         |   1 -
 lib/PublicInbox/CompressNoop.pm  |   4 +-
 lib/PublicInbox/Feed.pm          |  12 +-
 lib/PublicInbox/GzipFilter.pm    |  62 +++---
 lib/PublicInbox/HTTPD/Async.pm   |   9 +-
 lib/PublicInbox/Mbox.pm          |  11 +-
 lib/PublicInbox/MboxGz.pm        |   3 +-
 lib/PublicInbox/SearchView.pm    |   8 +-
 lib/PublicInbox/View.pm          | 312 ++++++++++++-------------------
 lib/PublicInbox/ViewDiff.pm      | 115 +++++-------
 lib/PublicInbox/ViewVCS.pm       |  17 +-
 lib/PublicInbox/WwwAtomStream.pm |  19 +-
 lib/PublicInbox/WwwListing.pm    |  40 ++--
 lib/PublicInbox/WwwStatic.pm     |  32 ++--
 lib/PublicInbox/WwwStream.pm     |  23 ++-
 lib/PublicInbox/WwwText.pm       |  35 ++--
 t/psgi_v2.t                      |   4 +-
 xt/perf-msgview.t                |  10 +-
 xt/perf-obfuscate.t              |  66 -------
 20 files changed, 320 insertions(+), 466 deletions(-)
 delete mode 100644 xt/perf-obfuscate.t
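
For readers who want the gist without reading all 38 patches, the
PerlIO::scalar buffering pattern described in the cover text can be
sketched roughly as below.  This is an illustration under stated
assumptions, not the code in GzipFilter.pm: $zfh is named after the
patch subjects, while $pbuf, $zbuf, deflate_pending and the sample
messages are hypothetical.  Fragments are written with plain print
ops to an in-memory filehandle, and deflate is only called once per
full message.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib qw(Z_OK Z_FINISH);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my ($gz, $err) = Compress::Raw::Zlib::Deflate->new(
    -WindowBits => 15 + 16, # gzip framing
    -AppendOutput => 1);    # deflate/flush append to the output buffer
$err == Z_OK or die "Deflate->new failed: $err";

my $pbuf = ''; # uncompressed text printed via $zfh
my $zbuf = ''; # compressed bytes, would be written to the socket periodically

# append mode: every print lands at the current end of $pbuf,
# so resetting $pbuf to '' is enough to start over
open(my $zfh, '>>', \$pbuf) or die "open: $!";

# hypothetical helper: compress everything printed so far, then reset
sub deflate_pending {
    return unless length($pbuf);
    my $e = $gz->deflate($pbuf, $zbuf);
    $e == Z_OK or die "deflate failed: $e";
    $pbuf = '';
}

my @msgs = (['first', "hello\n"], ['second', "world\n"]);
for my $m (@msgs) {
    my ($subj, $body) = @$m;
    # plain print ops, no ->method dispatch per fragment
    print $zfh '<b>', $subj, "</b>\n";
    print $zfh '<pre>', $body, "</pre>\n";
    deflate_pending(); # one deflate call per full message
}
$err = $gz->flush($zbuf, Z_FINISH);
$err == Z_OK or die "flush failed: $err";

# verify the stream round-trips
gunzip(\$zbuf => \my $html) or die "gunzip failed: $GunzipError";
print $html;
```

Opening the handle in append mode is a choice made for this sketch so
the buffer can be reset by assignment alone; a '>' handle would also
need a seek back to offset 0 before reuse.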