* [PATCH 7/7] view: more culling for search threads
From: Eric Wong @ 2019-01-10 21:35 UTC
To: meta
{mapping} overhead is now down to ~1.3M at the end of
a giant thread from hell.
---
lib/PublicInbox/Inbox.pm | 5 +++--
lib/PublicInbox/SearchThread.pm | 5 +++++
lib/PublicInbox/View.pm | 10 ++++++++--
3 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 73f5761..d57e46d 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -302,8 +302,9 @@ sub smsg_by_mid ($$) {
my ($self, $mid) = @_;
my $srch = search($self) or return;
# favor the Message-ID we used for the NNTP article number:
- my $num = mid2num($self, $mid);
- defined $num ? $srch->lookup_article($num) : undef;
+ defined(my $num = mid2num($self, $mid)) or return;
+ my $smsg = $srch->lookup_article($num) or return;
+ PublicInbox::SearchMsg::psgi_cull($smsg);
}
sub msg_by_mid ($$;$) {
diff --git a/lib/PublicInbox/SearchThread.pm b/lib/PublicInbox/SearchThread.pm
index be29098..931bd57 100644
--- a/lib/PublicInbox/SearchThread.pm
+++ b/lib/PublicInbox/SearchThread.pm
@@ -53,6 +53,11 @@ sub _add_message ($$) {
my $this = _get_cont_for_id($id_table, $smsg->{mid});
$this->{smsg} = $smsg;
+ # saves around 4K across 1K messages
+ # TODO: move this to a more appropriate place, breaks tests
+ # if we do it during psgi_cull
+ delete $smsg->{num};
+
# B. For each element in the message's References field:
defined(my $refs = $smsg->{references}) or return;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 5ddb842..cd125e0 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -219,7 +219,10 @@ sub index_entry {
$rv .= _th_index_lite($mid_raw, \$irt, $id, $ctx);
my @tocc;
my $ds = $smsg->ds; # for v1 non-Xapian/SQLite users
- my $mime = delete $smsg->{mime}; # critical to memory use
+ # deleting {mime} is critical to memory use,
+ # the rest of the fields saves about 400K as we iterate across 1K msgs
+ my ($mime) = delete @$smsg{qw(mime ds ts blob subject)};
+
my $hdr = $mime->header_obj;
my $from = _hdr_names_html($hdr, 'From');
obfuscate_addrs($obfs_ibx, $from) if $obfs_ibx;
@@ -311,7 +314,10 @@ sub _th_index_lite {
my $nr_s = 0;
my $siblings;
if (my $smsg = $node->{smsg}) {
- ($$irt) = (($smsg->{references} || '') =~ m/<([^>]+)>\z/);
+ # delete saves about 200KB on a 1K message thread
+ if (my $refs = delete $smsg->{references}) {
+ ($$irt) = ($refs =~ m/<([^>]+)>\z/);
+ }
}
my $irt_map = $mapping->{$$irt} if defined $$irt;
if (defined $irt_map) {
--
EW
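
The two culling idioms in the View.pm hunks can be tried in isolation. The following is a minimal sketch, not code from the patch: the plain hash stands in for a PublicInbox::SearchMsg object, and all field values and example.com Message-IDs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PublicInbox::SearchMsg object;
# every value here is invented for the example.
my $smsg = {
    mid        => 'c@example.com',
    mime       => 'MIME-OBJECT-PLACEHOLDER',
    ds         => 1547156100,
    ts         => 1547156100,
    blob       => 'd57e46d',
    subject    => 'Re: example',
    references => '<a@example.com> <b@example.com>',
};

# delete on a hash slice removes every listed key and returns the
# deleted values in order; list assignment keeps only the first one.
my ($mime) = delete @$smsg{qw(mime ds ts blob subject)};

# Deleting {references} while reading it drops the string from the
# hash; the regex captures the last <Message-ID> in the field.
my $irt;
if (my $refs = delete $smsg->{references}) {
    ($irt) = ($refs =~ m/<([^>]+)>\z/);
}

print "mime: $mime\n";                                  # mime: MIME-OBJECT-PLACEHOLDER
print "irt: $irt\n";                                    # irt: b@example.com
print "remaining: ", join(',', sort keys %$smsg), "\n"; # remaining: mid
```

Only the key that later code still needs ({mid} in this sketch) stays in the hash, which is the effect the per-message savings quoted in the patch comments rely on.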
* [PATCH 0/7] psgi: more memory reductions
From: Eric Wong @ 2019-01-10 21:35 UTC
To: meta
While none of these are as significant as the patch to avoid inadvertent
MIME object storage in threads(*), they add up to some meaningful
reductions and can make it easier for a memory-starved VPS to
serve public-inboxes.
I've diffed output of /T/, /t/ and &x=t endpoints of various HTML
pages before and after without finding differences.
There's definitely more that can be done in this area, though...
Sprinkling Devel::Size::total_size calls in various places (mostly
->getline iterators/callbacks ) was instrumental in the development
of these patches.
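
As a rough illustration of that measuring approach (a sketch only, not taken from the patches): the script below builds a hypothetical 1000-message thread of plain hashes, then compares Devel::Size::total_size before and after deleting fields. Devel::Size is a CPAN module rather than part of core Perl, so the sketch skips the measurement when it is not installed. The field names mirror those culled in the series; the values are invented.

```perl
use strict;
use warnings;

# Devel::Size comes from CPAN; fall back gracefully if it is missing.
my $have_ds = eval { require Devel::Size; 1 };

# Returns the total size in bytes of the structure behind $ref,
# or undef when Devel::Size is unavailable.
sub size_of {
    my ($ref) = @_;
    return $have_ds ? Devel::Size::total_size($ref) : undef;
}

# Hypothetical thread: 1000 message summaries with made-up values.
my @thread = map {
    +{
        mid        => "msg$_\@example.com",
        subject    => "Re: topic $_",
        references => '<msg0@example.com> ' x 5,
        ds         => 1547156100,
        ts         => 1547156100,
        blob       => 'a' x 40,
    }
} 1 .. 1000;

my $before = size_of(\@thread);

# Cull the fields that are no longer needed once each message is rendered.
delete @$_{qw(ds ts blob subject references)} for @thread;

my $after = size_of(\@thread);

if (defined $before) {
    printf "before: %d bytes, after: %d bytes\n", $before, $after;
} else {
    print "Devel::Size not installed, measurement skipped\n";
}
```

The absolute numbers depend on the Perl build and will not match the figures quoted in the patch comments; only the before/after difference is meaningful.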
(*) https://public-inbox.org/meta/20190108004606.23760-1-e@80x24.org/
("view: stop storing all MIME objects on large threads")
Eric Wong (7):
httpd: remove psgix.harakiri reference
searchmsg: get rid of termlist scanning for mid
searchmsg: remove Xapian::Document field
searchview: drop unused {seen} hashref
searchmsg: remove unused fields for PSGI in Xapian results
over: cull unneeded fields for get_thread
view: more culling for search threads
lib/PublicInbox/HTTPD.pm | 1 -
lib/PublicInbox/Inbox.pm | 5 ++--
lib/PublicInbox/Over.pm | 19 ++++++++-----
lib/PublicInbox/SearchIdx.pm | 6 ++--
lib/PublicInbox/SearchMsg.pm | 49 ++++++++++++++++-----------------
lib/PublicInbox/SearchThread.pm | 5 ++++
lib/PublicInbox/SearchView.pm | 1 -
lib/PublicInbox/View.pm | 10 +++++--
t/search.t | 10 ++++---
9 files changed, 60 insertions(+), 46 deletions(-)
--
EW
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git