From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 4/4] nntp: reduce syscalls for ARTICLE and BODY
Date: Thu, 27 Jun 2019 22:51:48 +0000
Message-ID: <20190627225148.9657-5-e@80x24.org>
In-Reply-To: <20190627225148.9657-1-e@80x24.org>
Chances are we already have extra buffer space following the
expensive LF => CRLF conversion, so we can safely append an
extra CRLF in those places without incurring a copy of the
full string buffer.
While we're at it, document where our pain points are in terms
of memory usage, since tracking/controlling memory use isn't
exactly obvious in high-level languages.
Perhaps we should start storing messages in git as CRLF...
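As a rough illustration of the body handling described above, here is a minimal stand-alone sketch. The helper name `prepare_body` and the sample message are hypothetical and not part of PublicInbox::NNTP; only the three substitutions are taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only (not in PublicInbox::NNTP):
# prepares a message body for an NNTP multi-line response, in place.
sub prepare_body {
	my ($msg) = @_; # scalar ref, modified in place

	# dot-stuffing: double a leading "." on any line
	$$msg =~ s/^\./../smg;

	# convert bare LF to CRLF, leaving existing CRLF alone
	$$msg =~ s/(?<!\r)\n/\r\n/sg;

	# append the trailing CRLF to the same buffer instead of
	# queueing it as a separate write
	$$msg .= "\r\n" unless $$msg =~ /\r\n\z/s;
	$msg;
}

# sample input: mixed line endings, no trailing newline
my $body = "hello\n.hidden\r\nlast line";
prepare_body(\$body);

# a body that already ends in CRLF is left without an extra one
my $done = "x\r\n";
prepare_body(\$done);
```

With the trailing CRLF folded into the buffer, the caller needs only one msg_more() call for the whole body.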
---
lib/PublicInbox/NNTP.pm | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 30d3dab6..5a886a3c 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -521,10 +521,12 @@ found:
sub msg_body_write ($$) {
my ($self, $msg) = @_;
+
+ # these can momentarily double the memory consumption :<
$$msg =~ s/^\./../smg;
- $$msg =~ s/(?<!\r)\n/\r\n/sg;
+ $$msg =~ s/(?<!\r)\n/\r\n/sg; # Alpine barfs without this
+ $$msg .= "\r\n" unless $$msg =~ /\r\n\z/s;
msg_more($self, $$msg);
- msg_more($self, "\r\n") unless $$msg =~ /\r\n\z/s;
'.'
}
@@ -533,18 +535,19 @@ sub set_art {
$self->{article} = $art if defined $art && $art =~ /\A[0-9]+\z/;
}
-sub _header ($) {
- my $hdr = $_[0]->as_string;
+sub msg_hdr_write ($$$) {
+ my ($self, $hdr, $body_follows) = @_;
+ $hdr = $hdr->as_string;
utf8::encode($hdr);
- $hdr =~ s/(?<!\r)\n/\r\n/sg;
+ $hdr =~ s/(?<!\r)\n/\r\n/sg; # Alpine barfs without this
# for leafnode compatibility, we need to ensure Message-ID headers
# are only a single line. We can't subclass Email::Simple::Header
# and override _default_fold_at in here, either; since that won't
# affect messages already in the archive.
$hdr =~ s/^(Message-ID:)[ \t]*\r\n[ \t]+([^\r]+)\r\n/$1 $2\r\n/igsm;
-
- $hdr
+ $hdr .= "\r\n" if $body_follows;
+ msg_more($self, $hdr);
}
sub cmd_article ($;$) {
@@ -554,8 +557,7 @@ sub cmd_article ($;$) {
my ($n, $mid, $msg, $hdr) = @$r;
set_art($self, $art);
more($self, "220 $n <$mid> article retrieved - head and body follow");
- msg_more($self, _header($hdr));
- msg_more($self, "\r\n");
+ msg_hdr_write($self, $hdr, 1);
msg_body_write($self, $msg);
}
@@ -566,7 +568,7 @@ sub cmd_head ($;$) {
my ($n, $mid, undef, $hdr) = @$r;
set_art($self, $art);
more($self, "221 $n <$mid> article retrieved - head follows");
- msg_more($self, _header($hdr));
+ msg_hdr_write($self, $hdr, 0);
'.'
}
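For reference, the Message-ID unfolding done in msg_hdr_write can be tried in isolation. This is only a sketch: the wrapper `unfold_message_id` and the sample header are made up for illustration, while the regexp is the one used in the patch.

```perl
use strict;
use warnings;

# Hypothetical wrapper for illustration; the regexp matches the patch.
sub unfold_message_id {
	my ($hdr) = @_; # header string, already using CRLF line endings

	# join a Message-ID value folded onto the next line back onto
	# the header name's line
	$hdr =~ s/^(Message-ID:)[ \t]*\r\n[ \t]+([^\r]+)\r\n/$1 $2\r\n/igsm;
	$hdr;
}

# sample header with the Message-ID value folded onto its own line
my $folded = "Subject: test\r\n" .
	"Message-ID:\r\n <very-long-id\@example.com>\r\n" .
	"From: a\@example.com\r\n";
my $unfolded = unfold_message_id($folded);
```

Since this runs on the outgoing header string, it also covers messages already in the archive, which is the concern raised in the code comment.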
--
EW
Thread overview: 5+ messages
2019-06-27 22:51 [PATCH 0/4] www|nntp: optimize uses of Email::Simple Eric Wong
2019-06-27 22:51 ` [PATCH 1/4] nntp: rework and simplify art_lookup response Eric Wong
2019-06-27 22:51 ` [PATCH 2/4] mbox: use Email::Simple->new to do in-place modifications Eric Wong
2019-06-27 22:51 ` [PATCH 3/4] mbox: split header and body processing Eric Wong
2019-06-27 22:51 ` Eric Wong [this message]