user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 4/7] eml: avoid Encode 2.87..3.12 leak
Date: Wed, 13 Oct 2021 10:16:08 +0000
Message-ID: <20211013101611.22962-5-e@80x24.org>
In-Reply-To: <20211013101611.22962-1-e@80x24.org>

Encode::FB_CROAK leaks memory in Encode 2.87 through 3.12:
<https://rt.cpan.org/Public/Bug/Display.html?id=139622>

Since I expect there are still many users on old systems and old
Perls, we can use "local $SIG{__WARN__} = \&croak" here with
Encode::FB_WARN to emulate Encode::FB_CROAK behavior.
---
 lib/PublicInbox/Eml.pm | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)
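
For readers outside this codebase, the workaround can be sketched as a
standalone script. This is only an illustration of the technique under
stated assumptions, not the code in Eml.pm: strict_encode and
strict_decode are hypothetical helper names made up for this example,
and the sample inputs are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp qw(croak);
use Encode qw(find_encoding); # stdlib

# Hypothetical helper (not in PublicInbox::Eml): encode $str using the
# charset named by $cs, dying on unmappable characters without ever
# passing Encode::FB_CROAK.
sub strict_encode {
	my ($cs, $str) = @_; # $str is a copy; Encode may modify it in place
	my $enc = find_encoding($cs) // croak "unknown encoding `$cs'";
	# FB_WARN makes Encode warn on bad input; the localized handler
	# turns that warning into an exception for this call only
	local $SIG{__WARN__} = \&croak;
	$enc->encode($str, Encode::FB_WARN);
}

# Hypothetical counterpart for decoding octets into a Perl string.
sub strict_decode {
	my ($cs, $octets) = @_;
	my $enc = find_encoding($cs) // croak "unknown encoding `$cs'";
	local $SIG{__WARN__} = \&croak;
	$enc->decode($octets, Encode::FB_WARN);
}

# Usage: well-formed input passes through, bad input raises an exception
my $ok = strict_decode('UTF-8', "\xe2\x98\xba");
print "decoded ", length($ok), " character(s)\n";
eval { strict_encode('us-ascii', "\x{263a}") };
print "encode rejected: $@" if $@;
```

The "local" keeps the handler scoped to the helper, so warnings raised
elsewhere in the program keep their normal behavior.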

diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index 0867a016..69c26932 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -28,7 +28,7 @@ package PublicInbox::Eml;
 use strict;
 use v5.10.1;
 use Carp qw(croak);
-use Encode qw(find_encoding decode encode); # stdlib
+use Encode qw(find_encoding); # stdlib
 use Text::Wrap qw(wrap); # stdlib, we need Perl 5.6+ for $huge
 use MIME::Base64 3.05; # Perl 5.10.0 / 5.9.2
 use MIME::QuotedPrint 3.05; # ditto
@@ -334,9 +334,14 @@ sub body_set {
 
 sub body_str_set {
 	my ($self, $body_str) = @_;
-	my $charset = ct($self)->{attributes}->{charset} or
+	my $cs = ct($self)->{attributes}->{charset} //
 		croak('body_str was given, but no charset is defined');
-	body_set($self, \(encode($charset, $body_str, Encode::FB_CROAK)));
+	my $enc = find_encoding($cs) // croak "unknown encoding `$cs'";
+	$body_str = do {
+		local $SIG{__WARN__} = \&croak;
+		$enc->encode($body_str, Encode::FB_WARN);
+	};
+	body_set($self, \$body_str);
 }
 
 sub content_type { scalar header($_[0], 'Content-Type') }
@@ -452,15 +457,17 @@ sub body {
 sub body_str {
 	my ($self) = @_;
 	my $ct = ct($self);
-	my $charset = $ct->{attributes}->{charset};
-	if (!$charset) {
-		if ($STR_TYPE{$ct->{type}} && $STR_SUBTYPE{$ct->{subtype}}) {
+	my $cs = $ct->{attributes}->{charset} // do {
+		($STR_TYPE{$ct->{type}} && $STR_SUBTYPE{$ct->{subtype}}) and
 			return body($self);
-		}
 		croak("can't get body as a string for ",
 			join("\n\t", header_raw($self, 'Content-Type')));
-	}
-	decode($charset, body($self), Encode::FB_CROAK);
+	};
+	my $enc = find_encoding($cs) or croak "unknown encoding `$cs'";
+	my $tmp = body($self);
+	# workaround https://rt.cpan.org/Public/Bug/Display.html?id=139622
+	local $SIG{__WARN__} = \&croak;
+	$enc->decode($tmp, Encode::FB_WARN);
 }
 
 sub as_string {
