* [PATCH 19/19] httpd/async: avoid unnecessary on-stack delete
2021-02-07 8:52 5% ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
@ 2021-02-07 8:52 5% ` Eric Wong
1 sibling, 0 replies; 5+ results
From: Eric Wong @ 2021-02-07 8:52 UTC (permalink / raw)
To: meta
While this doesn't fix a known problem, this was a risky
construct in case somebody uses confess/longmess inside
the user-supplied callback.
cf. commit 0795b0906cc81f40
("ds: guard against stack-not-refcounted quirk of Perl 5")
---
lib/PublicInbox/HTTPD/Async.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/HTTPD/Async.pm b/lib/PublicInbox/HTTPD/Async.pm
index 1de9501d..7238650a 100644
--- a/lib/PublicInbox/HTTPD/Async.pm
+++ b/lib/PublicInbox/HTTPD/Async.pm
@@ -46,7 +46,7 @@ sub event_step {
my ($self) = @_;
if (my $cb = delete $self->{cb}) {
# this may call async_pass when headers are done
- $cb->(delete $self->{arg});
+ $cb->(my $refcnt_guard = delete $self->{arg});
} elsif (my $sock = $self->{sock}) {
my $http = $self->{http};
# $self->{sock} is a read pipe for git-http-backend or cgit
^ permalink raw reply related [relevance 5%]
* [PATCH 18/19] imap: avoid unnecessary on-stack delete
@ 2021-02-07 8:52 5% ` Eric Wong
2021-02-07 8:52 5% ` [PATCH 19/19] httpd/async: " Eric Wong
1 sibling, 0 replies; 5+ results
From: Eric Wong @ 2021-02-07 8:52 UTC (permalink / raw)
To: meta
None of the Content-Type attributes are long-lived
(and unlikely to be memory intensive). While these
callsites won't trigger $DB::args segfaults via
confess or longmess, it'll make future code audits
easier.
cf. commit 0795b0906cc81f40
("ds: guard against stack-not-refcounted quirk of Perl 5")
---
lib/PublicInbox/HTTPD/Async.pm | 2 +-
---
lib/PublicInbox/IMAP.pm | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index 226e98a2..af8ce72b 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -499,7 +499,7 @@ sub body_disposition ($) {
my $cd = $eml->header_raw('Content-Disposition') or return 'NIL';
$cd = parse_content_disposition($cd);
my $buf = '('._esc($cd->{type});
- $buf .= ' ' . _esc_hash(delete $cd->{attributes});
+ $buf .= ' ' . _esc_hash($cd->{attributes});
$buf .= ')';
}
@@ -511,7 +511,7 @@ sub body_leaf ($$;$) {
my $ct = $eml->ct;
$buf .= '('._esc($ct->{type}).' ';
$buf .= _esc($ct->{subtype});
- $buf .= ' ' . _esc_hash(delete $ct->{attributes});
+ $buf .= ' ' . _esc_hash($ct->{attributes});
$buf .= ' ' . _esc($eml->header_raw('Content-ID'));
$buf .= ' ' . _esc($eml->header_raw('Content-Description'));
my $cte = $eml->header_raw('Content-Transfer-Encoding') // '7bit';
@@ -540,7 +540,7 @@ sub body_parent ($$$) {
$buf .= @$hold ? join('', @$hold) : 'NIL';
$buf .= ' '._esc($ct->{subtype});
if ($structure) {
- $buf .= ' '._esc_hash(delete $ct->{attributes});
+ $buf .= ' '._esc_hash($ct->{attributes});
$buf .= ' '.body_disposition($eml);
$buf .= ' '._esc($eml->header_raw('Content-Language'));
$buf .= ' '._esc($eml->header_raw('Content-Location'));
^ permalink raw reply related [relevance 5%]
* [PATCH 11/17] treewide: replace confess with croak
@ 2021-02-06 12:18 4% ` Eric Wong
0 siblings, 0 replies; 5+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
To: meta
The PublicInbox::Eml (and previously Email::MIME) use of confess
was the primary (or only) culprit behind the lei2mail segfaults
fixed by commit 0795b0906cc81f40.
("ds: guard against stack-not-refcounted quirk of Perl 5").
We never care about a backtrace when dealing with Eml objects
anyways, so it was just a worthless waste of CPU cycles.
We can also drop confess in a few other places. Since we only
use Perl and Inline::C, users will never be without source
and can replace s/croak/Carp::confess/ on a per-callsite basis
to help report problems.
It's also possible to use PERL5OPT=-MCarp=verbose in the
environment though still potentially risky.
Link: https://public-inbox.org/meta/20210201082833.3293-1-e@80x24.org/
---
lib/PublicInbox/DS.pm | 10 +++++-----
lib/PublicInbox/Eml.pm | 4 ++--
lib/PublicInbox/IPC.pm | 2 +-
lib/PublicInbox/OverIdx.pm | 2 +-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 263c3458..ec965abe 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -32,7 +32,7 @@ use Scalar::Util qw(blessed);
use PublicInbox::Syscall qw(:epoll);
use PublicInbox::Tmpfile;
use Errno qw(EAGAIN EINVAL);
-use Carp qw(confess carp);
+use Carp qw(carp);
our @EXPORT_OK = qw(now msg_more dwaitpid);
my $nextq; # queue for next_tick
@@ -335,9 +335,9 @@ retry:
$ev &= ~EPOLLEXCLUSIVE;
goto retry;
}
- die "couldn't add epoll watch for $fd: $!\n";
+ die "EPOLL_CTL_ADD $self/$sock/$fd: $!";
}
- confess("DescriptorMap{$fd} defined ($DescriptorMap{$fd})")
+ croak("FD:$fd in use by $DescriptorMap{$fd} (for $self/$sock)")
if defined($DescriptorMap{$fd});
$DescriptorMap{$fd} = $self;
@@ -368,7 +368,7 @@ sub close {
# notifications about it
my $fd = fileno($sock);
epoll_ctl($Epoll, EPOLL_CTL_DEL, $fd, 0) and
- confess("EPOLL_CTL_DEL: $!");
+ croak("EPOLL_CTL_DEL($self/$sock): $!");
# we explicitly don't delete from DescriptorMap here until we
# actually close the socket, as we might be in the middle of
@@ -587,7 +587,7 @@ sub msg_more ($$) {
sub epwait ($$) {
my ($sock, $ev) = @_;
epoll_ctl($Epoll, EPOLL_CTL_MOD, fileno($sock), $ev) and
- confess("EPOLL_CTL_MOD $!");
+ croak("EPOLL_CTL_MOD($sock): $!");
}
# return true if complete, false if incomplete (or failure)
diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index f7f62e7b..81a6632b 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -332,7 +332,7 @@ sub body_set {
sub body_str_set {
my ($self, $body_str) = @_;
my $charset = ct($self)->{attributes}->{charset} or
- Carp::confess('body_str was given, but no charset is defined');
+ croak('body_str was given, but no charset is defined');
body_set($self, \(encode($charset, $body_str, Encode::FB_CROAK)));
}
@@ -454,7 +454,7 @@ sub body_str {
if ($STR_TYPE{$ct->{type}} && $STR_SUBTYPE{$ct->{subtype}}) {
return body($self);
}
- Carp::confess("can't get body as a string for ",
+ croak("can't get body as a string for ",
join("\n\t", header_raw($self, 'Content-Type')));
}
decode($charset, body($self), Encode::FB_CROAK);
diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index a0e6bfee..0dee2a92 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -11,7 +11,7 @@ package PublicInbox::IPC;
use strict;
use v5.10.1;
use parent qw(Exporter);
-use Carp qw(confess croak);
+use Carp qw(croak);
use PublicInbox::DS qw(dwaitpid);
use PublicInbox::Spawn;
use PublicInbox::OnDestroy;
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 985c5473..9013ae23 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -456,7 +456,7 @@ sub dbh_close {
sub create {
my ($self) = @_;
my $fn = $self->{filename} // do {
- Carp::confess('BUG: no {filename}') unless $self->{dbh};
+ croak('BUG: no {filename}') unless $self->{dbh};
return;
};
unless (-r $fn) {
^ permalink raw reply related [relevance 4%]
* [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5
2021-02-01 8:28 5% [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
@ 2021-02-01 8:28 7% ` Eric Wong
0 siblings, 0 replies; 5+ results
From: Eric Wong @ 2021-02-01 8:28 UTC (permalink / raw)
To: meta
The Perl 5 stack is weakly-referenced for performance reasons.
This means it's possible for items in the stack to be freed
while executing further down the stack.
In lei (and perhaps public-facing read-only daemons in the
future), we'll fork and call PublicInbox::DS->Reset in the child
process. This causes %DescriptorMap to be clobbered, allowing
the $DescriptorMap{$fd} arg to be freed inside the child
process.
When Carp::confess or Carp::longmess is called to generate a
backtrace, it may access the @DB::args array. This array access
is not protected by reference counting and is known to cause
segfaults and other weird errors.
While the caller of an unnecessary Carp::confess may be
eliminated in a future commit, we can't guarantee our
dependencies will be free of @DB::args access attempts
in the future.
So guard against this Perl 5 quirmk by defensively bumping the
refcount of any object we call ->event_step on.
cf. https://rt.perl.org/Public/Bug/Display.html?id=131046
https://github.com/Perl/perl5/issues/15928
---
lib/PublicInbox/DS.pm | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 40994fd4..2d312f0a 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -271,6 +271,7 @@ sub EventLoop {
$Epoll //= _InitPoller();
local $in_loop = 1;
my @events;
+ my $obj; # guard stack-not-refcounted w/ Carp + @DB::args
do {
my $timeout = RunTimers();
@@ -281,7 +282,8 @@ sub EventLoop {
# that ones in the front triggered unregister-interest actions. if we
# can't find the %sock entry, it's because we're no longer interested
# in that event.
- $DescriptorMap{$fd}->event_step;
+ $obj = $DescriptorMap{$fd};
+ $obj->event_step;
}
} while (PostEventLoop());
_run_later();
^ permalink raw reply related [relevance 7%]
* [PATCH 00/21] lei2mail worker segfault finally fixed
@ 2021-02-01 8:28 5% Eric Wong
2021-02-01 8:28 7% ` [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5 Eric Wong
0 siblings, 1 reply; 5+ results
From: Eric Wong @ 2021-02-01 8:28 UTC (permalink / raw)
To: meta
This lei2mail segfault turned out to be an old Perl 5 quirk
which plagued many before me. It was not consistently
reproducible, and random changes seemed to make it happen more
or less frequently. There were several times when I thought I
fixed it (and maybe this is still one of them!) only to have it
pop up again.
Still, I found many other little bugs and improvements worth
doing along the way. Hope things go more smoothly in the
future...
Anyways, [PATCH 18/21] is the fix (and I'll followup with more
on how I found the fix). 19/21 is purely defensive
future-proofing.
Eric Wong (21):
lei: more consistent dedupe and ovv_buf init
ipc: switch wq to use the event loop
lei: remove per-child SIG{__WARN__}
lei: remove SIGPIPE handler
ipc: more helpful ETOOMANYREFS error messages
lei: remove syslog dependency
sharedkv: release {dbh} before rmtree
lei: keep $lei around until workers are reaped
lei_dedupe: use Digest::SHA
lei_xsearch: load PublicInbox::Smsg
lei: deep clone {ovv} for l2m workers
sharedkv: lock and explicitly disconnect {dbh}
lei: increase initial timeout
sharedkv: use lock_for_scope_fast
lei_to_mail: reduce spew on Maildir removal
sharedkv: do not set cache_size by default
import: reap git-config(1) synchronously
ds: guard against stack-not-refcounted quirk of Perl 5
ds: next_tick: avoid $_ in top-level loop iterator
lei: avoid ETOOMANYREFS, cleanup imports
doc: note optional BSD::Resource use
Documentation/public-inbox-config.pod | 2 +-
INSTALL | 6 ++
MANIFEST | 2 +
lib/PublicInbox/DS.pm | 12 ++--
lib/PublicInbox/IPC.pm | 43 +++++++-----
lib/PublicInbox/Import.pm | 1 +
lib/PublicInbox/LEI.pm | 95 +++++++++++++++------------
lib/PublicInbox/LeiDedupe.pm | 6 +-
lib/PublicInbox/LeiExternal.pm | 3 +-
lib/PublicInbox/LeiOverview.pm | 51 +++++++-------
lib/PublicInbox/LeiToMail.pm | 84 +++++++++++------------
lib/PublicInbox/LeiXSearch.pm | 36 +++++-----
lib/PublicInbox/Lock.pm | 17 +++++
lib/PublicInbox/SharedKV.pm | 33 +++++++---
lib/PublicInbox/WQWorker.pm | 34 ++++++++++
script/lei | 28 +++++---
t/lei_to_mail.t | 31 +++++----
xt/stress-sharedkv.t | 50 ++++++++++++++
18 files changed, 342 insertions(+), 192 deletions(-)
create mode 100644 lib/PublicInbox/WQWorker.pm
create mode 100644 xt/stress-sharedkv.t
^ permalink raw reply [relevance 5%]
Results 1-5 of 5 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2021-02-01 8:28 5% [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
2021-02-01 8:28 7% ` [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5 Eric Wong
2021-02-06 12:18 [PATCH 00/17] lei: more random updates Eric Wong
2021-02-06 12:18 4% ` [PATCH 11/17] treewide: replace confess with croak Eric Wong
2021-02-07 8:51 [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
2021-02-07 8:52 5% ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
2021-02-07 8:52 5% ` [PATCH 19/19] httpd/async: " Eric Wong
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).