user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* [PATCH 19/19] httpd/async: avoid unnecessary on-stack delete
    2021-02-07  8:52  5% ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
@ 2021-02-07  8:52  5% ` Eric Wong
  1 sibling, 0 replies; 5+ results
From: Eric Wong @ 2021-02-07  8:52 UTC (permalink / raw)
  To: meta

While this doesn't fix a known problem, this was a risky
construct in case somebody uses confess/longmess inside
the user-supplied callback.

cf. commit 0795b0906cc81f40
    ("ds: guard against stack-not-refcounted quirk of Perl 5")
---
 lib/PublicInbox/HTTPD/Async.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/HTTPD/Async.pm b/lib/PublicInbox/HTTPD/Async.pm
index 1de9501d..7238650a 100644
--- a/lib/PublicInbox/HTTPD/Async.pm
+++ b/lib/PublicInbox/HTTPD/Async.pm
@@ -46,7 +46,7 @@ sub event_step {
 	my ($self) = @_;
 	if (my $cb = delete $self->{cb}) {
 		# this may call async_pass when headers are done
-		$cb->(delete $self->{arg});
+		$cb->(my $refcnt_guard = delete $self->{arg});
 	} elsif (my $sock = $self->{sock}) {
 		my $http = $self->{http};
 		# $self->{sock} is a read pipe for git-http-backend or cgit

^ permalink raw reply related	[relevance 5%]

* [PATCH 18/19] imap: avoid unnecessary on-stack delete
  @ 2021-02-07  8:52  5% ` Eric Wong
  2021-02-07  8:52  5% ` [PATCH 19/19] httpd/async: " Eric Wong
  1 sibling, 0 replies; 5+ results
From: Eric Wong @ 2021-02-07  8:52 UTC (permalink / raw)
  To: meta

None of the Content-Type attributes are long-lived
(and unlikely to be memory intensive).  While these
callsites won't trigger $DB::args segfaults via
confess or longmess, it'll make future code audits
easier.

cf. commit 0795b0906cc81f40
    ("ds: guard against stack-not-refcounted quirk of Perl 5")
---
 lib/PublicInbox/HTTPD/Async.pm | 2 +-
---
 lib/PublicInbox/IMAP.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index 226e98a2..af8ce72b 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -499,7 +499,7 @@ sub body_disposition ($) {
 	my $cd = $eml->header_raw('Content-Disposition') or return 'NIL';
 	$cd = parse_content_disposition($cd);
 	my $buf = '('._esc($cd->{type});
-	$buf .= ' ' . _esc_hash(delete $cd->{attributes});
+	$buf .= ' ' . _esc_hash($cd->{attributes});
 	$buf .= ')';
 }
 
@@ -511,7 +511,7 @@ sub body_leaf ($$;$) {
 	my $ct = $eml->ct;
 	$buf .= '('._esc($ct->{type}).' ';
 	$buf .= _esc($ct->{subtype});
-	$buf .= ' ' . _esc_hash(delete $ct->{attributes});
+	$buf .= ' ' . _esc_hash($ct->{attributes});
 	$buf .= ' ' . _esc($eml->header_raw('Content-ID'));
 	$buf .= ' ' . _esc($eml->header_raw('Content-Description'));
 	my $cte = $eml->header_raw('Content-Transfer-Encoding') // '7bit';
@@ -540,7 +540,7 @@ sub body_parent ($$$) {
 		$buf .= @$hold ? join('', @$hold) : 'NIL';
 		$buf .= ' '._esc($ct->{subtype});
 		if ($structure) {
-			$buf .= ' '._esc_hash(delete $ct->{attributes});
+			$buf .= ' '._esc_hash($ct->{attributes});
 			$buf .= ' '.body_disposition($eml);
 			$buf .= ' '._esc($eml->header_raw('Content-Language'));
 			$buf .= ' '._esc($eml->header_raw('Content-Location'));

^ permalink raw reply related	[relevance 5%]

* [PATCH 11/17] treewide: replace confess with croak
  @ 2021-02-06 12:18  4% ` Eric Wong
  0 siblings, 0 replies; 5+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

The PublicInbox::Eml (and previously Email::MIME) use of confess
was the primary (or only) culprit behind the lei2mail segfaults
fixed by commit 0795b0906cc81f40.
("ds: guard against stack-not-refcounted quirk of Perl 5").

We never care about a backtrace when dealing with Eml objects
anyways, so it was just a worthless waste of CPU cycles.

We can also drop confess in a few other places.  Since we only
use Perl and Inline::C, users will never be without source
and can replace s/croak/Carp::confess/ on a per-callsite basis
to help report problems.

It's also possible to use PERL5OPT=-MCarp=verbose in the
environment though still potentially risky.

Link: https://public-inbox.org/meta/20210201082833.3293-1-e@80x24.org/
---
 lib/PublicInbox/DS.pm      | 10 +++++-----
 lib/PublicInbox/Eml.pm     |  4 ++--
 lib/PublicInbox/IPC.pm     |  2 +-
 lib/PublicInbox/OverIdx.pm |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 263c3458..ec965abe 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -32,7 +32,7 @@ use Scalar::Util qw(blessed);
 use PublicInbox::Syscall qw(:epoll);
 use PublicInbox::Tmpfile;
 use Errno qw(EAGAIN EINVAL);
-use Carp qw(confess carp);
+use Carp qw(carp);
 our @EXPORT_OK = qw(now msg_more dwaitpid);
 
 my $nextq; # queue for next_tick
@@ -335,9 +335,9 @@ retry:
             $ev &= ~EPOLLEXCLUSIVE;
             goto retry;
         }
-        die "couldn't add epoll watch for $fd: $!\n";
+        die "EPOLL_CTL_ADD $self/$sock/$fd: $!";
     }
-    confess("DescriptorMap{$fd} defined ($DescriptorMap{$fd})")
+    croak("FD:$fd in use by $DescriptorMap{$fd} (for $self/$sock)")
         if defined($DescriptorMap{$fd});
 
     $DescriptorMap{$fd} = $self;
@@ -368,7 +368,7 @@ sub close {
     # notifications about it
     my $fd = fileno($sock);
     epoll_ctl($Epoll, EPOLL_CTL_DEL, $fd, 0) and
-        confess("EPOLL_CTL_DEL: $!");
+        croak("EPOLL_CTL_DEL($self/$sock): $!");
 
     # we explicitly don't delete from DescriptorMap here until we
     # actually close the socket, as we might be in the middle of
@@ -587,7 +587,7 @@ sub msg_more ($$) {
 sub epwait ($$) {
     my ($sock, $ev) = @_;
     epoll_ctl($Epoll, EPOLL_CTL_MOD, fileno($sock), $ev) and
-        confess("EPOLL_CTL_MOD $!");
+        croak("EPOLL_CTL_MOD($sock): $!");
 }
 
 # return true if complete, false if incomplete (or failure)
diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index f7f62e7b..81a6632b 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -332,7 +332,7 @@ sub body_set {
 sub body_str_set {
 	my ($self, $body_str) = @_;
 	my $charset = ct($self)->{attributes}->{charset} or
-		Carp::confess('body_str was given, but no charset is defined');
+		croak('body_str was given, but no charset is defined');
 	body_set($self, \(encode($charset, $body_str, Encode::FB_CROAK)));
 }
 
@@ -454,7 +454,7 @@ sub body_str {
 		if ($STR_TYPE{$ct->{type}} && $STR_SUBTYPE{$ct->{subtype}}) {
 			return body($self);
 		}
-		Carp::confess("can't get body as a string for ",
+		croak("can't get body as a string for ",
 			join("\n\t", header_raw($self, 'Content-Type')));
 	}
 	decode($charset, body($self), Encode::FB_CROAK);
diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index a0e6bfee..0dee2a92 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -11,7 +11,7 @@ package PublicInbox::IPC;
 use strict;
 use v5.10.1;
 use parent qw(Exporter);
-use Carp qw(confess croak);
+use Carp qw(croak);
 use PublicInbox::DS qw(dwaitpid);
 use PublicInbox::Spawn;
 use PublicInbox::OnDestroy;
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 985c5473..9013ae23 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -456,7 +456,7 @@ sub dbh_close {
 sub create {
 	my ($self) = @_;
 	my $fn = $self->{filename} // do {
-		Carp::confess('BUG: no {filename}') unless $self->{dbh};
+		croak('BUG: no {filename}') unless $self->{dbh};
 		return;
 	};
 	unless (-r $fn) {

^ permalink raw reply related	[relevance 4%]

* [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5
  2021-02-01  8:28  5% [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
@ 2021-02-01  8:28  7% ` Eric Wong
  0 siblings, 0 replies; 5+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

The Perl 5 stack is weakly-referenced for performance reasons.
This means it's possible for items in the stack to be freed
while executing further down the stack.

In lei (and perhaps public-facing read-only daemons in the
future), we'll fork and call PublicInbox::DS->Reset in the child
process.  This causes %DescriptorMap to be clobbered, allowing
the $DescriptorMap{$fd} arg to be freed inside the child
process.

When Carp::confess or Carp::longmess is called to generate a
backtrace, it may access the @DB::args array.  This array access
is not protected by reference counting and is known to cause
segfaults and other weird errors.

While the caller of an unnecessary Carp::confess may be
eliminated in a future commit, we can't guarantee our
dependencies will be free of @DB::args access attempts
in the future.

So guard against this Perl 5 quirmk by defensively bumping the
refcount of any object we call ->event_step on.

cf. https://rt.perl.org/Public/Bug/Display.html?id=131046
    https://github.com/Perl/perl5/issues/15928
---
 lib/PublicInbox/DS.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 40994fd4..2d312f0a 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -271,6 +271,7 @@ sub EventLoop {
     $Epoll //= _InitPoller();
     local $in_loop = 1;
     my @events;
+    my $obj; # guard stack-not-refcounted w/ Carp + @DB::args
     do {
         my $timeout = RunTimers();
 
@@ -281,7 +282,8 @@ sub EventLoop {
             # that ones in the front triggered unregister-interest actions.  if we
             # can't find the %sock entry, it's because we're no longer interested
             # in that event.
-            $DescriptorMap{$fd}->event_step;
+            $obj = $DescriptorMap{$fd};
+            $obj->event_step;
         }
     } while (PostEventLoop());
     _run_later();

^ permalink raw reply related	[relevance 7%]

* [PATCH 00/21] lei2mail worker segfault finally fixed
@ 2021-02-01  8:28  5% Eric Wong
  2021-02-01  8:28  7% ` [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5 Eric Wong
  0 siblings, 1 reply; 5+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

This lei2mail segfault turned out to be an old Perl 5 quirk
which plagued many before me.  It was not consistently
reproducible, and random changes seemed to make it happen more
or less frequently.  There were several times when I thought I
fixed it (and maybe this is still one of them!) only to have it
pop up again.

Still, I found many other little bugs and improvements worth
doing along the way.  Hope things go more smoothly in the
future...

Anyways, [PATCH 18/21] is the fix (and I'll followup with more
on how I found the fix).  19/21 is purely defensive
future-proofing.

Eric Wong (21):
  lei: more consistent dedupe and ovv_buf init
  ipc: switch wq to use the event loop
  lei: remove per-child SIG{__WARN__}
  lei: remove SIGPIPE handler
  ipc: more helpful ETOOMANYREFS error messages
  lei: remove syslog dependency
  sharedkv: release {dbh} before rmtree
  lei: keep $lei around until workers are reaped
  lei_dedupe: use Digest::SHA
  lei_xsearch: load PublicInbox::Smsg
  lei: deep clone {ovv} for l2m workers
  sharedkv: lock and explicitly disconnect {dbh}
  lei: increase initial timeout
  sharedkv: use lock_for_scope_fast
  lei_to_mail: reduce spew on Maildir removal
  sharedkv: do not set cache_size by default
  import: reap git-config(1) synchronously
  ds: guard against stack-not-refcounted quirk of Perl 5
  ds: next_tick: avoid $_ in top-level loop iterator
  lei: avoid ETOOMANYREFS, cleanup imports
  doc: note optional BSD::Resource use

 Documentation/public-inbox-config.pod |  2 +-
 INSTALL                               |  6 ++
 MANIFEST                              |  2 +
 lib/PublicInbox/DS.pm                 | 12 ++--
 lib/PublicInbox/IPC.pm                | 43 +++++++-----
 lib/PublicInbox/Import.pm             |  1 +
 lib/PublicInbox/LEI.pm                | 95 +++++++++++++++------------
 lib/PublicInbox/LeiDedupe.pm          |  6 +-
 lib/PublicInbox/LeiExternal.pm        |  3 +-
 lib/PublicInbox/LeiOverview.pm        | 51 +++++++-------
 lib/PublicInbox/LeiToMail.pm          | 84 +++++++++++------------
 lib/PublicInbox/LeiXSearch.pm         | 36 +++++-----
 lib/PublicInbox/Lock.pm               | 17 +++++
 lib/PublicInbox/SharedKV.pm           | 33 +++++++---
 lib/PublicInbox/WQWorker.pm           | 34 ++++++++++
 script/lei                            | 28 +++++---
 t/lei_to_mail.t                       | 31 +++++----
 xt/stress-sharedkv.t                  | 50 ++++++++++++++
 18 files changed, 342 insertions(+), 192 deletions(-)
 create mode 100644 lib/PublicInbox/WQWorker.pm
 create mode 100644 xt/stress-sharedkv.t

^ permalink raw reply	[relevance 5%]

Results 1-5 of 5 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2021-02-01  8:28  5% [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
2021-02-01  8:28  7% ` [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5 Eric Wong
2021-02-06 12:18     [PATCH 00/17] lei: more random updates Eric Wong
2021-02-06 12:18  4% ` [PATCH 11/17] treewide: replace confess with croak Eric Wong
2021-02-07  8:51     [PATCH 00/19] lei import Maildir, remote mboxrd fixes Eric Wong
2021-02-07  8:52  5% ` [PATCH 18/19] imap: avoid unnecessary on-stack delete Eric Wong
2021-02-07  8:52  5% ` [PATCH 19/19] httpd/async: " Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).