user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: Perl debug patches used to track down source of segfault
Date: Sun, 31 Jan 2021 23:07:16 -1000	[thread overview]
Message-ID: <20210201090716.GA3388@dcvr> (raw)
In-Reply-To: <20210201082833.3293-19-e@80x24.org>

[-- Attachment #1: Type: text/plain, Size: 2968 bytes --]

Attached are two patches against the Debian-packaged perl
5.28.1-6+deb10u1 which I used for tracking down the attempt to
access @DB::args of PublicInbox::Listener::event_step as the
source of the segfault.

I don't know Perl internals very well, and I was never an
advanced gdb user when I hacked C.

While most segfaults I saw did not emit any errors, maybe 1-2 of
them pointed out something from Carp (the error reporting and
backtrace module of Perl); so I wasn't sure if there were
multiple sources of segfaults.

I initially thought it was something warning during the
destruction phase because most of the code isn't doing anything
out-of-the-ordinary aside from short-lived workers.

I then ran lei with debugperl(1) (deb: perl-debug) instead of
the normal "perl" on my system.  That got me to the assertion
failure in sv.c::Perl_sv_clear at

	assert(SvTYPE(sv) != (svtype)SVTYPEMASK);

So I made the attached patch to sv.c to call Perl_op_dump (which
I learned about from the perlhacktips(1) manpage).

This patch to sv.c got me a dump which put me around lines 340-350
of Carp.pm.  Poking into Carp.pm brought me to this comment
around lines 330:

        # Guard our serialization of the stack from stack refcounting bugs
        # NOTE this is NOT a complete solution, we cannot 100% guard against
        # these bugs.  However in many cases Perl *is* capable of detecting
        # them and throws an error when it does.  Unfortunately serializing
        # the arguments on the stack is a perfect way of finding these bugs,
        # even when they would not affect normal program flow that did not
        # poke around inside the stack.  Inside of Carp.pm it makes little
        # sense reporting these bugs, as Carp's job is to report the callers
        # errors, not the ones it might happen to tickle while doing so.
        # See: https://rt.perl.org/Public/Bug/Display.html?id=131046
        # and: https://rt.perl.org/Public/Bug/Display.html?id=52610
        # for more details and discussion. - Yves

The conundrum is Carp itself is generating backtraces, so
getting a Perl-level backtrace becomes tricky...

I decided to edit Carp.pm to store the caller-supplied error
message in a global variable, and set the G_ERR environment
variable before where it would occasionally segfault.

Once the environment variable is set, printing fields of the
global **environ array from gdb on the core dump where G_ERR is
stored would get me the $sub_name and offending error message.

It turned out the error was the confess call in
PublicInbox::Eml::body_str which was copied over from
Email::MIME, and $sub_name was PublicInbox::Listener::event_step.

That led me to realize $DescriptorMap{$fd} was no longer
referenced clobbered and to this workaround.

I didn't want to spend time compiling, relinking and install
Perl again (and didn't want to do it the first time), but I
could've printed the G_ERR environment where I was doing
Perl_op_dump, too.

[-- Attachment #2: 0001-sv.c-run-Perl_op_dump-before-failing-assertion.patch --]
[-- Type: text/x-diff, Size: 870 bytes --]

From 03712da7eb875b18879662079279241cd4fee2d0 Mon Sep 17 00:00:00 2001
From: Eric Wong <e@80x24.org>
Date: Mon, 1 Feb 2021 05:42:42 +0000
Subject: [PATCH 1/2] sv.c: run Perl_op_dump before failing assertion

The debugperl(1) binary packaged by Debian alerted me
to an assertion in Perl_sv_clear failing at:

	assert(SvTYPE(sv) != (svtype)SVTYPEMASK);

To get more information about what Perl code is executing,
we can run Perl_op_dump on PL_op before triggering the
assertion.
---
 sv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sv.c b/sv.c
index 07865bb..2c950e0 100644
--- a/sv.c
+++ b/sv.c
@@ -6510,6 +6510,9 @@ Perl_sv_clear(pTHX_ SV *const orig_sv)
 	type = SvTYPE(sv);
 
 	assert(SvREFCNT(sv) == 0);
+	if ((SvTYPE(sv) == (svtype)SVTYPEMASK)) {
+	    Perl_op_dump(aTHX_ PL_op);
+	}
 	assert(SvTYPE(sv) != (svtype)SVTYPEMASK);
 
 	if (type <= SVt_IV) {

[-- Attachment #3: 0002-carp-set-G_ERR-in-environ-before-accessing-DB-args.patch --]
[-- Type: text/x-diff, Size: 1735 bytes --]

From ef7a02def4468ac7301c120ece586caea7351c4d Mon Sep 17 00:00:00 2001
From: Eric Wong <e@80x24.org>
Date: Mon, 1 Feb 2021 05:34:12 +0000
Subject: [PATCH 2/2] carp: set G_ERR in environ before accessing @DB::args

In case accessing @DB::args causes a segfault and core dump;
one can open the core dump with gdb(1) run "p environ[$IDX]"
to figure out what message and sub_name is tickling Perl5's
stack-not-refcounted behavior and causing a segfault.

($IDX is typically the last environment variable index in the
 environ(7) array, but it could be another number)
---
 dist/Carp/lib/Carp.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/dist/Carp/lib/Carp.pm b/dist/Carp/lib/Carp.pm
index 109b7fe..c87e373 100644
--- a/dist/Carp/lib/Carp.pm
+++ b/dist/Carp/lib/Carp.pm
@@ -302,6 +302,8 @@ BEGIN {
     }
 }
 
+our $g_err;
+
 sub caller_info {
     my $i = shift(@_) + 1;
     my %call_info;
@@ -327,6 +329,7 @@ sub caller_info {
 
     my $sub_name = Carp::get_subname( \%call_info );
     if ( $call_info{has_args} ) {
+$ENV{G_ERR} = "$sub_name $g_err";
         # Guard our serialization of the stack from stack refcounting bugs
         # NOTE this is NOT a complete solution, we cannot 100% guard against
         # these bugs.  However in many cases Perl *is* capable of detecting
@@ -594,6 +597,7 @@ sub ret_backtrace {
         $tid_msg = " thread $tid" if $tid;
     }
 
+$g_err = $err;
     my %i = caller_info($i);
     $mess = "$err at $i{file} line $i{line}$tid_msg";
     if( $. ) {
@@ -635,6 +639,7 @@ sub ret_summary {
         my $tid = threads->tid;
         $tid_msg = " thread $tid" if $tid;
     }
+$g_err = $err;
 
     my %i = caller_info($i);
     return "$err at $i{file} line $i{line}$tid_msg\.\n";

  reply	other threads:[~2021-02-01  9:07 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-01  8:28 [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
2021-02-01  8:28 ` [PATCH 01/21] lei: more consistent dedupe and ovv_buf init Eric Wong
2021-02-01  8:28 ` [PATCH 02/21] ipc: switch wq to use the event loop Eric Wong
2021-02-01  8:28 ` [PATCH 03/21] lei: remove per-child SIG{__WARN__} Eric Wong
2021-02-01  8:28 ` [PATCH 04/21] lei: remove SIGPIPE handler Eric Wong
2021-02-01  8:28 ` [PATCH 05/21] ipc: more helpful ETOOMANYREFS error messages Eric Wong
2021-02-01  8:28 ` [PATCH 06/21] lei: remove syslog dependency Eric Wong
2021-02-01  8:28 ` [PATCH 07/21] sharedkv: release {dbh} before rmtree Eric Wong
2021-02-01  8:28 ` [PATCH 08/21] lei: keep $lei around until workers are reaped Eric Wong
2021-02-01  8:28 ` [PATCH 09/21] lei_dedupe: use Digest::SHA Eric Wong
2021-02-01  8:28 ` [PATCH 10/21] lei_xsearch: load PublicInbox::Smsg Eric Wong
2021-02-01  8:28 ` [PATCH 11/21] lei: deep clone {ovv} for l2m workers Eric Wong
2021-02-01  8:28 ` [PATCH 12/21] sharedkv: lock and explicitly disconnect {dbh} Eric Wong
2021-02-01  8:28 ` [PATCH 13/21] lei: increase initial timeout Eric Wong
2021-02-01  8:28 ` [PATCH 14/21] sharedkv: use lock_for_scope_fast Eric Wong
2021-02-01  8:28 ` [PATCH 15/21] lei_to_mail: reduce spew on Maildir removal Eric Wong
2021-02-01  8:28 ` [PATCH 16/21] sharedkv: do not set cache_size by default Eric Wong
2021-02-01  8:28 ` [PATCH 17/21] import: reap git-config(1) synchronously Eric Wong
2021-02-01  8:28 ` [PATCH 18/21] ds: guard against stack-not-refcounted quirk of Perl 5 Eric Wong
2021-02-01  9:07   ` Eric Wong [this message]
2021-02-01  8:28 ` [PATCH 19/21] ds: next_tick: avoid $_ in top-level loop iterator Eric Wong
2021-02-01  8:28 ` [PATCH 20/21] lei: avoid ETOOMANYREFS, cleanup imports Eric Wong
2021-02-01  8:28 ` [PATCH 21/21] doc: note optional BSD::Resource use Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210201090716.GA3388@dcvr \
    --to=e@80x24.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).