ruby-core@ruby-lang.org archive (unofficial mirror)
From: "ioquatix (Samuel Williams)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:108782] [Ruby master Bug#18818] SEGV (Fiber scheduler?)
Date: Mon, 06 Jun 2022 05:18:57 +0000 (UTC)
Message-ID: <redmine.journal-97847.20220606051857.33540@ruby-lang.org>
In-Reply-To: redmine.issue-18818.20220605213916.33540@ruby-lang.org

Issue #18818 has been updated by ioquatix (Samuel Williams).


@ko1 I think this problem happens because the fiber is not retained while it is waiting: IIRC, at the VM level we track waiting threads but not waiting fibers. We probably need to make mutex/queue mark the wait list correctly. Would that cause a performance issue?

----------------------------------------
Bug #18818: SEGV (Fiber scheduler?)
https://bugs.ruby-lang.org/issues/18818#change-97847

* Author: nevans (Nicholas Evans)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* ruby -v: 3.1.2, 3.0.4, master
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
The attached script (and/or others like it) can cause SEGV in 3.0, 3.1, and master.  It has always behaved as expected when I use `optflags=-O0`.

When I use it with `make run` on `master`:
```
./miniruby -I../lib -I. -I.ext/common  -r./x86_64-linux-fake  ../test.rb 
========================================================================
fiber_queue
completed in 0.00031349004711955786
========================================================================
fiber_sized_queue
../test.rb:62: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.2.0dev (2022-06-05T06:18:26Z master 5ce0be022f) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0023 e:000022 CFUNC  :%
c:0004 p:0031 s:0018 e:000015 METHOD ../test.rb:62 [FINISH]
c:0003 p:---- s:0010 e:000009 CFUNC  :pop
c:0002 p:0009 s:0006 e:000005 BLOCK  ../test.rb:154 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
../test.rb:154:in `block (2 levels) in <main>'
../test.rb:154:in `pop'
../test.rb:62:in `unblock'
../test.rb:62:in `%'

-- Machine register context ------------------------------------------------
 RIP: 0x000055eae9ffa417 RBP: 0x00007f80aba855d8 RSP: 0x00007f80a9789598
 RAX: 0x000000000000009b RBX: 0x00007f80a9789628 RCX: 0x00007f80ab9c37a0
 RDX: 0x00007f80a97895c0 RDI: 0x0000000000000000 RSI: 0x000000000000009b
  R8: 0x0000000000000000  R9: 0x00007f80a97895c0 R10: 0x0000000055550083
 R11: 0x00007f80ac32ace0 R12: 0x00007f80aba855d8 R13: 0x00007f80ab9c3780
 R14: 0x00007f80a97895c0 R15: 0x000000000000009b EFL: 0x0000000000010202

-- C level backtrace information -------------------------------------------
./miniruby(rb_vm_bugreport+0x5cf) [0x55eaea06b0ef]
./miniruby(rb_bug_for_fatal_signal+0xec) [0x55eae9e4fc2c]
./miniruby(sigsegv+0x4d) [0x55eae9fba30d]
[0x7f80ac153520]
./miniruby(rb_id_table_lookup+0x7) [0x55eae9ffa417]
./miniruby(callable_method_entry+0x103) [0x55eaea046bd3]
./miniruby(vm_respond_to+0x3f) [0x55eaea056c1f]
./miniruby(rb_check_funcall_default_kw+0x19c) [0x55eaea05788c]
./miniruby(rb_check_convert_type_with_id+0x8e) [0x55eae9f1b85e]
./miniruby(rb_str_format_m+0x1a) [0x55eae9fce82a]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_funcallv_scope+0x1b0) [0x55eaea05a770]
./miniruby(rb_fiber_scheduler_unblock+0x3e) [0x55eae9fb979e]
./miniruby(sync_wakeup+0x10d) [0x55eae9ffd45d]
./miniruby(rb_szqueue_pop+0xf5) [0x55eae9ffefd5]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_vm_invoke_proc+0x5f) [0x55eaea05584f]
./miniruby(rb_fiber_start+0x1da) [0x55eae9e1e24a]
./miniruby(fiber_entry+0x0) [0x55eae9e1e550]

```

I've attached the rest of the VM dump.  `make runruby` gives a nearly identical dump.  I can post a core dump or `rr` recording, if needed.

I'm sorry I didn't simplify the script more; small, seemingly irrelevant changes can alter the failure or allow the script to pass.  Sometimes it raises a bizarre exception instead of SEGV, most commonly a NoMethodError, which seems to indicate that the local vars have been shifted or scrambled.  For example, this particular SEGV was caused by a guard clause checking that `unblock(blocker, fiber)` was given a Fiber object.  Here, that object is invalid, but I've also seen it be a string or some other object from elsewhere in the process.
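
For readers without the attachment, a guard clause of the kind described might look roughly like the sketch below. This is not the code from `test.rb`: the class name `GuardedScheduler`, the `@ready` list, and the message format are all hypothetical, and a real scheduler would also need `block`, `kernel_sleep`, `io_wait`, `close`, and thread-safe wakeups, which are omitted here.

```ruby
# Hypothetical scheduler fragment (not taken from test.rb); only the
# `unblock` hook is shown.
class GuardedScheduler
  attr_reader :ready

  def initialize
    @ready = [] # fibers to resume on the next pass of the event loop
  end

  # Scheduler hook: `fiber` was parked earlier by `block(blocker, timeout)`
  # and may now run again.
  def unblock(blocker, fiber)
    # Guard clause: refuse anything that is not a Fiber. The message is
    # built with String#%, the same kind of call that appears at the top
    # of the backtrace in the report.
    unless fiber.is_a?(Fiber)
      raise TypeError, "unblock(%p) expected a Fiber, got %p" % [blocker, fiber]
    end
    @ready << fiber
  end
end

scheduler = GuardedScheduler.new
scheduler.unblock(:queue, Fiber.new { }) # accepted and queued

begin
  scheduler.unblock(:queue, "not a fiber") # rejected by the guard
rescue TypeError => e
  puts e.message
end
```

With a valid argument the guard is a no-op. When the VM hands over a stale or recycled reference, a check like this is where the corruption first becomes visible, either as a misleading exception or, as in the dump above, as a crash while formatting the message.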

For comparison, this is what the script output should look like:
```
========================================================================
fiber_queue
completed in 0.00031569297425448895
========================================================================
fiber_sized_queue
completed in 0.1176840600091964
========================================================================
fiber_sized_queue2
completed in 0.19209402799606323
========================================================================
fiber_sized_queue3
completed in 0.21404067997355014
========================================================================
fiber_sized_queue4
completed in 0.30277197097893804
```

I was attempting to create some simple benchmarks for `Queue` and `SizedQueue` with fibers, to mimic `benchmark/vm_thread_*queue*.rb`.  I never completed the benchmarks because of this SEGV.  :)
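
As a rough illustration of the benchmark shape, here is a minimal sketch that prints output in the same format as the listings above. It deliberately swaps in threads for fibers, so it needs no fiber scheduler and does not exercise the code path that crashes. The helper names `report` and `sized_queue_round_trip`, the item count, the queue capacity, and the 72-character separator width are all assumptions, not details from `test.rb`.

```ruby
# Timing harness: prints a separator, the benchmark name, and the elapsed
# wall-clock time, then returns the block's result.
def report(name)
  puts "=" * 72 # separator width is an assumption
  puts name
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts "completed in #{elapsed}"
  result
end

# Thread-based stand-in for the fiber benchmarks: one producer pushes
# `count` integers through a SizedQueue of capacity `size`, and one
# consumer pops them until it sees the nil end-of-stream marker.
def sized_queue_round_trip(count: 10_000, size: 100)
  queue = SizedQueue.new(size)

  consumer = Thread.new do
    received = 0
    received += 1 while queue.pop # integers are truthy; nil ends the loop
    received
  end

  producer = Thread.new do
    count.times { |i| queue.push(i) }
    queue.push(nil)
  end

  producer.join
  consumer.value # joins the consumer and returns its item count
end

received = report("thread_sized_queue") { sized_queue_round_trip }
puts "received #{received} items"
```

A fiber-based variant would run the producer and consumer inside `Fiber.schedule` blocks under a scheduler installed with `Fiber.set_scheduler`; that is the configuration in which the SEGV was reported.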

---Files--------------------------------
test.rb (5.6 KB)
segv-master-5ce0be022f.txt (11.8 KB)


-- 
https://bugs.ruby-lang.org/
