ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: "ioquatix (Samuel Williams)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:108781] [Ruby master Bug#18818] SEGV (Fiber scheduler?)
Date: Mon, 06 Jun 2022 05:17:32 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-97846.20220606051732.33540@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-18818.20220605213916.33540@ruby-lang.org

Issue #18818 has been updated by ioquatix (Samuel Williams).


Unfortunately Mutex, Queue and probably other objects don't take a reference to the waiting fiber, so your fiber is being garbage collected after it's in the wait list. Then some other object is allocated and unblock is called with it (or not = SEGV).

You need to keep track of blocked fibers:

```ruby
def block...
  @blocked[fiber] = true
...

def unblock
  @blocked.delete(fiber)
...
```

This prevents the fiber from going out of scope.

----------------------------------------
Bug #18818: SEGV (Fiber scheduler?)
https://bugs.ruby-lang.org/issues/18818#change-97846

* Author: nevans (Nicholas Evans)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* ruby -v: 3.1.2, 3.0.4, master
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
The attached script (and/or others like it) can cause SEGV in 3.0, 3.1, and master.  It has always behaved as expected when I use `optflags=-O0`.

When I use it with `make run` on `master`:
```
./miniruby -I../lib -I. -I.ext/common  -r./x86_64-linux-fake  ../test.rb 
========================================================================
fiber_queue
completed in 0.00031349004711955786
========================================================================
fiber_sized_queue
../test.rb:62: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.2.0dev (2022-06-05T06:18:26Z master 5ce0be022f) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0023 e:000022 CFUNC  :%
c:0004 p:0031 s:0018 e:000015 METHOD ../test.rb:62 [FINISH]
c:0003 p:---- s:0010 e:000009 CFUNC  :pop
c:0002 p:0009 s:0006 e:000005 BLOCK  ../test.rb:154 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
../test.rb:154:in `block (2 levels) in <main>'
../test.rb:154:in `pop'
../test.rb:62:in `unblock'
../test.rb:62:in `%'

-- Machine register context ------------------------------------------------
 RIP: 0x000055eae9ffa417 RBP: 0x00007f80aba855d8 RSP: 0x00007f80a9789598
 RAX: 0x000000000000009b RBX: 0x00007f80a9789628 RCX: 0x00007f80ab9c37a0
 RDX: 0x00007f80a97895c0 RDI: 0x0000000000000000 RSI: 0x000000000000009b
  R8: 0x0000000000000000  R9: 0x00007f80a97895c0 R10: 0x0000000055550083
 R11: 0x00007f80ac32ace0 R12: 0x00007f80aba855d8 R13: 0x00007f80ab9c3780
 R14: 0x00007f80a97895c0 R15: 0x000000000000009b EFL: 0x0000000000010202

-- C level backtrace information -------------------------------------------
./miniruby(rb_vm_bugreport+0x5cf) [0x55eaea06b0ef]
./miniruby(rb_bug_for_fatal_signal+0xec) [0x55eae9e4fc2c]
./miniruby(sigsegv+0x4d) [0x55eae9fba30d]
[0x7f80ac153520]
./miniruby(rb_id_table_lookup+0x7) [0x55eae9ffa417]
./miniruby(callable_method_entry+0x103) [0x55eaea046bd3]
./miniruby(vm_respond_to+0x3f) [0x55eaea056c1f]
./miniruby(rb_check_funcall_default_kw+0x19c) [0x55eaea05788c]
./miniruby(rb_check_convert_type_with_id+0x8e) [0x55eae9f1b85e]
./miniruby(rb_str_format_m+0x1a) [0x55eae9fce82a]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_funcallv_scope+0x1b0) [0x55eaea05a770]
./miniruby(rb_fiber_scheduler_unblock+0x3e) [0x55eae9fb979e]
./miniruby(sync_wakeup+0x10d) [0x55eae9ffd45d]
./miniruby(rb_szqueue_pop+0xf5) [0x55eae9ffefd5]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_vm_invoke_proc+0x5f) [0x55eaea05584f]
./miniruby(rb_fiber_start+0x1da) [0x55eae9e1e24a]
./miniruby(fiber_entry+0x0) [0x55eae9e1e550]

```

I've attached the rest of the VM dump.  `make runruby` gives a nearly identical dump.  I can post a core dump or `rr` recording, if needed.
_
I'm sorry I didn't simplify the script more; small, seemingly irrelevant changes can change the failure or allow it to pass.  Sometimes it raises a bizarre exception instead of SEGV, most commonly a NoMethodError which seemingly indicates that the local vars have been shifted or scrambled.  For example, this particular SEGV was caused by a guard clause checking that `unblock(blocker, fiber)` was given a Fiber object.  Here, that object is invalid, but I've seen it be a string or some other object from elsewhere in the process.

For comparison, this is what the script output should look like:
```
========================================================================
fiber_queue
completed in 0.00031569297425448895
========================================================================
fiber_sized_queue
completed in 0.1176840600091964
========================================================================
fiber_sized_queue2
completed in 0.19209402799606323
========================================================================
fiber_sized_queue3
completed in 0.21404067997355014
========================================================================
fiber_sized_queue4
completed in 0.30277197097893804
```

I was attempting to create some simple benchmarks for `Queue` and `SizedQueue` with fibers, to mimic `benchmark/vm_thread_*queue*.rb`.  I never completed the benchmarks because of this SEGV.  :)

---Files--------------------------------
test.rb (5.6 KB)
segv-master-5ce0be022f.txt (11.8 KB)


-- 
https://bugs.ruby-lang.org/

  parent reply	other threads:[~2022-06-06  5:17 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-05 21:39 [ruby-core:108774] [Ruby master Bug#18818] SEGV (Fiber scheduler?) nevans (Nicholas Evans)
2022-06-06  0:47 ` [ruby-core:108776] " mame (Yusuke Endoh)
2022-06-06  5:17 ` ioquatix (Samuel Williams) [this message]
2022-06-06  5:18 ` [ruby-core:108782] " ioquatix (Samuel Williams)
2022-06-06 16:40 ` [ruby-core:108784] " nevans (Nicholas Evans)
2022-06-06 18:31 ` [ruby-core:108788] " nevans (Nicholas Evans)
2022-07-09 22:44 ` [ruby-core:109174] " nevans (Nicholas Evans)
2022-09-20 22:51 ` [ruby-core:109968] " ioquatix (Samuel Williams)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-97846.20220606051732.33540@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).