ruby-core@ruby-lang.org archive (unofficial mirror)
From: aselder@mac.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:90185] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
Date: Fri, 30 Nov 2018 05:18:13 +0000 (UTC)
Message-ID: <redmine.journal-75306.20181130051812.ef47046a60b63b60@ruby-lang.org>
In-Reply-To: redmine.issue-14561.20180228233240@ruby-lang.org

Issue #14561 has been updated by aselder (Andrew Selder).


Would it be possible to get this addressed? It is blocking my entire organization from upgrading past Ruby 2.4.

It's reproducible on all 2.5 releases as well as the 2.6 preview releases.

It's been reported multiple times:
https://bugs.ruby-lang.org/issues/14334
https://bugs.ruby-lang.org/issues/14561
https://bugs.ruby-lang.org/issues/14714
https://bugs.ruby-lang.org/issues/15308

Our friend wanabe has even found the commit that introduced the error. I'd love to help out and try and solve this, but I'm afraid I'd just screw things up in the GC. Please let me know if there is anything I can do to help get a fix out.



----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-75306

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on macOS (specifically, I'm reproducing it on a late 2015 MacBook Pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
```
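Because the crash takes down the whole interpreter, it can be easier to compare Ruby versions by running the repro in a child process and inspecting its exit status. The following is only a sketch under that assumption: `REPRO` and `run_repro` are hypothetical names, not part of the original report, and the `survived` marker line is added purely so the parent can tell a clean exit from a crash.

```ruby
require 'open3'
require 'rbconfig'

# The repro from above, plus a marker line that is only printed
# if GC.start returns normally.
REPRO = <<~'RUBY'
  enum = Enumerator.new { |y| y << 1 }
  thread = Thread.new { enum.peek }
  thread.join
  GC.start
  puts "survived"
RUBY

# Runs +source+ in a fresh interpreter (the same binary that runs this
# script) and reports whether it exited cleanly.
def run_repro(source = REPRO)
  output, status = Open3.capture2e(RbConfig.ruby, '-e', source)
  {
    survived: status.success? && output.include?('survived'),
    signaled: status.signaled?,   # true if the child was killed by a signal
    output:   output              # combined stdout and stderr (includes any crash report)
  }
end

result = run_repro
puts "ruby #{RUBY_VERSION}: #{result[:survived] ? 'no crash' : 'crashed'}"
```

Running the script under each installed Ruby in turn gives a quick per-version answer without losing the parent process to the segfault.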

The C-level backtrace identifies this as within the mark phase of GC:

```
-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79
```

I also ran this against Ruby recompiled with `-O0`, and got a more detailed backtrace:

```
-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95
```

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

```C
static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}
```

This indicates a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an `Enumerator` element within a separate thread, and then waiting for the thread to end.
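One detail that may connect the repro to the `fiber_mark` and `cont_mark` frames in the backtrace: external enumeration (`Enumerator#next` and `Enumerator#peek`) runs the enumerator block in its own Fiber, and that fiber is created by whichever thread makes the first call. The sketch below only demonstrates that behaviour; it is not a diagnosis of the bug, and the variable names are illustrative.

```ruby
begin
  require 'fiber' # Fiber.current and Fiber#alive? need this on older Rubies
rescue LoadError
  # Newer Rubies provide these methods without the require.
end

block_fiber = nil
enum = Enumerator.new do |y|
  block_fiber = Fiber.current   # the fiber the enumerator block runs in
  y << 1
end

thread_fiber = nil
thread = Thread.new do
  thread_fiber = Fiber.current  # the thread's own root fiber
  enum.peek                     # first external access creates the enumerator's fiber
end
thread.join

# The block ran in a different fiber from the thread's root fiber, and that
# fiber is still suspended (alive) after the thread has finished.
puts "same fiber as thread: #{block_fiber.equal?(thread_fiber)}"
puts "enumerator fiber still alive: #{block_fiber.alive?}"
```

If this reading is right, the enumerator keeps a suspended fiber around after the thread that created it has exited, which is the object GC is marking when it crashes. `enum.size` does not run the block at all, which would be consistent with it not triggering the fault.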


---Files--------------------------------
ruby_2018-03-14-222035_Fukurou.crash (38.6 KB)
ruby_2018-03-14-205753_Fukurou.crash (38.6 KB)
dump.txt (51.4 KB)


-- 
https://bugs.ruby-lang.org/

Thread overview: 19+ messages
     [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
2018-02-28 23:32 ` [ruby-core:85870] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread dazuma
2018-03-01  3:09 ` [ruby-core:85874] " ko1
2018-03-01  5:29 ` [ruby-core:85876] " dazuma
2018-03-01  6:04 ` [ruby-core:85877] " harbirg
2018-03-01 10:44 ` [ruby-core:85881] " h.nedim
2018-03-01 15:56 ` [ruby-core:85885] " stuartdhadfield
2018-03-02  5:31 ` [ruby-core:85889] " nobu
2018-03-11 19:38 ` [ruby-core:86081] " briantkephart
2018-03-14 11:40 ` [ruby-core:86109] " samuel
2018-03-16 20:23 ` [ruby-core:86163] " s.wanabe
2018-03-18 13:42 ` [ruby-core:86174] " s.wanabe
2018-04-21 13:09 ` [ruby-core:86640] " samuel
2018-05-02 12:56 ` [ruby-core:86833] " samuel
2018-05-02 12:57 ` [ruby-core:86834] " samuel
2018-11-30  5:18 ` aselder [this message]
2018-11-30  5:20 ` [ruby-core:90187] " samuel
2018-11-30 20:27 ` [ruby-core:90194] " alanwucanada
2018-12-01  6:52 ` [ruby-core:90208] [Ruby trunk Bug#14561][Closed] " samuel
2019-01-10 14:18 ` [ruby-core:90997] [Ruby trunk Bug#14561] " nagachika00
