ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:85870] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
@ 2018-02-28 23:32 ` dazuma
  2018-03-01  3:09 ` [ruby-core:85874] " ko1
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: dazuma @ 2018-02-28 23:32 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been reported by dazuma (Daniel Azuma).

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This segfault happens consistently on macOS (I am reproducing it on a late-2015 MacBook Pro running 10.13.3, and it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

####
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
####
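
Because the crash takes down the whole interpreter, one way to compare the `peek`, `next`, and `size` variants is to run each one in a child process and look at its exit status. The following is a minimal sketch of such a harness, not part of the original report. The names `run_variant`, `VARIANTS`, and `RESULTS` are introduced here for illustration, and it assumes the `ruby` running the harness is the build under test.

```ruby
require "rbconfig"
require "open3"

# Variants named in the report: peek and next reportedly crash, size does not.
VARIANTS = %w[peek next size].freeze

# Runs the repro for one Enumerator method in a child interpreter so that a
# crash in the child does not kill this harness.
def run_variant(method_name)
  script = <<~RUBY
    enum = Enumerator.new { |y| y << 1 }
    Thread.new { enum.#{method_name} }.join
    GC.start
    puts "survived"
  RUBY
  _output, status = Open3.capture2e(RbConfig.ruby, "-e", script)
  # success? is nil when the child was killed by a signal, so compare to true.
  { method: method_name, ok: status.success? == true }
end

RESULTS = VARIANTS.map { |m| run_variant(m) }
RESULTS.each { |r| puts format("%-5s ok=%s", r[:method], r[:ok]) }
```

On an affected build, the `peek` and `next` rows would be expected to report `ok=false`; on an unaffected build all three should report `ok=true`.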

The C-level backtrace identifies this as within the mark phase of GC:

-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95

As far as I can tell, the C statement triggering the segfault is here in gc.c (around line 4064):

static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}

This indicates a bad pointer in the machine stack.
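
To make the failure mode concrete, here is a toy Ruby model of the loop above. It is only an illustration and not how the VM works: "memory" is a Hash keyed by word address, a missing key stands in for an unmapped page, and `toy_read`, `toy_mark_locations`, and `ToyFault` are hypothetical names. The assumption being modeled is that the saved stack range outlives the stack it described.

```ruby
# Toy model only: a missing key in the Hash stands in for unmapped memory.
class ToyFault < StandardError; end

def toy_read(memory, addr)
  memory.fetch(addr) { raise ToyFault, format("fault at 0x%x", addr) }
end

# Same shape as mark_locations_array: read n consecutive words starting at x.
def toy_mark_locations(memory, x, n)
  seen = []
  while n > 0
    seen << toy_read(memory, x) # corresponds to `v = *x`
    x += 1
    n -= 1
  end
  seen
end

# A range that is still mapped scans without trouble.
live_stack = { 0x100 => :a, 0x101 => :b, 0x102 => :c }
LIVE_RESULT = toy_mark_locations(live_stack, 0x100, 3)

# Assumption: the range was recorded earlier, but the memory is gone now.
stale_stack = {}
STALE_RESULT = begin
  toy_mark_locations(stale_stack, 0x100, 3)
rescue ToyFault => e
  e.message
end

puts LIVE_RESULT.inspect
puts STALE_RESULT
```

In this model the very first read of the stale range fails, which matches a crash on `v = *x` rather than somewhere inside `gc_mark_maybe`.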

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an Enumerator element within a separate thread, and then waiting for the thread to end.
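
The `fiber_mark` and `cont_mark` frames in the -O0 backtrace fit the fact that `Enumerator#peek` and `Enumerator#next` run the generator block in a Fiber. The sketch below only demonstrates that observable behavior: the block runs in a Fiber other than the thread's own, and the enumerator is still reachable after the thread has finished. It does not show the cause of the crash, and the names `enum_fiber`, `thread_root_fiber`, `PEEKED`, and `SAME_FIBER` are introduced here for illustration.

```ruby
require "fiber" # needed for Fiber.current on older Rubies; harmless on newer ones

enum_fiber = nil
thread_root_fiber = nil

enum = Enumerator.new do |y|
  enum_fiber = Fiber.current # the Fiber in which the generator block runs
  y << 1
end

thread = Thread.new do
  thread_root_fiber = Fiber.current # the thread's own Fiber
  enum.peek
end

PEEKED = thread.value # joins the thread and returns the peeked element
SAME_FIBER = enum_fiber.equal?(thread_root_fiber)

puts "peeked: #{PEEKED}"
puts "block ran in the thread's own fiber: #{SAME_FIBER}"
```

After `thread.value` returns, `enum` is still referenced from the main thread, so anything it retains from the finished thread remains visible to a later `GC.start`.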




-- 
https://bugs.ruby-lang.org/


* [ruby-core:85874] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
  2018-02-28 23:32 ` [ruby-core:85870] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread dazuma
@ 2018-03-01  3:09 ` ko1
  2018-03-01  5:29 ` [ruby-core:85876] " dazuma
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: ko1 @ 2018-03-01  3:09 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ko1 (Koichi Sasada).


I can't reproduce this issue with current trunk (2.6) on Linux or Windows.
Could it be a clang (Mac) issue?


----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70730

-- 
https://bugs.ruby-lang.org/


* [ruby-core:85876] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
  2018-02-28 23:32 ` [ruby-core:85870] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread dazuma
  2018-03-01  3:09 ` [ruby-core:85874] " ko1
@ 2018-03-01  5:29 ` dazuma
  2018-03-01  6:04 ` [ruby-core:85877] " harbirg
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: dazuma @ 2018-03-01  5:29 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by dazuma (Daniel Azuma).


Yes, I have been able to reproduce this issue only on mac.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70733

-- 
https://bugs.ruby-lang.org/


* [ruby-core:85877] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2018-03-01  5:29 ` [ruby-core:85876] " dazuma
@ 2018-03-01  6:04 ` harbirg
  2018-03-01 10:44 ` [ruby-core:85881] " h.nedim
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: harbirg @ 2018-03-01  6:04 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by harbirg (Harbir G).


I can also reproduce the same crash on macOS High Sierra 10.13.2. It occurs on both 2.5.0 and 2.6.0-preview1.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70734

-- 
https://bugs.ruby-lang.org/


* [ruby-core:85881] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2018-03-01  6:04 ` [ruby-core:85877] " harbirg
@ 2018-03-01 10:44 ` h.nedim
  2018-03-01 15:56 ` [ruby-core:85885] " stuartdhadfield
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: h.nedim @ 2018-03-01 10:44 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ned (Nedim Hadzimahmutovic).


I can reproduce this on macOS High Sierra 10.13.3 (17D47):

```
nedims-MacBook-Pro:~ nedim$ irb
2.5.0 :001 > enum = Enumerator.new { |y| y << 1 }
 => #<Enumerator: #<Enumerator::Generator:0x00007fe043133b90>:each>
2.5.0 :002 > thread = Thread.new { enum.peek } # enum.next also causes the segfault, but not enum.size
 => #<Thread:0x00007fe0431e7640@(irb):2 run>
2.5.0 :003 > thread.join
 => #<Thread:0x00007fe0431e7640@(irb):2 dead>
2.5.0 :004 > GC.start
(irb):4: [BUG] Segmentation fault at 0x0000700006d0fbc0
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
```

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70736

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on OSX (specifically I'm reproing it on a late 2015 Macbook pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

####
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
####

The C-level backtrace identifies this as within the mark phase of GC:

-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}

This indicates a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an Enumerator element within a separate thread, and then waiting for the thread to end.
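
For anyone who wants to check other builds without crashing their own session, here is a minimal sketch that runs the repro case above in a child interpreter and reports whether the child exited abnormally. The method name `repro_crashes?` and the constant `REPRO` are made up for illustration. It treats any non-zero or signal exit as a crash, which is an assumption rather than a precise diagnosis.

```ruby
require "open3"
require "rbconfig"

# The repro case from this report, run verbatim in a child process.
REPRO = <<~'RUBY'
  enum = Enumerator.new { |y| y << 1 }
  thread = Thread.new { enum.peek }
  thread.join
  GC.start
RUBY

# Hypothetical helper: returns true if the child interpreter did not exit cleanly.
# `ruby` defaults to the interpreter running this script; pass another path to
# test a different build.
def repro_crashes?(ruby = RbConfig.ruby)
  _output, status = Open3.capture2e(ruby, "-e", REPRO)
  !status.success?
end

puts "#{RUBY_DESCRIPTION}: #{repro_crashes? ? 'crashed' : 'ok'}"
```

Passing the path of another interpreter, for example one installed by a version manager, lets the same script compare builds side by side.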




-- 
https://bugs.ruby-lang.org/


* [ruby-core:85885] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2018-03-01 10:44 ` [ruby-core:85881] " h.nedim
@ 2018-03-01 15:56 ` stuartdhadfield
  2018-03-02  5:31 ` [ruby-core:85889] " nobu
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: stuartdhadfield @ 2018-03-01 15:56 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by stuarthadfield (Stuart Hadfield).


harbirg (Harbir G) wrote:
> I can also reproduce the same crash on MacOS High Sierra 10.13.2. Occurs on both 2.5.0 and 2.6.0-preview1.


+1. Reliably reproducible on Mac: OS X El Capitan 10.11.6, Ruby 2.5.0.

This currently causes our unit tests to segfault when run against production code. It does not occur in our Circle environment, which is Linux-based.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70740

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------




-- 
https://bugs.ruby-lang.org/


* [ruby-core:85889] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2018-03-01 15:56 ` [ruby-core:85885] " stuartdhadfield
@ 2018-03-02  5:31 ` nobu
  2018-03-11 19:38 ` [ruby-core:86081] " briantkephart
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: nobu @ 2018-03-02  5:31 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by nobu (Nobuyoshi Nakada).

Description updated

It seems to be marking a dead fiber's stack.
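
As background for why a fiber is involved at all, here is a small sketch showing that external enumeration (`peek`, `next`) runs the enumerator block in a separate fiber, while internal iteration runs it in the caller's fiber and `size` does not run it at all. The method name `enumerator_fiber_report` is made up for illustration. This only shows the general behaviour and does not confirm the diagnosis above. It deliberately avoids threads, so it should not trigger the crash.

```ruby
require "fiber" # needed for Fiber.current on Ruby 2.x; harmless on newer versions

# Hypothetical helper: reports which fiber the enumerator block runs in.
def enumerator_fiber_report
  # The block yields the object_id of whichever fiber it is executing in.
  enum = Enumerator.new { |y| y << Fiber.current.object_id }
  caller_fiber = Fiber.current.object_id

  {
    # Internal iteration (Enumerable#first) calls the block directly.
    each_in_caller_fiber: enum.first == caller_fiber,
    # External iteration (peek/next) resumes the block inside its own fiber.
    peek_in_caller_fiber: enum.peek == caller_fiber,
    # size never runs the block; without a size argument it is nil.
    size: enum.size,
  }
end

p enumerator_fiber_report
```

This matches the note in the report that `enum.peek` and `enum.next` trigger the crash but `enum.size` does not.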

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70744

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on OSX (specifically I'm reproing it on a late 2015 Macbook pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
```

The C-level backtrace identifies this as within the mark phase of GC:

```
-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79
```

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

```
-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95
```

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

```C
static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}
```

This indicates a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an `Enumerator` element within a separate thread, and then waiting for the thread to end.




-- 
https://bugs.ruby-lang.org/


* [ruby-core:86081] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2018-03-02  5:31 ` [ruby-core:85889] " nobu
@ 2018-03-11 19:38 ` briantkephart
  2018-03-14 11:40 ` [ruby-core:86109] " samuel
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: briantkephart @ 2018-03-11 19:38 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by brian-kephart (Brian Kephart).


I think I'm running into the same bug. I'm new to reading these types of traces, so please let me know if this needs to be a separate issue instead. Running Ruby 2.5.0 on OS X.

~~~
/Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:227: [BUG] Segmentation fault at 0x000000010dd5fa3a
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]

-- Crash Report log information --------------------------------------------
   See Crash Report log file under the one of following:                    
     * ~/Library/Logs/DiagnosticReports                                     
     * /Library/Logs/DiagnosticReports                                      
   for more details.                                                        
Don't forget to include the above Crash Report log file in bug reports.     

-- Control frame information -----------------------------------------------
c:0014 p:---- s:0136 e:000135 CFUNC  :getaddrinfo
c:0013 p:0034 s:0126 e:000125 METHOD /Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:227
c:0012 p:0073 s:0115 e:000114 METHOD /Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:631
c:0011 p:0034 s:0102 e:000101 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/webpacker-3.2.2/lib/webpacker/dev_server.rb:14
c:0010 p:0032 s:0098 e:000097 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/webpacker-3.2.2/lib/webpacker/dev_server_proxy.rb:7
c:0009 p:0014 s:0090 E:001848 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/rack-proxy-0.6.3/lib/rack/proxy.rb:57
c:0008 p:0041 s:0085 E:0018d8 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/skylight-core-2.0.0.beta2/lib/skylight/core/probes/middleware.rb:28
c:0007 p:0020 s:0074 E:001940 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/railties-5.2.0.rc1/lib/rails/engine.rb:524
c:0006 p:0026 s:0068 E:001998 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/configuration.rb:225
c:0005 p:0258 s:0063 E:001a98 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:624
c:0004 p:0026 s:0038 E:002590 METHOD /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:438
c:0003 p:0065 s:0026 E:0025e8 BLOCK  /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:302 [FINISH]
c:0002 p:0125 s:0016 E:002690 BLOCK  /Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/thread_pool.rb:120 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/thread_pool.rb:120:in `block in spawn_thread'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:302:in `block in run'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:438:in `process_client'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/server.rb:624:in `handle_request'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/puma-3.11.2/lib/puma/configuration.rb:225:in `call'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/railties-5.2.0.rc1/lib/rails/engine.rb:524:in `call'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/skylight-core-2.0.0.beta2/lib/skylight/core/probes/middleware.rb:28:in `call'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/rack-proxy-0.6.3/lib/rack/proxy.rb:57:in `call'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/webpacker-3.2.2/lib/webpacker/dev_server_proxy.rb:7:in `rewrite_response'
/Users/briankephart/.rvm/gems/ruby-2.5.0/gems/webpacker-3.2.2/lib/webpacker/dev_server.rb:14:in `running?'
/Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:631:in `tcp'
/Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:227:in `foreach'
/Users/briankephart/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/socket.rb:227:in `getaddrinfo'

-- Machine register context ------------------------------------------------
 rax: 0x0000000000000000 rbx: 0x00007fc5eef71ec0 rcx: 0x0000000000010000
 rdx: 0x000070000c3b3c80 rdi: 0x000000010dd5fa38 rsi: 0x00007fc5eef71ec0
 rbp: 0x000070000c3b3c70 rsp: 0x000070000c3b3c38  r8: 0x0000000000000000
  r9: 0xffffffff00000000 r10: 0x000000010d7fa738 r11: 0xfffff0009ac08e64
 r12: 0x00007fffaa5a5f40 r13: 0x00007fff717f2864 r14: 0x000070000c3b3c80
 r15: 0x00007fc5eef71ed0 rip: 0x00007fff717f2868 rfl: 0x0000000000010206

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010d9ebd07 rb_vm_bugreport + 135
1   libruby.2.5.dylib                   0x000000010d870978 rb_bug_context + 472
2   libruby.2.5.dylib                   0x000000010d960151 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff717c6f5a _sigtramp + 26
4   libsystem_trace.dylib               0x00007fff717f2868 _os_log_cmp_key + 4

-- Other runtime information -----------------------------------------------

* Loaded script: puma: cluster worker 0: 2204 [brian-becca]

* Loaded features:

    0 enumerator.so
    1 thread.rb
    2 rational.so
    3 complex.so
    ... lots more
~~~
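
One rough way to compare a trace like the one above with the original report is to pull the symbol names out of the "C level backtrace" section and check whether any GC mark frames appear. The sketch below does that. The helper names and the list of frames (taken from the backtraces in this issue) are assumptions for illustration, and a missing GC frame only suggests, rather than proves, a different bug.

```ruby
# Frames seen in the GC-mark crash reported in this issue (illustrative list).
GC_MARK_FRAMES = %w[
  rb_gc_mark_machine_stack mark_locations_array
  rb_execution_context_mark cont_mark
].freeze

# Hypothetical helper: extracts symbol names from lines shaped like
# "4   ruby   0x000000010f61bb93 rb_gc_mark_machine_stack + 99".
def c_backtrace_symbols(log)
  log.each_line.map { |line|
    m = line.match(/\A\d+\s+\S+\s+0x\h+\s+(\S+) \+ \d+/)
    m && m[1]
  }.compact
end

# Hypothetical helper: true if any known GC mark frame is in the backtrace.
def gc_mark_crash?(log)
  (c_backtrace_symbols(log) & GC_MARK_FRAMES).any?
end

# Excerpt from the original report.
ORIGINAL_TRACE = <<~LOG
  2   ruby                                0x000000010f6f1491 sigsegv + 81
  3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
  4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
  5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
LOG

# Excerpt from the trace in this message.
LATER_TRACE = <<~LOG
  2   libruby.2.5.dylib                   0x000000010d960151 sigsegv + 81
  3   libsystem_platform.dylib            0x00007fff717c6f5a _sigtramp + 26
  4   libsystem_trace.dylib               0x00007fff717f2868 _os_log_cmp_key + 4
LOG

puts "original report has GC mark frames: #{gc_mark_crash?(ORIGINAL_TRACE)}"
puts "this trace has GC mark frames:      #{gc_mark_crash?(LATER_TRACE)}"
```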

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70947

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------




-- 
https://bugs.ruby-lang.org/


* [ruby-core:86109] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2018-03-11 19:38 ` [ruby-core:86081] " briantkephart
@ 2018-03-14 11:40 ` samuel
  2018-03-16 20:23 ` [ruby-core:86163] " s.wanabe
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-03-14 11:40 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).

File ruby_2018-03-14-205753_Fukurou.crash added
File ruby_2018-03-14-222035_Fukurou.crash added
File dump.txt added

I believe I've run into this bug too.

macOS: 10.13.3
Ruby: 2.5.0

I was running this code:

https://github.com/socketry/async-websocket/tree/ruby-segv

Specifically, in the `chat/` directory, run `./config.ru` to start the server, and then run `client.rb` several times. Finally, press Ctrl-C on the server. Sometimes it crashes; sometimes it's fine. I don't believe I experienced this with previous versions of Ruby.

I can help reproduce the issue if needed.




----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70979

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on OSX (specifically I'm reproing it on a late 2015 Macbook pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
```

The C-level backtrace identifies this as within the mark phase of GC:

```
-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95
```

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

```C
static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}
```

This indicates a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an `Enumerator` element within a separate thread, and then waiting for the thread to end.


---Files--------------------------------
ruby_2018-03-14-222035_Fukurou.crash (38.6 KB)
ruby_2018-03-14-205753_Fukurou.crash (38.6 KB)
dump.txt (51.4 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:86163] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (8 preceding siblings ...)
  2018-03-14 11:40 ` [ruby-core:86109] " samuel
@ 2018-03-16 20:23 ` s.wanabe
  2018-03-18 13:42 ` [ruby-core:86174] " s.wanabe
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: s.wanabe @ 2018-03-16 20:23 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by wanabe (_ wanabe).


`git bisect` shows this is from r60440 [Feature #14038].

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-71046

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:86174] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (9 preceding siblings ...)
  2018-03-16 20:23 ` [ruby-core:86163] " s.wanabe
@ 2018-03-18 13:42 ` s.wanabe
  2018-04-21 13:09 ` [ruby-core:86640] " samuel
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: s.wanabe @ 2018-03-18 13:42 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by wanabe (_ wanabe).


This seems to be an issue specific to `FIBER_USE_NATIVE == 0` environments.
It may also be a latent issue in how `Enumerator` behaves.

First, `Fiber.new` and `Fiber#resume` must be in same thread, but `Enumerator.new` and `Enumerator#peek` don't have to be.
This is because `Fiber.new` calls `fiber_t_alloc()` immediately, whereas `Enumerator.new` doesn't; it is lazy :)
So an `Enumerator` can carry machine stack values out of a thread that has since been killed.
I think `Thread.new { enum.peek }` should raise `FiberError`.
But that would be a big incompatibility and is not realistic.
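
The contrast described above can be sketched in plain Ruby. This is only an illustration of the observable behaviour, not of the VM internals; the variable names are made up for the example. On an affected 2.5.0 build a later `GC.start` may still crash, as in the original repro, so the sketch deliberately does not trigger GC.

```ruby
# A Fiber is bound to the thread that created it, so resuming it from
# another thread raises FiberError.
fiber = Fiber.new { :unreachable }
fiber_result = Thread.new do
  begin
    fiber.resume
  rescue FiberError => e
    e
  end
end.value

# An Enumerator creates its internal fiber lazily, on the first peek/next,
# so creating it in one thread and advancing it in another is accepted.
enum = Enumerator.new { |y| y << 1 }
enum_result = Thread.new { enum.peek }.value

puts fiber_result.class  # FiberError
puts enum_result         # 1
```

The second case is the pattern from the repro: the enumerator's fiber is first run inside a thread that then terminates.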

The "marking a dead fiber's stack" problem does not occur in a `FIBER_USE_NATIVE != 0` environment,
because `fiber_setcontext()` sets `oldfib->cont.saved_ec.machine.stack_end = NULL;` and `rb_execution_context_mark()` skips marking the machine stack when `ec->machine.stack_end == NULL`.

There are some possible ways to fix this:
1. `fiber_mark` checks not only `fib->status` but also `fib->cont.saved_ec.thread_ptr->status` in a `FIBER_USE_NATIVE == 0` environment.
2. `thread_cleanup_func()` makes all fibers `FIBER_TERMINATED`.
3. etc.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-71058

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:86640] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (10 preceding siblings ...)
  2018-03-18 13:42 ` [ruby-core:86174] " s.wanabe
@ 2018-04-21 13:09 ` samuel
  2018-05-02 12:56 ` [ruby-core:86833] " samuel
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-04-21 13:09 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).


Thanks so much for your effort to isolate this issue.

I don't think any of my code is violating "First, Fiber.new and Fiber#resume must be in same thread, but Enumerator.new and Enumerator#peek don't have to be."

I will check.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-71597

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:86833] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (11 preceding siblings ...)
  2018-04-21 13:09 ` [ruby-core:86640] " samuel
@ 2018-05-02 12:56 ` samuel
  2018-05-02 12:57 ` [ruby-core:86834] " samuel
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-05-02 12:56 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).


I don't know if this is related or not, but I just now saw a very odd error.

```
Traceback (most recent call last):
	7: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/task.rb:74:in `block in initialize'
	6: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-io-1.10.0/lib/async/io/socket.rb:83:in `block in accept'
	5: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-io-1.10.0/lib/async/io/socket.rb:83:in `ensure in block in accept'
	4: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/wrapper.rb:135:in `close'
	3: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/wrapper.rb:186:in `cancel_monitor'
	2: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/wrapper.rb:186:in `close'
	1: from /Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/wrapper.rb:186:in `deregister'
/Users/samuel/.rvm/gems/ruby-2.5.0/gems/async-1.8.0/lib/async/wrapper.rb:186:in `lock': deadlock; recursive locking (ThreadError)
```

Notice the last four lines all point to the same line, and yet have different method names.

The line in question is this one: https://github.com/socketry/async/blob/v1.8.0/lib/async/wrapper.rb#L186

The only method named `deregister` is here: https://github.com/socketry/async/blob/v1.8.0/lib/async/debug/selector.rb#L52

Even though there was such an error, the spec passed.

I don't know how such a thing could happen. It seems like corruption in the VM.


----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-71796

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:86834] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (12 preceding siblings ...)
  2018-05-02 12:56 ` [ruby-core:86833] " samuel
@ 2018-05-02 12:57 ` samuel
  2018-11-30  5:18 ` [ruby-core:90185] " aselder
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-05-02 12:57 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).


By the way, it only happened once. Re-running the same spec several more times didn't generate any further odd output.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-71797

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on OSX (specifically, I'm reproducing it on a late-2015 MacBook Pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
```

The C-level backtrace identifies this as within the mark phase of GC:

```
-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

```

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

```
-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95
```

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

```C
static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}
```

This indicates a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an `Enumerator` element within a separate thread, and then waiting for the thread to end.
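
For anyone who wants to check a given Ruby build without crashing their own session, here is a minimal sketch that runs each variant of the repro in a child interpreter and reports how it exited. The helper names (`repro_script`, `run_variant`, `check_variants`) are made up for this illustration; only `Open3.capture2e` and `RbConfig.ruby` come from the standard library. On an affected build the `peek` and `next` variants would be expected to die with a signal, while `size` should exit cleanly.

```ruby
require "open3"
require "rbconfig"

# Enumerator methods to call inside the thread. Per the report, `peek` and
# `next` trigger the crash on affected builds, while `size` does not.
VARIANTS = %w[peek next size].freeze

# Build the child script for one variant (same shape as the repro case).
def repro_script(method_name)
  <<~RUBY
    enum = Enumerator.new { |y| y << 1 }
    thread = Thread.new { enum.#{method_name} }
    thread.join
    GC.start
  RUBY
end

# Run the script in a fresh interpreter so a crash cannot take down the caller.
# Returns the child's Process::Status.
def run_variant(method_name)
  _output, status = Open3.capture2e(RbConfig.ruby, "-e", repro_script(method_name))
  status
end

# Map each variant name to the exit status of its child process.
def check_variants
  VARIANTS.map { |m| [m, run_variant(m)] }.to_h
end

if $PROGRAM_NAME == __FILE__
  check_variants.each do |name, status|
    outcome =
      if status.success?
        "ok"
      elsif status.signaled?
        "killed by signal #{status.termsig}"
      else
        "exit status #{status.exitstatus}"
      end
    puts "enum.#{name}: #{outcome}"
  end
end
```

The child is started with the same interpreter that runs the harness (`RbConfig.ruby`), so switching Ruby versions with a version manager is enough to compare builds.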


---Files--------------------------------
ruby_2018-03-14-222035_Fukurou.crash (38.6 KB)
ruby_2018-03-14-205753_Fukurou.crash (38.6 KB)
dump.txt (51.4 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:90185] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (13 preceding siblings ...)
  2018-05-02 12:57 ` [ruby-core:86834] " samuel
@ 2018-11-30  5:18 ` aselder
  2018-11-30  5:20 ` [ruby-core:90187] " samuel
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: aselder @ 2018-11-30  5:18 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by aselder (Andrew Selder).


Would it be possible to get this addressed? It is blocking my entire organization from upgrading past Ruby 2.4.

It's reproducible on all 2.5 releases as well as the 2.6 preview releases.

It's been reported multiple times:
https://bugs.ruby-lang.org/issues/14334
https://bugs.ruby-lang.org/issues/14561
https://bugs.ruby-lang.org/issues/14714
https://bugs.ruby-lang.org/issues/15308

Our friend wanabe has even found the commit that introduced the error. I'd love to help out and try and solve this, but I'm afraid I'd just screw things up in the GC. Please let me know if there is anything I can do to help get a fix out.



----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-75306

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:90187] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (14 preceding siblings ...)
  2018-11-30  5:18 ` [ruby-core:90185] " aselder
@ 2018-11-30  5:20 ` samuel
  2018-11-30 20:27 ` [ruby-core:90194] " alanwucanada
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-11-30  5:20 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).


I thought this was fixed already in 2.6 - but obviously not. That's bad :(

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-75308

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:90194] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (15 preceding siblings ...)
  2018-11-30  5:20 ` [ruby-core:90187] " samuel
@ 2018-11-30 20:27 ` alanwucanada
  2018-12-01  6:52 ` [ruby-core:90208] [Ruby trunk Bug#14561][Closed] " samuel
  2019-01-10 14:18 ` [ruby-core:90997] [Ruby trunk Bug#14561] " nagachika00
  18 siblings, 0 replies; 19+ messages in thread
From: alanwucanada @ 2018-11-30 20:27 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by alanwu (Alan Wu).


I have a patch for this in #15362, if anyone could take a look.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-75315

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ruby-core:90208] [Ruby trunk Bug#14561][Closed] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (16 preceding siblings ...)
  2018-11-30 20:27 ` [ruby-core:90194] " alanwucanada
@ 2018-12-01  6:52 ` samuel
  2019-01-10 14:18 ` [ruby-core:90997] [Ruby trunk Bug#14561] " nagachika00
  18 siblings, 0 replies; 19+ messages in thread
From: samuel @ 2018-12-01  6:52 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by ioquatix (Samuel Williams).

Status changed from Open to Closed
Assignee set to ioquatix (Samuel Williams)
Backport deleted (2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN)

Thanks @alanwu, this has been applied to trunk and should be back-ported to 2.5 shortly as per #15362. As the supplied spec is passing, I'm going to assume this is now fixed. Feel free to open a new bug report if there are further issues. Thanks everyone for your time and effort.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-75330

* Author: dazuma (Daniel Azuma)
* Status: Closed
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 
----------------------------------------
This seg fault happens consistently on OSX (specifically I'm reproing it on a late 2015 Macbook pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

Small repro case:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start   # <- seg fault here
```

The C-level backtrace identifies this as within the mark phase of GC:

```
-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95
```

As far as I can tell, the C statement triggering the segfault is in `mark_locations_array` in gc.c (around line 4064):

```C
static void
mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
{
    VALUE v;
    while (n--) {
        v = *x;            // <----- Seems to be crashing here?
        gc_mark_maybe(objspace, v);
        x++;
    }
}
```

This suggests a bad pointer in the machine stack range being marked.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an `Enumerator` element within a separate thread, and then waiting for the thread to end.
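
The dependence on `peek`/`next` (but not `size`) is consistent with the `fiber_mark` and `cont_mark` frames in the backtrace: external enumeration runs the enumerator block on a fiber, which in the repro is created inside the short-lived thread. The following is a minimal sketch of a possible workaround, assuming (this is not confirmed anywhere in this thread) that reading the element through internal iteration such as `Enumerator#first` avoids creating that fiber. `first_in_thread` is a hypothetical helper name, not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): fetch the first element inside a
# separate thread using internal iteration (Enumerable#first) rather than
# external iteration (Enumerator#peek / Enumerator#next).
def first_in_thread(enum)
  # Thread#value joins the thread and returns the block's result.
  Thread.new { enum.first }.value
end

enum = Enumerator.new { |y| y << 1 }
value = first_in_thread(enum)

# Assumption: no fiber was created inside the finished thread, so this GC run
# has no fiber machine stack from that thread to mark.
GC.start

puts value
```

On an interpreter that already contains the fix, the original `peek` repro should not need a workaround like this.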


---Files--------------------------------
ruby_2018-03-14-222035_Fukurou.crash (38.6 KB)
ruby_2018-03-14-205753_Fukurou.crash (38.6 KB)
dump.txt (51.4 KB)


-- 
https://bugs.ruby-lang.org/

* [ruby-core:90997] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
       [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
                   ` (17 preceding siblings ...)
  2018-12-01  6:52 ` [ruby-core:90208] [Ruby trunk Bug#14561][Closed] " samuel
@ 2019-01-10 14:18 ` nagachika00
  18 siblings, 0 replies; 19+ messages in thread
From: nagachika00 @ 2019-01-10 14:18 UTC (permalink / raw)
  To: ruby-core

Issue #14561 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 2.5: REQUIRED to 2.5: DONE

ruby_2_5 r66777 merged revision(s) 66111.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-76217

* Author: dazuma (Daniel Azuma)
* Status: Closed
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.5: DONE
----------------------------------------

-- 
https://bugs.ruby-lang.org/

Thread overview: 19+ messages
     [not found] <redmine.issue-14561.20180228233240@ruby-lang.org>
2018-02-28 23:32 ` [ruby-core:85870] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread dazuma
2018-03-01  3:09 ` [ruby-core:85874] " ko1
2018-03-01  5:29 ` [ruby-core:85876] " dazuma
2018-03-01  6:04 ` [ruby-core:85877] " harbirg
2018-03-01 10:44 ` [ruby-core:85881] " h.nedim
2018-03-01 15:56 ` [ruby-core:85885] " stuartdhadfield
2018-03-02  5:31 ` [ruby-core:85889] " nobu
2018-03-11 19:38 ` [ruby-core:86081] " briantkephart
2018-03-14 11:40 ` [ruby-core:86109] " samuel
2018-03-16 20:23 ` [ruby-core:86163] " s.wanabe
2018-03-18 13:42 ` [ruby-core:86174] " s.wanabe
2018-04-21 13:09 ` [ruby-core:86640] " samuel
2018-05-02 12:56 ` [ruby-core:86833] " samuel
2018-05-02 12:57 ` [ruby-core:86834] " samuel
2018-11-30  5:18 ` [ruby-core:90185] " aselder
2018-11-30  5:20 ` [ruby-core:90187] " samuel
2018-11-30 20:27 ` [ruby-core:90194] " alanwucanada
2018-12-01  6:52 ` [ruby-core:90208] [Ruby trunk Bug#14561][Closed] " samuel
2019-01-10 14:18 ` [ruby-core:90997] [Ruby trunk Bug#14561] " nagachika00
