[ruby-core:61342] [ruby-trunk - Bug #9606] [Open] Occasional SIGSEGV in TestException#test_machine

ruby-core@ruby-lang.org archive (unofficial mirror)

* [ruby-core:61342] [ruby-trunk - Bug #9606] [Open] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
@ 2014-03-06 22:24 ` merch-redmine
  2014-03-07  1:10 ` [ruby-core:61343] [ruby-trunk - Bug #9606] [Feedback] " nobu
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: merch-redmine @ 2014-03-06 22:24 UTC
  To: ruby-core

Issue #9606 has been reported by Jeremy Evans.

----------------------------------------
Bug #9606: Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606

* Author: Jeremy Evans
* Status: Open
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
Ruby 2.1.1 on OpenBSD occasionally crashes with a segmentation fault when running TestException#test_machine_stackoverflow (about once every 3-4 runs):

    $ make test-all TESTOPTS="-q test/ruby/test_exception.rb"
    Reading specs from /usr/lib/gcc-lib/amd64-unknown-openbsd5.5/4.2.1/specs
    Target: amd64-unknown-openbsd5.5
    Configured with: OpenBSD/amd64 system compiler
    Thread model: posix
    gcc version 4.2.1 20070719
            CC = cc
            LD = ld
            LDSHARED = cc -shared
            CFLAGS = -O0 -g -fPIC
            XCFLAGS = -D_FORTIFY_SOURCE=2 -fstack-protector -fno-strict-overflow -fvisibility=hidden -DRUBY_EXPORT
            CPPFLAGS = -DOPENSSL_NO_STATIC_ENGINE -I/usr/local/include   -I. -I.ext/include/x86_64-openbsd -I./include -I.
            DLDFLAGS = -L/usr/local/lib -fstack-protector
            SOLIBS = -pthread -lgmp -lm
    ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q test/ruby/test_exception.rb
    Run options: "--ruby=./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q
    # Running tests:

    .........................F...........

    Finished tests in 2.089776s, 17.7053 tests/s, 88.0477 assertions/s.


      1) Failure:
    TestException#test_machine_stackoverflow [/usr/obj/ports/ruby-2.1.1/ruby-2.1.1/test/ruby/test_exception.rb:482]:
    -:7: [BUG] Segmentation fault at 0x007f7fff7fbfe8
    ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]

Looking at the core file in gdb:

    (gdb) bt
    #0  0x00001bb73a57a19a in kill () at <stdin>:2
    #1  0x00001bb73a5da52a in abort () at /usr/src/lib/libc/stdlib/abort.c:70
    #2  0x00001bb741a3ca04 in rb_bug (fmt=Could not find the frame base for "rb_bug".
    ) at error.c:341
    #3  0x00001bb741b19178 in sigsegv (sig=Could not find the frame base for "sigsegv".
    ) at signal.c:704
    #4  <signal handler called>

Here is the interesting part: the key passed to st_lookup should be exactly the same key as the one passed to rb_hash_aref, but the SIGSEGV happens when st_lookup tries to access it:
    
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    #6  0x00001bb741a65353 in rb_hash_aref (hash=30473858635240, key=3864588) at hash.c:701
    #7  0x00001bb741b94df8 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1857
    #8  0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    #9  0x00001bb741ba09a7 in invoke_block_from_c (th=0x1bb73f782000, block=0x1bb73aa32280, self=30473875219360, argc=0, argv=0x1bb7480167b0, blockptr=0x0, cref=0x0,
        defined_class=8) at vm.c:732
    #10 0x00001bb741ba0be6 in vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, self=30473875219360, defined_class=8, argc=0, argv=0x1bb7480167b0, blockptr=0x0)
        at vm.c:788
    #11 0x00001bb741ba0c85 in rb_vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, argc=0, argv=0x1bb7480167b0, blockptr=0x0) at vm.c:807
    #12 0x00001bb741a48ce7 in proc_call (argc=0, argv=0x1bb7480167b0, procval=30473858635280) at proc.c:734
    #13 0x00001bb741b8bd0c in call_cfunc_m1 (func=0x1bb741a48c45 <proc_call>, recv=30473858635280, argc=0, argv=0x1bb7480167b0) at vm_insnhelper.c:1298
    #14 0x00001bb741b8c8f5 in vm_call_cfunc_with_frame (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1470
    #15 0x00001bb741b8ca6b in vm_call_cfunc (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1560
    #16 0x00001bb741b917a9 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1028
    #17 0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    
Let's look at the st_lookup frame:

    (gdb) up 5
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    410     {
    Current language:  auto; currently c
    (gdb) print &table
    $1 = (st_table **) 0x7f7fff7fc008
    (gdb) print &key
    $2 = (st_data_t *) 0x7f7fff7fbfe8
    (gdb) print *(&table - 1)
    $3 = (st_table *) 0x0
    (gdb) print *(&table - 2)
    Cannot access memory at address 0x7f7fff7fbff8
    (gdb) print *(&table - 3)
    Cannot access memory at address 0x7f7fff7fbff0
    (gdb) print *(&table - 4)
    Cannot access memory at address 0x7f7fff7fbfe8
    
What is happening here is that when the stack overflows, the memory location of key is not accessible.  The lowest accessible stack address is 0x7f7fff7fc000; because the stack grows downward, anything below that address is inaccessible.
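
The addresses gdb reports above can be reproduced with a little pointer arithmetic. The following is a minimal Python sketch, not part of the original debugging session. It assumes 8-byte pointers (amd64) and takes its constants from the gdb output; the names are illustrative only.

```python
# Constants copied from the gdb session; names are illustrative.
POINTER_SIZE = 8                    # assumption: amd64, 8-byte pointers
TABLE_ADDR = 0x7f7fff7fc008         # gdb: print &table
KEY_ADDR = 0x7f7fff7fbfe8           # gdb: print &key
LOWEST_READABLE = 0x7f7fff7fc000    # last address gdb could still read


def slot_address(base, offset):
    """Address of the pointer-sized slot `offset` slots away from `base`."""
    return base + offset * POINTER_SIZE


# Mirrors `print *(&table - n)` for n = 1..4.
slots = {n: slot_address(TABLE_ADDR, -n) for n in range(1, 5)}
readable = {n: addr >= LOWEST_READABLE for n, addr in slots.items()}

for n, addr in slots.items():
    status = "readable" if readable[n] else "not accessible"
    print(f"&table - {n} = {addr:#x} ({status})")
```

Under these assumptions, `&table - 4` is the same address as `&key`, and only `&table - 1` lies at or above the boundary, which matches what gdb printed.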

Let's look at the registers; we are mostly interested in the stack pointer (rsp):
    
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x1bb7438d1200   30473926283776
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fff7fc020   0x7f7fff7fc020
    rsp            0x7f7fff7fbfe0   0x7f7fff7fbfe0
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x1bb73f782000   30473857802240
    r13            0x11     17
    r14            0x1bb746a7bd50   30473978363216
    r15            0x1bb7480bf190   30474001707408
    rip            0x1bb741b238fe   0x1bb741b238fe <st_lookup+12>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35
    
Let's go to the outermost frame (main) and look at the stack pointer:

    (gdb) up 16100
    #16100 0x00001bb5386010df in main (argc=17, argv=0x7f7fffffa790) at main.c:36
    36              return ruby_run_node(ruby_options(argc, argv));
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x7f7fffffa820   140187732518944
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fffffa750   0x7f7fffffa750
    rsp            0x7f7fffffa730   0x7f7fffffa730
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x7f7fffffa790   140187732518800
    r13            0x11     17
    r14            0x0      0
    r15            0x0      0
    rip            0x1bb5386010df   0x1bb5386010df <main+79>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35

The difference between the two is:
    
    (gdb) print 0x7f7fffffa730 - 0x7f7fff7fbfe0
    $4 = 8382288
    
That's very close to 8 MB (8388608 bytes).  Sure enough, that is what the user's stack limit is set to:

    $ ulimit -a
    time(cpu-seconds)    unlimited
    file(blocks)         unlimited
    coredump(blocks)     unlimited
    data(kbytes)         3145728
    stack(kbytes)        8192
    lockedmem(kbytes)    1267356
    memory(kbytes)       3800076
    nofiles(descriptors) 1024
    processes            1024
    
So the operating system is behaving correctly, allocating only about 8MB of stack.
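
The comparison between the stack actually consumed and the configured limit can be written out explicitly. This is a small Python sketch using only the values shown in the register dumps and the `ulimit -a` output; the variable names are illustrative and not taken from the Ruby source.

```python
# Values copied from the two `info reg` dumps and from `ulimit -a`.
RSP_MAIN = 0x7f7fffffa730     # rsp in main (frame #16100)
RSP_FAULT = 0x7f7fff7fbfe0    # rsp in st_lookup at the time of the fault
STACK_LIMIT_KB = 8192         # stack(kbytes) reported by ulimit

used = RSP_MAIN - RSP_FAULT           # stack consumed between the two frames
limit = STACK_LIMIT_KB * 1024         # limit in bytes
gap = limit - used                    # portion of the limit not covered above

print(f"used  = {used} bytes")
print(f"limit = {limit} bytes")
print(f"gap   = {gap} bytes")
```

The gap is only a few kilobytes. This sketch does not measure whatever part of the stack lies above main's frame, so the gap should not be read as free space.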

The above example is from OpenBSD/amd64; similar errors occur on OpenBSD/i386.

It appears that ruby's stack overflow handling is not working correctly in this case.  Any pointers for how to fix this issue?  



-- 
http://bugs.ruby-lang.org/


* [ruby-core:61343] [ruby-trunk - Bug #9606] [Feedback] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
  2014-03-06 22:24 ` [ruby-core:61342] [ruby-trunk - Bug #9606] [Open] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD merch-redmine
@ 2014-03-07  1:10 ` nobu
  2014-03-07  2:07 ` [ruby-core:61344] [ruby-trunk - Bug #9606] " merch-redmine
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: nobu @ 2014-03-07  1:10 UTC
  To: ruby-core

Issue #9606 has been updated by Nobuyoshi Nakada.

Status changed from Open to Feedback

Jeremy Evans wrote:
> It appears that ruby's stack overflow handling is not working correctly in this case.  Any pointers for how to fix this issue?

Yes, it is the test for machine stack overflow handling, as its name suggests.
It is highly platform-dependent, and may not be tested enough on some platforms.

Could you show the following?
* `ruby_current_thread`
* its `machine_stack_start`, `machine_stack_end`, and `machine_stack_maxsize`,
* `grep STACK .ext/include/x86_64-openbsd/ruby/config.h`


----------------------------------------
Bug #9606: Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45664

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------

* [ruby-core:61344] [ruby-trunk - Bug #9606] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
  2014-03-06 22:24 ` [ruby-core:61342] [ruby-trunk - Bug #9606] [Open] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD merch-redmine
  2014-03-07  1:10 ` [ruby-core:61343] [ruby-trunk - Bug #9606] [Feedback] " nobu
@ 2014-03-07  2:07 ` merch-redmine
  2014-03-07  3:33 ` [ruby-core:61345] " nobu
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: merch-redmine @ 2014-03-07  2:07 UTC
  To: ruby-core

Issue #9606 has been updated by Jeremy Evans.


Unfortunately, I lost the original core dump, so this core dump is slightly different, but the basic problem is the same:

    (gdb) bt
    #0  0x000014779593419a in kill () at <stdin>:2
    #1  0x000014779599452a in abort () at /usr/src/lib/libc/stdlib/abort.c:70
    #2  0x000014778bf42a04 in rb_bug (fmt=Could not find the frame base for "rb_bug".
    ) at error.c:341
    #3  0x000014778c01f178 in sigsegv (sig=Could not find the frame base for "sigsegv".
    ) at signal.c:704
    #4  <signal handler called>
    #5  0x000014778c0a67c0 in invoke_block_from_c (th=0x1477935dd000, block=0x147794965000, self=Cannot access memory at address 0x7f7fff7fbff8
    ) at vm.c:701
    #6  0x000014778c0a6be6 in vm_invoke_proc (th=0x1477935dd000, proc=0x147794965000, self=22503668220840, defined_class=8, argc=0, argv=0x14778a6e9058, blockptr=0x0) at vm.c:788
    (gdb) up 5
    #5  0x000014778c0a67c0 in invoke_block_from_c (th=0x1477935dd000, block=0x147794965000, self=Cannot access memory at address 0x7f7fff7fbff8
    ) at vm.c:701
    701     {
    Current language:  auto; currently c
    (gdb) print ruby_current_thread
    $1 = (rb_thread_t *) 0x1477935dd000
    (gdb) print ruby_current_thread->machine_stack_start
    $2 = (VALUE *) 0x7f7fffffbfe0
    (gdb) print ruby_current_thread->machine_stack_end  
    $3 = (VALUE *) 0x7f7ffffbfc20
    (gdb) print ruby_current_thread->machine_stack_maxsize
    $4 = 8388608
    (gdb) print &th  
    $5 = (rb_thread_t **) 0x7f7fff7fc008
    (gdb) print &block
    $6 = (const rb_block_t **) 0x7f7fff7fc000
    (gdb) print &self
    $7 = (VALUE *) 0x7f7fff7fbff8

So machine_stack_maxsize looks correct, but the difference between machine_stack_start and machine_stack_end is about 240 KB, not 8 MB.
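
The figures above can be checked by hand. The Python sketch below is illustrative only: it reuses the values gdb printed for this second core dump and does not call into Ruby. The variable names mirror the `rb_thread_t` fields shown in the session, and `FAULT_ADDR` is simply the address of `self` that gdb could not read.

```python
# Values copied from the gdb session for the second core dump.
MACHINE_STACK_START = 0x7f7fffffbfe0
MACHINE_STACK_END = 0x7f7ffffbfc20
MACHINE_STACK_MAXSIZE = 8388608
FAULT_ADDR = 0x7f7fff7fbff8   # gdb: print &self (inaccessible)

# Stack extent as recorded in the thread structure.
recorded_span = MACHINE_STACK_START - MACHINE_STACK_END

# Distance from the recorded stack start down to the faulting address.
depth_at_fault = MACHINE_STACK_START - FAULT_ADDR

print(f"start - end   = {recorded_span} bytes ({recorded_span / 1024:.1f} KiB)")
print(f"start - &self = {depth_at_fault} bytes")
print(f"maxsize       = {MACHINE_STACK_MAXSIZE} bytes")
```

With these inputs, the recorded span is roughly 240 KiB, while the faulting address sits within a few dozen bytes of machine_stack_maxsize below machine_stack_start.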

    $ grep STACK .ext/include/x86_64-openbsd/ruby/config.h
    #define HAVE_SIGALTSTACK 1
    #define STACK_GROW_DIRECTION -1
    #define HAVE_PTHREAD_ATTR_GETSTACK 1
    #define HAVE_PTHREAD_STACKSEG_NP 1


----------------------------------------
Bug #9606: Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45665

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------

* [ruby-core:61345] [ruby-trunk - Bug #9606] Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2014-03-07  2:07 ` [ruby-core:61344] [ruby-trunk - Bug #9606] " merch-redmine
@ 2014-03-07  3:33 ` nobu
  2014-03-07  4:18 ` [ruby-core:61346] " merch-redmine
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: nobu @ 2014-03-07  3:33 UTC
  To: ruby-core

Issue #9606 has been updated by Nobuyoshi Nakada.


Thank you, and could you try it with the latest trunk?

----------------------------------------
Bug #9606: Occasional SIGSEGV in TestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45666



* [ruby-core:61346] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2014-03-07  3:33 ` [ruby-core:61345] " nobu
@ 2014-03-07  4:18 ` merch-redmine
  2014-03-07 21:20 ` [ruby-core:61376] " v.ondruch
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: merch-redmine @ 2014-03-07  4:18 UTC
  To: ruby-core

Issue #9606 has been updated by Jeremy Evans.


Same result with the nightly snapshot.tar.gz (ruby -v: ruby 2.2.0dev (2014-03-07 trunk 45279) [x86_64-openbsd])

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45667



* [ruby-core:61376] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2014-03-07  4:18 ` [ruby-core:61346] " merch-redmine
@ 2014-03-07 21:20 ` v.ondruch
  2014-03-26  2:37   ` [ruby-core:61686] " Eric Wong
  2014-03-26  2:38 ` [ruby-core:61687] " normalperson
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 17+ messages in thread
From: v.ondruch @ 2014-03-07 21:20 UTC
  To: ruby-core

Issue #9606 has been updated by Vit Ondruch.


It fails on x86 as well - #9198

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45687



* [ruby-core:61686] Re: [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
  2014-03-07 21:20 ` [ruby-core:61376] " v.ondruch
@ 2014-03-26  2:37   ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2014-03-26  2:37 UTC
  To: Ruby developers

v.ondruch@tiscali.cz wrote:
> It fails on x86 as well - #9198

It seems the common problem is the main thread stack has no guard page
(at least not on my GNU/Linux systems).

I propose the following to test inside threads:
http://bogomips.org/ruby.git/patch?id=a2c08435f4346

I was only able to reproduce the problem on a CentOS 6.2 machine;
none of my usual Debian machines had the problem.

I am not sure if the problem with the main stack is fixable,
as mprotect is not portably usable on non-mmap-ed memory.

nobu: thoughts?
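
To illustrate the idea of exercising the overflow inside a thread rather
than on the main thread, here is a minimal Ruby sketch. It is not the
contents of the linked patch. The helper name overflow_in_thread is
hypothetical, and the hash-and-lambda recursion is only modelled on the
rb_hash_aref and proc_call frames in the reported backtrace. It assumes,
as suggested above, that thread stacks get a guard page on the platform,
so the overflow should surface as SystemStackError instead of a crash.

```ruby
# Hypothetical helper (not from the linked patch): overflow the stack inside
# a new thread and report how the overflow surfaced.
def overflow_in_thread
  Thread.new do
    # A lambda stored in a hash calls itself through the hash, so every
    # level of recursion goes through Hash#[] and Proc#call.
    table = {}
    table[:recurse] = -> { table[:recurse].call }
    begin
      table[:recurse].call
      :returned            # unreachable in practice: the recursion never ends
    rescue SystemStackError
      :rescued             # the overflow was turned into a Ruby exception
    end
  end.value                # join the thread and fetch the block's result
end

result = overflow_in_thread
puts "stack overflow inside a thread: #{result}"
```

When the interpreter handles the overflow, result is :rescued. A crash of
the whole process instead would be the kind of failure discussed in this
thread.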


* [ruby-core:61687] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2014-03-07 21:20 ` [ruby-core:61376] " v.ondruch
@ 2014-03-26  2:38 ` normalperson
  2014-03-26  7:41   ` [ruby-core:61691] " Eric Wong
  2014-03-26  3:59 ` [ruby-core:61688] " nobu
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 17+ messages in thread
From: normalperson @ 2014-03-26  2:38 UTC
  To: ruby-core

Issue #9606 has been updated by Eric Wong.


 v.ondruch@tiscali.cz wrote:
 > It fails on x86 as well - #9198
 
 It seems the common problem is the main thread stack has no guard page
 (at least not on my GNU/Linux systems).
 
 I propose the following to test inside threads:
 http://bogomips.org/ruby.git/patch?id=a2c08435f4346
 
 I was only able to reproduce the problem on a CentOS 6.2 machine;
 none of my usual Debian machines had the problem.
 
 I am not sure if the problem with the main stack is fixable,
 as mprotect is not portably usable on non-mmap-ed memory.
 
 nobu: thoughts?

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45938

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
ruby 2.1.1 on OpenBSD seems to occassionally suffer from a stack overflow when running TestException#test_machine_stackoverflow (about 1 every 3-4 times):

    $ make test-all TESTOPTS="-q test/ruby/test_exception.rb"
    Reading specs from /usr/lib/gcc-lib/amd64-unknown-openbsd5.5/4.2.1/specs
    Target: amd64-unknown-openbsd5.5
    Configured with: OpenBSD/amd64 system compiler
    Thread model: posix
    gcc version 4.2.1 20070719
            CC = cc
            LD = ld
            LDSHARED = cc -shared
            CFLAGS = -O0 -g -fPIC
            XCFLAGS = -D_FORTIFY_SOURCE=2 -fstack-protector -fno-strict-overflow -fvisibility=hidden -DRUBY_EXPORT
            CPPFLAGS = -DOPENSSL_NO_STATIC_ENGINE -I/usr/local/include   -I. -I.ext/include/x86_64-openbsd -I./include -I.
            DLDFLAGS = -L/usr/local/lib -fstack-protector
            SOLIBS = -pthread -lgmp -lm
    ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q test/ruby/test_exception.rb
    Run options: "--ruby=./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q
    # Running tests:

    .........................F...........

    Finished tests in 2.089776s, 17.7053 tests/s, 88.0477 assertions/s.


      1) Failure:
    TestException#test_machine_stackoverflow [/usr/obj/ports/ruby-2.1.1/ruby-2.1.1/test/ruby/test_exception.rb:482]:
    -:7: [BUG] Segmentation fault at 0x007f7fff7fbfe8
    ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]

Looking at the core file in gdb:

    (gdb) bt
    #0  0x00001bb73a57a19a in kill () at <stdin>:2
    #1  0x00001bb73a5da52a in abort () at /usr/src/lib/libc/stdlib/abort.c:70
    #2  0x00001bb741a3ca04 in rb_bug (fmt=Could not find the frame base for "rb_bug".
    ) at error.c:341
    #3  0x00001bb741b19178 in sigsegv (sig=Could not find the frame base for "sigsegv".
    ) at signal.c:704
    #4  <signal handler called>

Here is the interesting part, the key passed to st_lookup should be exactly the same key as the one passed to rb_hash_aref, but the SIGSEGV happens when st_lookup tries to access it:
    
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    #6  0x00001bb741a65353 in rb_hash_aref (hash=30473858635240, key=3864588) at hash.c:701
    #7  0x00001bb741b94df8 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1857
    #8  0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    #9  0x00001bb741ba09a7 in invoke_block_from_c (th=0x1bb73f782000, block=0x1bb73aa32280, self=30473875219360, argc=0, argv=0x1bb7480167b0, blockptr=0x0, cref=0x0,
        defined_class=8) at vm.c:732
    #10 0x00001bb741ba0be6 in vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, self=30473875219360, defined_class=8, argc=0, argv=0x1bb7480167b0, blockptr=0x0)
        at vm.c:788
    #11 0x00001bb741ba0c85 in rb_vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, argc=0, argv=0x1bb7480167b0, blockptr=0x0) at vm.c:807
    #12 0x00001bb741a48ce7 in proc_call (argc=0, argv=0x1bb7480167b0, procval=30473858635280) at proc.c:734
    #13 0x00001bb741b8bd0c in call_cfunc_m1 (func=0x1bb741a48c45 <proc_call>, recv=30473858635280, argc=0, argv=0x1bb7480167b0) at vm_insnhelper.c:1298
    #14 0x00001bb741b8c8f5 in vm_call_cfunc_with_frame (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1470
    #15 0x00001bb741b8ca6b in vm_call_cfunc (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1560
    #16 0x00001bb741b917a9 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1028
    #17 0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    
Let's look at the st_lookup frame:

    (gdb) up 5
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    410     {
    Current language:  auto; currently c
    (gdb) print &table
    $1 = (st_table **) 0x7f7fff7fc008
    (gdb) print &key
    $2 = (st_data_t *) 0x7f7fff7fbfe8
    (gdb) print *(&table - 1)
    $3 = (st_table *) 0x0
    (gdb) print *(&table - 2)
    Cannot access memory at address 0x7f7fff7fbff8
    (gdb) print *(&table - 3)
    Cannot access memory at address 0x7f7fff7fbff0
    (gdb) print *(&table - 4)
    Cannot access memory at address 0x7f7fff7fbfe8
    
What is happening here is that when the stack overflows, the memory location holding key is not accessible.  The accessible part of the stack ends at 0x7f7fff7fc000 (the stack grows downward), and anything below that address, including key at 0x7f7fff7fbfe8, cannot be accessed.

Let's look at the registers; we are mostly interested in the stack pointer (rsp):
    
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x1bb7438d1200   30473926283776
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fff7fc020   0x7f7fff7fc020
    rsp            0x7f7fff7fbfe0   0x7f7fff7fbfe0
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x1bb73f782000   30473857802240
    r13            0x11     17
    r14            0x1bb746a7bd50   30473978363216
    r15            0x1bb7480bf190   30474001707408
    rip            0x1bb741b238fe   0x1bb741b238fe <st_lookup+12>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35
    
Let's go to the outermost frame (main) and look at the stack pointer:

    (gdb) up 16100
    #16100 0x00001bb5386010df in main (argc=17, argv=0x7f7fffffa790) at main.c:36
    36              return ruby_run_node(ruby_options(argc, argv));
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x7f7fffffa820   140187732518944
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fffffa750   0x7f7fffffa750
    rsp            0x7f7fffffa730   0x7f7fffffa730
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x7f7fffffa790   140187732518800
    r13            0x11     17
    r14            0x0      0
    r15            0x0      0
    rip            0x1bb5386010df   0x1bb5386010df <main+79>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35

The difference between the two stack pointers is:
    
    (gdb) print 0x7f7fffffa730 - 0x7f7fff7fbfe0
    $4 = 8382288
    
That's pretty close to 8MB (8388608 bytes), only 6320 bytes short.  Sure enough, that's what the stack limit for the user is set to:

    $ ulimit -a
    time(cpu-seconds)    unlimited
    file(blocks)         unlimited
    coredump(blocks)     unlimited
    data(kbytes)         3145728
    stack(kbytes)        8192
    lockedmem(kbytes)    1267356
    memory(kbytes)       3800076
    nofiles(descriptors) 1024
    processes            1024
    
So the operating system is behaving correctly, allowing only about 8MB of stack.

The above example is from OpenBSD/amd64; similar errors occur on OpenBSD/i386.

It appears that Ruby's stack overflow handling is not working correctly in this case.  Any pointers on how to fix this issue?



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:61688] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2014-03-26  2:38 ` [ruby-core:61687] " normalperson
@ 2014-03-26  3:59 ` nobu
  2014-03-26  7:48 ` [ruby-core:61692] " normalperson
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: nobu @ 2014-03-26  3:59 UTC (permalink / raw
  To: ruby-core

Issue #9606 has been updated by Nobuyoshi Nakada.


I found no problems.
Let's try it.

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45939

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:61691] Re: [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
  2014-03-26  2:38 ` [ruby-core:61687] " normalperson
@ 2014-03-26  7:41   ` Eric Wong
  2014-03-26 22:43     ` [ruby-core:61703] " Eric Wong
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wong @ 2014-03-26  7:41 UTC (permalink / raw
  To: ruby-core

normalperson@yhbt.net wrote:
>  I propose the following to test inside threads:
>  http://bogomips.org/ruby.git/patch?id=a2c08435f4346

Bah, that fails on my 32-bit Debian 7.0 VM; I only tried
64-bit before.  I will investigate tomorrow or day after.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:61692] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2014-03-26  3:59 ` [ruby-core:61688] " nobu
@ 2014-03-26  7:48 ` normalperson
  2014-03-26 22:48 ` [ruby-core:61704] " normalperson
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: normalperson @ 2014-03-26  7:48 UTC (permalink / raw
  To: ruby-core

Issue #9606 has been updated by Eric Wong.


 normalperson@yhbt.net wrote:
 >  I propose the following to test inside threads:
 >  http://bogomips.org/ruby.git/patch?id=a2c08435f4346
 
 Bah, that fails on my 32-bit Debian 7.0 VM; I only tried
 64-bit before.  I will investigate tomorrow or day after.

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45942

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:61703] Re: [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
  2014-03-26  7:41   ` [ruby-core:61691] " Eric Wong
@ 2014-03-26 22:43     ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2014-03-26 22:43 UTC (permalink / raw
  To: ruby-core

Eric Wong <normalperson@yhbt.net> wrote:
> normalperson@yhbt.net wrote:
> >  I propose the following to test inside threads:
> >  http://bogomips.org/ruby.git/patch?id=a2c08435f4346
> 
> Bah, that fails on my 32-bit Debian 7.0 VM; I only tried
> 64-bit before.  I will investigate tomorrow or day after.

Setting a 12K stack guard seems to solve the problem for me,
but it is not ideal.  Even an 8K guard is not enough(!)

I do not believe a giant guard is worth it for a corner-case
(especially since the main thread has no guard at all).

These tests may be too platform-dependent to support, even for common
Linux systems.

--- a/thread_pthread.c
+++ b/thread_pthread.c
@@ -936,6 +936,8 @@ native_thread_create(rb_thread_t *th)
 	CHECK_ERR(pthread_attr_setstacksize(&attr, stack_size));
 # endif
 
+	CHECK_ERR(pthread_attr_setguardsize(&attr, 12 * 1024));
+
 # ifdef HAVE_PTHREAD_ATTR_SETINHERITSCHED
 	CHECK_ERR(pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED));
 # endif

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:61704] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (8 preceding siblings ...)
  2014-03-26  7:48 ` [ruby-core:61692] " normalperson
@ 2014-03-26 22:48 ` normalperson
  2014-03-27  3:48 ` [ruby-core:61711] " kosaki.motohiro
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: normalperson @ 2014-03-26 22:48 UTC (permalink / raw
  To: ruby-core

Issue #9606 has been updated by Eric Wong.


 Eric Wong <normalperson@yhbt.net> wrote:
 > normalperson@yhbt.net wrote:
 > >  I propose the following to test inside threads:
 > >  http://bogomips.org/ruby.git/patch?id=a2c08435f4346
 > 
 > Bah, that fails on my 32-bit Debian 7.0 VM; I only tried
 > 64-bit before.  I will investigate tomorrow or day after.
 
 Setting a 12K guard stack seems to solve the problem for me,
 but it is not ideal.  An 8K guard stack is not enough, even(!)
 
 I do not believe a giant guard is worth it for a corner-case
 (especially since the main thread has no guard at all).
 
 These tests may be too platform-dependent to support, even for common
 Linux systems.
 
 --- a/thread_pthread.c
 +++ b/thread_pthread.c
 @@ -936,6 +936,8 @@ native_thread_create(rb_thread_t *th)
  	CHECK_ERR(pthread_attr_setstacksize(&attr, stack_size));
  # endif
  
 +	CHECK_ERR(pthread_attr_setguardsize(&attr, 12 * 1024));
 +
  # ifdef HAVE_PTHREAD_ATTR_SETINHERITSCHED
  	CHECK_ERR(pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED));
  # endif

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45951

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
ruby 2.1.1 on OpenBSD seems to occassionally suffer from a stack overflow when running TestException#test_machine_stackoverflow (about 1 every 3-4 times):

    $ make test-all TESTOPTS="-q test/ruby/test_exception.rb"
    Reading specs from /usr/lib/gcc-lib/amd64-unknown-openbsd5.5/4.2.1/specs
    Target: amd64-unknown-openbsd5.5
    Configured with: OpenBSD/amd64 system compiler
    Thread model: posix
    gcc version 4.2.1 20070719
            CC = cc
            LD = ld
            LDSHARED = cc -shared
            CFLAGS = -O0 -g -fPIC
            XCFLAGS = -D_FORTIFY_SOURCE=2 -fstack-protector -fno-strict-overflow -fvisibility=hidden -DRUBY_EXPORT
            CPPFLAGS = -DOPENSSL_NO_STATIC_ENGINE -I/usr/local/include   -I. -I.ext/include/x86_64-openbsd -I./include -I.
            DLDFLAGS = -L/usr/local/lib -fstack-protector
            SOLIBS = -pthread -lgmp -lm
    ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q test/ruby/test_exception.rb
    Run options: "--ruby=./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" -q
    # Running tests:

    .........................F...........

    Finished tests in 2.089776s, 17.7053 tests/s, 88.0477 assertions/s.


      1) Failure:
    TestException#test_machine_stackoverflow [/usr/obj/ports/ruby-2.1.1/ruby-2.1.1/test/ruby/test_exception.rb:482]:
    -:7: [BUG] Segmentation fault at 0x007f7fff7fbfe8
    ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]

Looking at the core file in gdb:

    (gdb) bt
    #0  0x00001bb73a57a19a in kill () at <stdin>:2
    #1  0x00001bb73a5da52a in abort () at /usr/src/lib/libc/stdlib/abort.c:70
    #2  0x00001bb741a3ca04 in rb_bug (fmt=Could not find the frame base for "rb_bug".
    ) at error.c:341
    #3  0x00001bb741b19178 in sigsegv (sig=Could not find the frame base for "sigsegv".
    ) at signal.c:704
    #4  <signal handler called>

Here is the interesting part, the key passed to st_lookup should be exactly the same key as the one passed to rb_hash_aref, but the SIGSEGV happens when st_lookup tries to access it:
    
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    #6  0x00001bb741a65353 in rb_hash_aref (hash=30473858635240, key=3864588) at hash.c:701
    #7  0x00001bb741b94df8 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1857
    #8  0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    #9  0x00001bb741ba09a7 in invoke_block_from_c (th=0x1bb73f782000, block=0x1bb73aa32280, self=30473875219360, argc=0, argv=0x1bb7480167b0, blockptr=0x0, cref=0x0,
        defined_class=8) at vm.c:732
    #10 0x00001bb741ba0be6 in vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, self=30473875219360, defined_class=8, argc=0, argv=0x1bb7480167b0, blockptr=0x0)
        at vm.c:788
    #11 0x00001bb741ba0c85 in rb_vm_invoke_proc (th=0x1bb73f782000, proc=0x1bb73aa32280, argc=0, argv=0x1bb7480167b0, blockptr=0x0) at vm.c:807
    #12 0x00001bb741a48ce7 in proc_call (argc=0, argv=0x1bb7480167b0, procval=30473858635280) at proc.c:734
    #13 0x00001bb741b8bd0c in call_cfunc_m1 (func=0x1bb741a48c45 <proc_call>, recv=30473858635280, argc=0, argv=0x1bb7480167b0) at vm_insnhelper.c:1298
    #14 0x00001bb741b8c8f5 in vm_call_cfunc_with_frame (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1470
    #15 0x00001bb741b8ca6b in vm_call_cfunc (th=0x1bb73f782000, reg_cfp=0x1bb7480bf230, ci=0x1bb744054470) at vm_insnhelper.c:1560
    #16 0x00001bb741b917a9 in vm_exec_core (th=0x1bb73f782000, initial=0) at insns.def:1028
    #17 0x00001bb741ba1f5e in vm_exec (th=0x1bb73f782000) at vm.c:1304
    
Let's look at the st_lookup frame:

    (gdb) up 5
    #5  0x00001bb741b238fe in st_lookup (table=0x1bb73e7f7480, key=Cannot access memory at address 0x7f7fff7fbfe8
    ) at st.c:410
    410     {
    Current language:  auto; currently c
    (gdb) print &table
    $1 = (st_table **) 0x7f7fff7fc008
    (gdb) print &key
    $2 = (st_data_t *) 0x7f7fff7fbfe8
    (gdb) print *(&table - 1)
    $3 = (st_table *) 0x0
    (gdb) print *(&table - 2)
    Cannot access memory at address 0x7f7fff7fbff8
    (gdb) print *(&table - 3)
    Cannot access memory at address 0x7f7fff7fbff0
    (gdb) print *(&table - 4)
    Cannot access memory at address 0x7f7fff7fbfe8
    
What is happening here is that when the stack overflows, the location of key in memory is not accessible.  The top of the stack is at 0x7f7fff7fc000, and anything below that (the stack grows downward) is not accessible.

Let's look at the registers, mostly interested in the stack pointer (rsp):
    
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x1bb7438d1200   30473926283776
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fff7fc020   0x7f7fff7fc020
    rsp            0x7f7fff7fbfe0   0x7f7fff7fbfe0
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x1bb73f782000   30473857802240
    r13            0x11     17
    r14            0x1bb746a7bd50   30473978363216
    r15            0x1bb7480bf190   30474001707408
    rip            0x1bb741b238fe   0x1bb741b238fe <st_lookup+12>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35
    
Let's go to the top frame and look at the stack pointer:

    (gdb) up 16100
    #16100 0x00001bb5386010df in main (argc=17, argv=0x7f7fffffa790) at main.c:36
    36              return ruby_run_node(ruby_options(argc, argv));
    (gdb) info reg
    rax            0x1bb73f84d5e8   30473858635240
    rbx            0x7f7fffffa820   140187732518944
    rcx            0x1bb74081e3a0   30473875219360
    rdx            0x7f7fff7fc058   140187724136536
    rsi            0x3af80c 3864588
    rdi            0x1bb73e7f7480   30473841505408
    rbp            0x7f7fffffa750   0x7f7fffffa750
    rsp            0x7f7fffffa730   0x7f7fffffa730
    r8             0x8      8
    r9             0x1bb742d55431   30473914242097
    r10            0x1bb742d55431   30473914242097
    r11            0x1bb741ba1f0e   30473895681806
    r12            0x7f7fffffa790   140187732518800
    r13            0x11     17
    r14            0x0      0
    r15            0x0      0
    rip            0x1bb5386010df   0x1bb5386010df <main+79>
    eflags         0x10202  66050
    cs             0x2b     43
    ss             0x23     35
    ds             0x23     35
    es             0x23     35
    fs             0x23     35
    gs             0x23     35

The difference between the two is:
    
    (gdb) print 0x7f7fffffa730 - 0x7f7fff7fbfe0
    $4 = 8382288
    
That's pretty close to 8MB (8388608).  Sure enough, that's what the stack limit for the user is set to:

    $ ulimit -a
    time(cpu-seconds)    unlimited
    file(blocks)         unlimited
    coredump(blocks)     unlimited
    data(kbytes)         3145728
    stack(kbytes)        8192
    lockedmem(kbytes)    1267356
    memory(kbytes)       3800076
    nofiles(descriptors) 1024
    processes            1024
    
So the operating system is behaving correctly, allocating only about 8MB of stack.
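
As a rough check of the numbers above, the sketch below repeats the subtraction and compares it with the 8192 kbyte limit. It is illustrative only: the constants are copied from the gdb and ulimit output, and `stack_limit_bytes` is a hypothetical helper that reads the current soft limit through Python's Unix-only `resource` module.

```python
import resource  # Unix-only standard library module

# Values copied from the gdb and ulimit output above.
RSP_MAIN = 0x7F7FFFFFA730    # rsp in main()
RSP_FAULT = 0x7F7FFF7FBFE0   # rsp in st_lookup() when the fault occurred
STACK_LIMIT_KB = 8192        # stack(kbytes) from ulimit -a

used = RSP_MAIN - RSP_FAULT      # bytes of stack consumed below main()
limit = STACK_LIMIT_KB * 1024    # soft limit in bytes
remaining = limit - used         # part of the limit not covered by `used`

def stack_limit_bytes():
    """Return the current soft RLIMIT_STACK in bytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return None if soft == resource.RLIM_INFINITY else soft

print(used, limit, remaining)    # 8382288 8388608 6320
```

The 6320 bytes left over are presumably taken up by whatever sits above main's frame (arguments, environment, startup code); that part is a guess, not something the gdb output shows.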

The above example is from OpenBSD/amd64; similar errors occur on OpenBSD/i386.

It appears that ruby's stack overflow handling is not working correctly in this case.  Any pointers on how to fix this issue?



-- 
https://bugs.ruby-lang.org/


* [ruby-core:61711] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (9 preceding siblings ...)
  2014-03-26 22:48 ` [ruby-core:61704] " normalperson
@ 2014-03-27  3:48 ` kosaki.motohiro
  2014-03-27  5:15   ` [ruby-core:61712] " Eric Wong
  2014-03-27  5:18 ` [ruby-core:61713] " normalperson
  2019-05-25  1:06 ` [ruby-core:92836] [Ruby trunk Bug#9606] " merch-redmine
  12 siblings, 1 reply; 17+ messages in thread
From: kosaki.motohiro @ 2014-03-27  3:48 UTC
  To: ruby-core

Issue #9606 has been updated by Motohiro KOSAKI.


Increasing the guard page size does not fit 32-bit x86, at least; it does not have enough room.
I see no regression if the change is made for 64-bit only.
 

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45956

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------

* [ruby-core:61712] Re: [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
  2014-03-27  3:48 ` [ruby-core:61711] " kosaki.motohiro
@ 2014-03-27  5:15   ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2014-03-27  5:15 UTC
  To: Ruby developers

kosaki: any ideas for the lack of guard page for the main thread?

I have one idea, but I hate it: run timer thread function in the main
thread and have all normal Ruby code in pthread_create-ed threads.

However, long term, I will try to remove the timer thread and GVL
w/o slowing single-thread case.
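
The general shape of that idea, moving the real work off the main thread onto a created thread with an explicitly sized stack, can be sketched outside of MRI. The example below is only an analogy in Python, not a proposal for Ruby's C code: `run_in_worker` is a hypothetical helper, the 16 MiB stack size is an arbitrary assumption, and whether the created thread gets a guard page depends on the platform's pthread defaults.

```python
import threading

def run_in_worker(fn, stack_bytes=16 * 1024 * 1024):
    """Run fn() on a freshly created thread and return its result."""
    result = {}

    def target():
        try:
            result["value"] = fn()
        except BaseException as exc:   # hand the failure back to the caller
            result["error"] = exc

    # stack_size() applies to threads created after the call and returns
    # the previous setting, which is restored once the worker has started.
    previous = threading.stack_size(stack_bytes)
    try:
        worker = threading.Thread(target=target)
        worker.start()
    finally:
        threading.stack_size(previous)
    worker.join()
    if "error" in result:
        raise result["error"]
    return result["value"]

print(run_in_worker(lambda: sum(range(10))))   # 45
```

The main thread here only waits in `join()`, which loosely mirrors the role the main thread would take over in the idea above.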


* [ruby-core:61713] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (10 preceding siblings ...)
  2014-03-27  3:48 ` [ruby-core:61711] " kosaki.motohiro
@ 2014-03-27  5:18 ` normalperson
  2019-05-25  1:06 ` [ruby-core:92836] [Ruby trunk Bug#9606] " merch-redmine
  12 siblings, 0 replies; 17+ messages in thread
From: normalperson @ 2014-03-27  5:18 UTC
  To: ruby-core

Issue #9606 has been updated by Eric Wong.


 kosaki: any ideas for the lack of guard page for the main thread?
 
 I have one idea, but I hate it: run timer thread function in the main
 thread and have all normal Ruby code in pthread_create-ed threads.
 
 However, long term, I will try to remove the timer thread and GVL
 w/o slowing single-thread case.

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-45957

* Author: Jeremy Evans
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: core
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------

* [ruby-core:92836] [Ruby trunk Bug#9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
       [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
                   ` (11 preceding siblings ...)
  2014-03-27  5:18 ` [ruby-core:61713] " normalperson
@ 2019-05-25  1:06 ` merch-redmine
  12 siblings, 0 replies; 17+ messages in thread
From: merch-redmine @ 2019-05-25  1:06 UTC
  To: ruby-core

Issue #9606 has been updated by jeremyevans0 (Jeremy Evans).

Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN)

I haven't seen `TestException#test_machine_stackoverflow` SIGSEGV in a long time on OpenBSD.  I'm guessing the numerous improvements in the last 5 years mean this is no longer an issue.  Is anyone else seeing `TestException#test_machine_stackoverflow` SIGSEGV in their environment with the master branch or ruby 2.6?  If nobody responds confirming this issue is still present, I'll close this in a few weeks.

----------------------------------------
Bug #9606: Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD
https://bugs.ruby-lang.org/issues/9606#change-78215

* Author: jeremyevans0 (Jeremy Evans)
* Status: Feedback
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-openbsd]
* Backport: 
----------------------------------------

Thread overview: 17+ messages
     [not found] <redmine.issue-9606.20140306222432@ruby-lang.org>
2014-03-06 22:24 ` [ruby-core:61342] [ruby-trunk - Bug #9606] [Open] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD merch-redmine
2014-03-07  1:10 ` [ruby-core:61343] [ruby-trunk - Bug #9606] [Feedback] " nobu
2014-03-07  2:07 ` [ruby-core:61344] [ruby-trunk - Bug #9606] " merch-redmine
2014-03-07  3:33 ` [ruby-core:61345] " nobu
2014-03-07  4:18 ` [ruby-core:61346] " merch-redmine
2014-03-07 21:20 ` [ruby-core:61376] " v.ondruch
2014-03-26  2:37   ` [ruby-core:61686] " Eric Wong
2014-03-26  2:38 ` [ruby-core:61687] " normalperson
2014-03-26  7:41   ` [ruby-core:61691] " Eric Wong
2014-03-26 22:43     ` [ruby-core:61703] " Eric Wong
2014-03-26  3:59 ` [ruby-core:61688] " nobu
2014-03-26  7:48 ` [ruby-core:61692] " normalperson
2014-03-26 22:48 ` [ruby-core:61704] " normalperson
2014-03-27  3:48 ` [ruby-core:61711] " kosaki.motohiro
2014-03-27  5:15   ` [ruby-core:61712] " Eric Wong
2014-03-27  5:18 ` [ruby-core:61713] " normalperson
2019-05-25  1:06 ` [ruby-core:92836] [Ruby trunk Bug#9606] " merch-redmine
