From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 02 Nov 2019 01:22:05 +0000 (UTC)
From: mame@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:95650] [Ruby master Bug#16288] Segmentation fault with finalizers, threads

Issue #16288 has been updated by mame (Yusuke Endoh).

Thank you for the report and the great investigation! I could reproduce the issue by using your example: https://github.com/mainameiz/segfault_app

Currently, starting a thread in a finalizer is dangerous. The interpreter terminates in three steps: (1) kill all threads except the main thread, (2) run all finalizers, and (3) destroy everything (including all mutexes, the main thread, the timer thread, the VM itself, etc.). If a finalizer creates a thread, that thread starts running during or after (3), which leads to a catastrophic situation.

An easy solution is to prohibit thread creation once step (3) has started. The following patch fixes the segfault in your example.

```diff
diff --git a/thread.c b/thread.c
index eff5d39b51..fc609907ef 100644
--- a/thread.c
+++ b/thread.c
@@ -833,6 +833,11 @@ thread_create_core(VALUE thval, VALUE args, VALUE (*fn)(void *))
         rb_raise(rb_eThreadError,
                  "can't start a new thread (frozen ThreadGroup)");
     }
+    if (current_th->vm->main_thread->status == THREAD_KILLED) {
+        rb_warn("can't start a new thread after the main thread has stopped");
+        rb_raise(rb_eThreadError,
+                 "can't start a new thread (the main thread has already terminated)");
+    }
 
     if (fn) {
         th->invoke_type = thread_invoke_type_func;
```

With this patch, your example ends gracefully.

```
$ bundle exec rspec spec/models/user_spec.rb
/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/mongoid-7.0.5/lib/mongoid.rb:104: warning: The last argument is used as the keyword parameter
/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/activesupport-6.0.0/lib/active_support/core_ext/module/delegation.rb:171: warning: for `delegate' defined here
config.eager_load is set to nil. Please update your config/environments/*.rb files accordingly:

  * development - set it to false
  * test - set it to false (unless you use a tool that preloads your test environment)
  * production - set it to true

/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/tzinfo-1.2.5/lib/tzinfo/ruby_core_support.rb:142: warning: The last argument is used as the keyword parameter
/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/tzinfo-1.2.5/lib/tzinfo/ruby_core_support.rb:142: warning: The last argument is used as the keyword parameter
/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/actionpack-6.0.0/lib/action_dispatch/middleware/stack.rb:37: warning: The last argument is used as the keyword parameter
/home/mame/work/ruby/local/lib/ruby/gems/2.7.0/gems/actionpack-6.0.0/lib/action_dispatch/middleware/static.rb:110: warning: for `initialize' defined here
..

Finished in 0.00977 seconds (files took 1.22 seconds to load)
1 example, 0 failures

/home/mame/work/ruby/local/lib/ruby/2.7.0/timeout.rb:85: warning: can't start a new thread on a finalizer
/home/mame/work/ruby/local/lib/ruby/2.7.0/timeout.rb:85: warning: can't start a new thread on a finalizer
```

However, as the last two lines show, Timeout cannot be used safely in a finalizer. I'm unsure if it is acceptable, but to support thread creation in a finalizer, we need to revamp the termination process. @ko1 and @nobu, what do you think?

@davidw I could be wrong as I don't understand your statement about thread_join, but I couldn't see the behavior by running your example under gdb. Anyways, thanks for the great investigation.
It is really helpful.

----------------------------------------
Bug #16288: Segmentation fault with finalizers, threads
https://bugs.ruby-lang.org/issues/16288#change-82438

* Author: davidw (David Welton)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.6.6p116 (2019-10-02 revision 67825) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
Hi,

This is a tricky one and I am still working on narrowing it down, but I will report what I have so far.

I compiled a version of 2_6_6 from github: ruby 2.6.6p116 (2019-10-02 revision 67825) [x86_64-linux]

I have a minimal Rails project that uses Mongoid. It crashes with a segmentation fault when rspec runs.

The concurrent ruby gem is in some way involved, and I have been posting there: https://github.com/ruby-concurrency/concurrent-ruby/issues/808

However, I think there is a deeper problem - I would not expect a user level script to cause a segmentation fault.

I have been putting a lot of debugging statements in, and turned on Thread.DEBUG, and have noticed some things. I am not experienced with Ruby's internals, so some of these bits of data might be normal or irrelevant:

* The concurrent-ruby gem uses ObjectSpace.define_finalizer to set a finalizer
* That finalizer creates a new Thread
* However, it appears as if that thread is running after the main thread is already dead, so code that expects to reference the main thread crashes, because it's a NULL reference.

I tried the following test code:

```
class Foo
  def initialize
    ObjectSpace.define_finalizer(self, proc do
      Foo.foo_finalizer
    end)
  end

  def bar
    puts 'bar'
  end

  def Foo.foo_finalizer
    puts "foo_finalizer"
    t = Thread.new do
      puts "Thread reporting for duty"
    end
    puts "foo_finalizer thread launched"
    sleep 5
  end
end

f = Foo.new
f.bar
f = nil
```

I wrote it while trying to develop a simple test case to demonstrate the problem.
It triggers rb_raise(rb_eThreadError, "can't alloc thread"); in thread_s_new, because it looks like the main thread has already been marked as 'killed' in this case.

When I check the main thread status in thread_s_new with the above code, it reports 'dead'.

When I run my rspec code in the sample Rails project, thread_s_new shows the main thread's status as 'run' even if it should be dead?

I have seen some debugging output that shows exceptions, thread_join interrupts, and so on.

Is it possible that something like this is happening?

The main thread starts doing a cleanup, and gets an exception or something that generates an interrupt, and its KILLED status gets reset to RUNNABLE.

Then, in the finalizer, it starts creating a Thread, but at this point the main thread actually does get killed, and when that finalizer thread tries to run, it runs into a null reference?

I can provide the Rails sample project if need be.

Sorry if any of the above isn't clear; I've been staring at the C code for several hours and am a bit cross-eyed!

Thank you for any insights.

-- 
https://bugs.ruby-lang.org/
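
The shutdown ordering described in the reply implies that a finalizer cannot assume Thread.new will succeed. As a minimal sketch, assuming an interpreter where thread creation during shutdown raises ThreadError (as the proposed patch does, and as the reporter's simple test case already observes in thread_s_new), cleanup code could fall back to running inline. The names run_cleanup and spawner are hypothetical; they are not part of Ruby or concurrent-ruby.

```ruby
# Hypothetical helper (an assumption for illustration, not an existing API):
# try to run `work` on a new thread; if the interpreter refuses to create one
# and raises ThreadError, run `work` on the current thread instead.
# `spawner` is injectable only so the fallback path can be exercised here.
def run_cleanup(spawner: Thread.method(:new), &work)
  spawner.call(&work)   # returns the new Thread on success
rescue ThreadError
  work.call             # fallback: do the work inline
  nil                   # signal that no thread was created
end

# Normal case: a thread is created and runs the block.
done = []
thread = run_cleanup { done << :threaded }
thread.join

# Simulated shutdown case: the spawner refuses, so the block runs inline.
refusing = ->(&_work) { raise ThreadError, "can't start a new thread" }
result = run_cleanup(spawner: refusing) { done << :inline }

p done    # prints [:threaded, :inline]
p result  # prints nil
```

A finalizer registered with ObjectSpace.define_finalizer could call such a helper instead of calling Thread.new directly. This only helps where thread creation fails with an exception; it does not work around the segmentation fault on an interpreter that lets the thread start during teardown.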