ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: "ioquatix (Samuel Williams)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:108553] [Ruby master Bug#18782] Race conditions in autoload when loading the same feature with multiple threads.
Date: Sat, 14 May 2022 23:06:53 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-97593.20220514230653.3344@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-18782.20220514230150.3344@ruby-lang.org

Issue #18782 has been updated by ioquatix (Samuel Williams).

File 0005-Revert-removal-of-non-conditional-xfree.patch added
File 0004-Increase-timeout-for-debug-tests.patch added
File 0003-Protect-race-on-autoload-state.patch added
File 0002-Prevent-race-between-GC-mark-and-autoload-setup.patch added
File 0001-Add-RUBY_VM_CRITICAL_SECTION-for-detecting-unexpecte.patch added

@normalperson I've attached patch files. If you have time, I'd really appreciate your feedback on this. Specifically, patch 5 - freeing this unconditionally causes the Ruby VM to hang in `test-bundler-parallel` and I'm not sure why. Here is a backtrace of a hung process:

```
* thread #1, name = 'ruby', stop reason = signal SIGSTOP
  * frame #0: 0x00007f91a51972cb libc.so.6`__lll_lock_wait_private + 43
    frame #1: 0x00007f91a51ab6e0 libc.so.6`__libc_calloc + 816
    frame #2: 0x00007f91a5542823 ld-linux-x86-64.so.2`_dl_new_object + 115
    frame #3: 0x00007f91a553ded3 ld-linux-x86-64.so.2`_dl_map_object_from_fd + 179
    frame #4: 0x00007f91a553f7ac ld-linux-x86-64.so.2`_dl_map_object + 508
    frame #5: 0x00007f91a55435b9 ld-linux-x86-64.so.2`dl_open_worker_begin + 169
    frame #6: 0x00007f91a5267e18 libc.so.6`_dl_catch_exception + 136
    frame #7: 0x00007f91a5542d7b ld-linux-x86-64.so.2`dl_open_worker + 59
    frame #8: 0x00007f91a5267e18 libc.so.6`_dl_catch_exception + 136
    frame #9: 0x00007f91a554315d ld-linux-x86-64.so.2`_dl_open + 173
    frame #10: 0x00007f91a5268211 libc.so.6`do_dlopen + 65
    frame #11: 0x00007f91a5267e18 libc.so.6`_dl_catch_exception + 136
    frame #12: 0x00007f91a5267ee3 libc.so.6`_dl_catch_error + 51
    frame #13: 0x00007f91a5268170 libc.so.6`dlerror_run + 64
    frame #14: 0x00007f91a526832a libc.so.6`__libc_dlopen_mode + 74
    frame #15: 0x00007f91a521f191 libc.so.6`__libc_unwind_link_get + 81
    frame #16: 0x00007f91a522c867 libc.so.6`__backtrace + 39
    frame #17: 0x000056542d70de17 ruby`rb_vm_bugreport at vm_dump.c:762:13
    frame #18: 0x000056542d7a454a ruby`rb_bug_for_fatal_signal at error.c:822:5
    frame #19: 0x000056542d663c49 ruby`sigsegv(sig=<unavailable>, info=<unavailable>, ctx=<unavailable>) at signal.c:964:5
    frame #20: 0x00007f91a514f560 libc.so.6`__restore_rt
    frame #21: 0x00007f91a51a6ef1 libc.so.6`malloc_consolidate + 257
    frame #22: 0x00007f91a51a8450 libc.so.6`_int_free + 2032
    frame #23: 0x00007f91a51aabe3 libc.so.6`free + 115
    frame #24: 0x00007f91a19482bd libyaml-0.so.2`yaml_parser_delete + 77
    frame #25: 0x00007f91a4c49bb9 psych.so`dealloc(ptr=0x000056542efbf0e0) at psych_parser.c:48:5
    frame #26: 0x000056542d525a2c ruby`finalize_deferred_heap_pages at gc.c:4241:2
    frame #27: 0x000056542d5307f6 ruby`rb_objspace_call_finalizer at gc.c:4427:5
    frame #28: 0x000056542d509de0 ruby`rb_ec_cleanup at eval.c:164:5
    frame #29: 0x000056542d50a65c ruby`ruby_run_node at eval.c:321:12
    frame #30: 0x000056542d5041fa ruby`rb_main(argc=7, argv=0x00007ffeffea7e48) at main.c:47:9
    frame #31: 0x000056542d504230 ruby`main(argc=7, argv=0x00007ffeffea7e48) at main.c:56:10
    frame #32: 0x00007f91a513a310 libc.so.6`__libc_start_call_main + 128
    frame #33: 0x00007f91a513a3c1 libc.so.6`__libc_start_main@@GLIBC_2.34 + 129
    frame #34: 0x000056542d504095 ruby`_start + 37
```

----------------------------------------
Bug #18782: Race conditions in autoload when loading the same feature with multiple threads.
https://bugs.ruby-lang.org/issues/18782#change-97593

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
I have identified several race conditions in the autoload code.

1. It's possible to race on adding and then deleting items in `autoload_featuremap`. When this happens, two threads will try to load the same file with different autoload data and deadlock.
2. When finishing autoload, it's necessary to clear `ele->state` before setting constants. If this is not synchronised, a thread can see the cleared `ele->state` before seeing the constants and assume the constant is not being autoloaded and then fail with `NameError`.

This test case can reproduce both cases:

```
# test.rb
autoload_path = File.join(__dir__, "foobar.rb")
File.write(autoload_path, 'module Foo; end; module Bar; end')

100_000.times do
	$stderr.puts "--------------------"
	autoload :Foo, autoload_path
	autoload :Bar, autoload_path

	t1 = Thread.new {Foo}
	t2 = Thread.new {Bar}

	t1.join
	t2.join

	Object.send(:remove_const, :Foo)
	Object.send(:remove_const, :Bar)

	$LOADED_FEATURES.delete(autoload_path)
end
```

Example failure of case (1):

```
-------------------- (success)
autoload_by_someone_else ele=0x55f33b806a30 ele->state=(nil)
autoload_by_someone_else ele=0x55f33b806a30 ele->state=(nil)
check_autoload_required 2
autoload_by_someone_else ele=0x55f33b806a30 ele->state=0x7fdd678be780
check_autoload_required 4
autoload_by_someone_else ele=0x55f33b806a30 ele->state=0x7fdd678be780
check_autoload_required 4
ele=0x55f33b806a30 ele->state=0x7fdd678be780 = NULL
check_autoload_required 4
-------------------- (failure)
autoload_by_someone_else ele=0x55f33b806a30 ele->state=(nil)
autoload_by_someone_else ele=0x55f33b6e8f40 ele->state=(nil)
check_autoload_required 2
check_autoload_required 3
autoload_by_someone_else ele=0x55f33b806a30 ele->state=0x7fdd6779d780
check_autoload_required 1
autoload_by_someone_else ele=0x55f33b806a30 ele->state=0x7fdd6779d780
check_autoload_required 1
ele=0x55f33b806a30 ele->state=0x7fdd6779d780 = NULL
ele=0x55f33b6e8f40 ele->state=0x7fdd678be780 = NULL
../test.rb:12:in `join': No live threads left. Deadlock? (fatal)
3 threads, 3 sleeps current:0x000055f33b771250 main thread:0x000055f33b66e090
* #<Thread:0x00007fdd6a2cb0b0 sleep_forever>
   rb_thread_t:0x000055f33b66e090 native:0x00007fdd6a71c3c0 int:0
   
* #<Thread:0x00007fdd676e0090 ../test.rb:9 sleep_forever>
   rb_thread_t:0x000055f33b770ff0 native:0x00007fdd6789e640 int:1 mutex:0x000055f33b7c5100 cond:1
    depended by: tb_thread_id:0x000055f33b66e090
   
* #<Thread:0x00007fdd676e1238 ../test.rb:10 sleep_forever>
   rb_thread_t:0x000055f33b771250 native:0x00007fdd679bf640 int:0
   

	from ../test.rb:12:in `block in <main>'
	from ../test.rb:4:in `times'
	from ../test.rb:4:in `<main>'
make: *** [uncommon.mk:1250: runruby] Error 1
```

Example failure of case (2):

```
[0x7f175fe5b0c8] rb_autoload_str mod=Object id=Foo file="/home/samuel/Projects/ioquatix/ruby/foobar.rb"
[0x7f175fe5b0c8] rb_autoload_str const_set mod=Object id=Foo file="/home/samuel/Projects/ioquatix/ruby/foobar.rb"
[0x7f175fe5b0c8] rb_autoload_str mod=Object id=Bar file="/home/samuel/Projects/ioquatix/ruby/foobar.rb"
[0x7f175fe5b0c8] rb_autoload_str const_set mod=Object id=Bar file="/home/samuel/Projects/ioquatix/ruby/foobar.rb"
[0x7f175fe61d88] rb_const_search_from value == Qundef -> autoloading
[0x7f175fe61e78] rb_const_search_from value == Qundef -> autoloading
[0x7f175fe61e78] Assigning constants...
[0x7f175fe61d88] rb_const_search_from value == Qundef -> autoloading
[0x7f175fe61e78] autoload_const_set name=:Foo value=Foo
[0x7f175fe61e78] autoload_const_set name=:Bar value=Bar
#<Thread:0x00007f175fe61d88 ../test.rb:11 run> terminated with exception (report_on_exception is true):
../test.rb:11:in `block (2 levels) in <main>': uninitialized constant Bar (NameError)
../test.rb:11:in `block (2 levels) in <main>': uninitialized constant Bar (NameError)
make: *** [uncommon.mk:1250: runruby] Error 1
```

These failures are very uncommon but it does impact Ruby as far back as 2.7, and probably earlier.

---Files--------------------------------
0001-Add-RUBY_VM_CRITICAL_SECTION-for-detecting-unexpecte.patch (2.07 KB)
0002-Prevent-race-between-GC-mark-and-autoload-setup.patch (1.01 KB)
0003-Protect-race-on-autoload-state.patch (10.6 KB)
0004-Increase-timeout-for-debug-tests.patch (910 Bytes)
0005-Revert-removal-of-non-conditional-xfree.patch (691 Bytes)


-- 
https://bugs.ruby-lang.org/

  reply	other threads:[~2022-05-14 23:06 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-14 23:01 [ruby-core:108552] [Ruby master Bug#18782] Race conditions in autoload when loading the same feature with multiple threads ioquatix (Samuel Williams)
2022-05-14 23:06 ` ioquatix (Samuel Williams) [this message]
2022-05-18 21:17   ` [ruby-core:108614] " Eric Wong
2022-05-15  2:29 ` [ruby-core:108554] " ioquatix (Samuel Williams)
2022-05-15  2:36 ` [ruby-core:108555] " ioquatix (Samuel Williams)
2022-05-15 23:36 ` [ruby-core:108562] " ioquatix (Samuel Williams)
2022-05-17 20:40 ` [ruby-core:108603] " ioquatix (Samuel Williams)
2022-05-18  3:30 ` [ruby-core:108606] " byroot (Jean Boussier)
2022-05-18  7:17 ` [ruby-core:108608] " ioquatix (Samuel Williams)
2022-05-18 14:21 ` [ruby-core:108612] " byroot (Jean Boussier)
2022-09-20 22:01 ` [ruby-core:109965] " ioquatix (Samuel Williams)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-97593.20220514230653.3344@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).