ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: "Eregon (Benoit Daloze)" <noreply@ruby-lang.org>
To: ruby-core@neon.ruby-lang.org
Subject: [ruby-core:110847] [Ruby master Feature#19141] Add thread-owned Monitor to protect thread-local resources
Date: Mon, 21 Nov 2022 22:04:20 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-100205.20221121220419.52325@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-19141.20221121175011.52325@ruby-lang.org

Issue #19141 has been updated by Eregon (Benoit Daloze).


> That means any framework using Monitor is incompatible to be used with Enumerator.

It's only problematic if both the parent Fiber and the Enumerator try to acquire the same Monitor while the other holds it.

> In general, there are many types of thread-local resources (connections for example), so it would make sense to have a thread-owned monitor to protect them. I think few resources are really fiber-owned.

Whether a resource is per Thread or Fiber does not depend on the resource, the resource typically needs to be protected from concurrent access.
What it depends on is whether there is a Fiber scheduler and so whether Fibers are used for concurrency.
For example, a Fiber might hold the resource or lock while transferring to another Fiber, and that can cause big problems for the resource.
And this transfer can happen on any blocking IO call with the Fiber scheduler so basically anywhere.

Based on that I don't think a per-thread Monitor is really a solution, the fibers of that thread might not always be nicely nested or release the monitor after they are done executing, and so they can cause the same problem as a global resource or no Monitor.

Also from an implementation point of view it seems really hard if not impossible to implement a per-Thread monitor on the JVM, where the concurrency unit is java.lang.Thread and the new Loom VirtualThreads inherits from java.lang.Thread and so every lock is "per Fiber" in Java too now.
BTW, Mutex has always been per-Fiber on JRuby and TruffleRuby due to that.
My intuition is locks always need to be "per finest concurrency unit" (but I might be wrong).
OTOH for storage both fiber-local, thread-local and inherited fiber-scoped (#19078) make sense. All of these except fiber-local need their own synchronization otherwise they could be mutated concurrently.

That said, we do need to find solutions to the various issues you linked.
For concurrent-ruby I think the solution is relatively clear based on my last comment.

For https://github.com/rails/rails/issues/45994#issuecomment-1319687582 it seems to me a per-thread Monitor isn't really the right solution.
What if someone creates a Thread inside that transaction, should that also work?
Fibers with a Fiber scheduler are basically similar to threads, except fibers of a given threads don't execute at the same time, but they are a form of concurrency and they can switch at fairly unpredictable places.

I do remember someone mentioning an issue with Enumerator-using-Fiber and transactions in a Rails app. Enumerator-using-Fiber is kind of problematic for that because it doesn't necessarily respect the transaction length and might also e.g. keep the connection longer than expected.

In general I think there are very few use-cases for Enumerator-using-Fiber (which is Enumerator#{peek*,next*} and #zip with non-arrays).
They are expensive and they don't execute on the same stack and so any state they need and which should only be held only for some period is really difficult to manage.
People do use it every now and then though so it'd be nice to find a solution.

----------------------------------------
Feature #19141: Add thread-owned Monitor to protect thread-local resources
https://bugs.ruby-lang.org/issues/19141#change-100205

* Author: wildmaples (Maple Ong)
* Status: Open
* Priority: Normal
----------------------------------------
### Background 

In Ruby v3.0.2, Monitor was modified to be owned by fibers instead of threads [for reasons as described in this issue](https://bugs.ruby-lang.org/issues/17827) and so it is also consistent with Mutex. Before the change to Monitor, Mutex was modified to per-fiber instead of thread ([issue](https://bugs.ruby-lang.org/issues/16792), [PR](https://github.com/ruby/ruby/commit/178c1b0922dc727897d81d7cfe9c97d5ffa97fd9)) which caused some problems (See: [comment](https://bugs.ruby-lang.org/issues/17827#note-1)).  

### Problem

We are now encountering a problem where using Enumerator (implemented transparently using Fiber, so the user is not aware) within a Fiber-owned process, which causes a deadlock. That means any framework using Monitor is incompatible to be used with Enumerator. 

In general, there are many types of thread-local resources (connections for example), so it would make sense to have a thread-owned monitor to protect them. I think few resources are really fiber-owned.

#### Specifics 
* Concurrent Ruby is still designed with per-thread locking, which causes similar incompatibilities. (Read: [issue](https://github.com/ruby-concurrency/concurrent-ruby/issues/962))
* Systems test in Rails implements locking using Monitor, resulting in deadlock in these known cases:
  * when cache clearing (Read: [issue](https://github.com/rails/rails/issues/45994))
  * database transactions when used with Enumerator (Read: [comment](https://github.com/rails/rails/issues/45994#issuecomment-1304306575)) 

### Demo

```ruby
# ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [arm64-darwin21]
# Thread #<Thread:0x000000014a8eb228 demo.rb:8 run>, fiber #<Fiber:0x000000014a8eaf80 (resumed)>, locked true, owned true
# Thread #<Thread:0x000000014a8eb228 demo.rb:8 run>, fiber #<Fiber:0x000000014a8eacb0 demo.rb:13 (resumed)>, locked true, owned true

# ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]
# Thread #<Thread:0x0000000102329a08 demo.rb:8 run>, fiber #<Fiber:0x0000000102329828 (resumed)>, locked true, owned true
# Thread #<Thread:0x0000000102329a08 demo.rb:8 run>, fiber #<Fiber:0x00000001023294e0 demo.rb:13 (resumed)>, locked true, owned false

require 'fiber'
require 'monitor'

puts RUBY_DESCRIPTION

# We have a single monitor - we're pretending it protects some thread-local resources
m = Monitor.new

# We'll create an explicit thread
t = Thread.new do
  # Lock to protect our thread-local resource
  m.enter

  puts "Thread #{Thread.current}, fiber #{Fiber.current}, locked #{m.mon_locked?}, owned #{m.mon_owned?}"

  # The Enumerator here creates a fiber, which runs on the same thread, so would want to use the same thread-local resource
  e = Enumerator.new do |y|
    # In 2.7 this is fine, in 3.0 it's not, as the fiber thinks it doesn't have the lock
    puts "Thread #{Thread.current}, fiber #{Fiber.current}, locked #{m.mon_locked?}, owned #{m.mon_owned?}"
    
    # This would deadlock
    # m.enter

    y.yield 1
  end
  e.next
end

t.join
```

### Possible Solutions

* Allow `Monitor` to be per thread or fiber through a flag
* Having `Thread::Monitor` and `Fiber::Monitor` as two separate classes. Leave `Monitor` as it is right now. However, this may not be possible due to the `Thread::Mutex` alias 

These options would give us more flexibility in which type of Monitor to use. 



-- 
https://bugs.ruby-lang.org/

  parent reply	other threads:[~2022-11-21 22:04 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 17:50 [ruby-core:110843] [Ruby master Feature#19141] Add thread-owned Monitor to protect thread-local resources wildmaples (Maple Ong)
2022-11-21 22:02 ` [ruby-core:110846] " byroot (Jean Boussier)
2022-11-21 22:04 ` Eregon (Benoit Daloze) [this message]
2022-11-22  5:10 ` [ruby-core:110849] " ioquatix (Samuel Williams)
2022-11-22  8:46 ` [ruby-core:110852] " byroot (Jean Boussier)
2022-11-22 11:18 ` [ruby-core:110853] " Eregon (Benoit Daloze)
2022-11-22 11:26 ` [ruby-core:110854] " byroot (Jean Boussier)
2022-11-22 13:40 ` [ruby-core:110856] " Eregon (Benoit Daloze)
2022-11-22 17:24 ` [ruby-core:110858] " chrisseaton (Chris Seaton)
2022-11-23 11:09 ` [ruby-core:110866] " Eregon (Benoit Daloze)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-100205.20221121220419.52325@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    --cc=ruby-core@neon.ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).