ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174
@ 2009-08-25  6:21 Daniel Azuma
  2009-08-25  6:45 ` [ruby-core:25116] " Tanaka Akira
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Daniel Azuma @ 2009-08-25  6:21 UTC
  To: ruby-core

Bug #1993: IO.select fails when called in multiple threads on 1.8.7p174
http://redmine.ruby-lang.org/issues/show/1993

Author: Daniel Azuma
Status: Open, Priority: Normal
Category: core
ruby -v: ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin9.8.0]

IO.select (Kernel#select) fails when run on different sets of IO objects in different threads. This affects release versions 1.8.7p160, 1.8.7p173, and 1.8.7p174. It does NOT seem to affect the recent versions of 1.9.1 that I have tested. It also does NOT affect release version 1.8.7p72. I have not tested any 1.8.6 versions. The repro steps have been tested mostly on Mac OS X 10.5.8 on an Intel-based MacBook Pro. I have, however, seen similar behavior on a recent Fedora Linux i686.

To reproduce, run the following script. (Replace the two filenames with distinct known readable files on your system.)

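 # Begin code
 
 # Replace these with two distinct readable files on your system.
 FILENAME1 = "Rakefile"
 FILENAME2 = "README"
 # Set to false to run the select loop in a single thread.
 TWO_THREADS = true
 
 f1 = File.open(FILENAME1)
 f2 = File.open(FILENAME2)
 # Each thread polls its own file with a zero timeout and prints how many
 # IO objects select reported as readable (expected: 1).
 t1 = Thread.new do
   c1 = 0
   loop do
     c1 += 1
     s1 = IO.select([f1], nil, nil, 0)
     n1 = s1 ? s1.first.size : 0
     puts "t1: num=#{n1} iter=#{c1}"
   end
 end
 t2 = Thread.new do
   c2 = 0
   loop do
     c2 += 1
     s2 = IO.select([f2], nil, nil, 0)
     n2 = s2 ? s2.first.size : 0
     puts "t2: num=#{n2} iter=#{c2}"
   end
 end if TWO_THREADS
 # Both loops run forever; interrupt the script to stop it.
 t1.join
 
 # End code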

The code simply repeatedly calls IO.select on IO objects known to have readable bytes, in either one thread or two. When run in one thread (TWO_THREADS=false), it behaves as expected, printing "num=1" to indicate that select has detected the readable stream. However, when run in two threads (TWO_THREADS=true), both threads print "num=0", indicating that neither thread detects readable data on its stream.
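For checking a given Ruby build without watching endless output, the same idea can be written as a bounded, self-checking variant. This is only a sketch and not part of the original report: the helper names `count_select_misses` and `run_bounded_repro` are hypothetical, and it uses temporary files instead of the Rakefile/README pair so that it does not depend on the current directory.

```ruby
require "tempfile"

# Hypothetical helper: calls IO.select on `io` with a zero timeout
# `iterations` times and returns how many calls reported nothing readable.
def count_select_misses(io, iterations)
  misses = 0
  iterations.times do
    ready = IO.select([io], nil, nil, 0)
    misses += 1 if ready.nil? || ready.first.empty?
  end
  misses
end

# Hypothetical helper: polls two distinct readable files from two threads
# and returns the miss count observed by each thread.
def run_bounded_repro(iterations = 1000)
  files = Array.new(2) do |i|
    tmp = Tempfile.new("select-repro-#{i}")
    tmp.write("readable bytes")
    tmp.flush
    tmp.rewind
    tmp
  end
  threads = files.map { |f| Thread.new { count_select_misses(f, iterations) } }
  threads.map { |t| t.value }
ensure
  files.each { |f| f.close! } if files
end

puts "misses per thread: #{run_bounded_repro.inspect}"
```

On an unaffected interpreter both counts should be zero; on an affected one, the behavior described above would show up as non-zero counts.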

The relevant code appears to be the function rb_thread_schedule() in eval.c, and I believe this issue is related to revision 21165. I haven't been able to untangle everything in this code yet, but here's what I've been able to determine:

* The code that collects file descriptors for the system select() call (lines 11063-11073 of the 1.8.7 branch as of revision 24104) DOES NOT RUN for a given thread unless the thread has a THREAD_STOPPED status at that time (because of line 11051). Therefore, any threads with a THREAD_RUNNABLE status at that time are effectively shut out of receiving select() results unless their fd lists overlap those of other threads.

* It appears that (given the sample code above) the next qualifying thread (that is, the thread that will be assigned to the "next" variable later on) tends to be in the THREAD_RUNNABLE state at this time. Since such threads are shut out of the select() call, they can never be assigned to "th_found" (see lines 11208-11212). As a result, "th_found" is assigned to a later thread in the list, rather than, as appears to be the intent, the first qualifying thread in the list (note the break on line 11214).

* Unfortunately, this conflicts with lines 11230ff. Those lines, which choose the "next" thread, always prefer the first thread given equal priority (line 11231). Since "th_found" tends not to be the first qualifying thread, we have a situation where the conditions on lines 11231 and 11232 are never both true; as a result, th->select_value is never set, and the select calls never succeed.

* The code appeared to work pre-revision-21165 (e.g. 1.8.7p72) because that version of the code set select_value on every qualifying thread, whereas the current code sets it on only one thread.

Here's where I'm unsure about how to proceed with a patch. I would like to move lines 11058 through 11073 to immediately above line 11051. This would add each thread's file descriptors to the select call, regardless of whether the thread has status THREAD_STOPPED or THREAD_RUNNABLE. This change appears to fix the test case above. And I believe it is the correct behavior; however, I'm new to this part of the code and do not have enough understanding of the intent of thread->status to assert that this is correct. I was hoping someone with more knowledge of this area could use this analysis as a starting point.
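The effect of the proposed reordering can be illustrated with a toy model in Ruby. This is a deliberately simplified sketch, not the C code in eval.c: the `ToyThread` struct, the `:stopped`/`:runnable` symbols, and `collect_select_fds` are hypothetical stand-ins for the thread list, the THREAD_STOPPED/THREAD_RUNNABLE statuses, and the fd-collection step described in the analysis.

```ruby
# Toy model only: these names are hypothetical and do not appear in eval.c.
ToyThread = Struct.new(:name, :status, :wait_fds)

# Gathers the descriptors that would be handed to the system select() call.
# With stopped_only = true, threads that are not :stopped are skipped, which
# mirrors the behavior the analysis attributes to the current code. With
# stopped_only = false, every thread contributes its descriptors, which
# mirrors the proposed change.
def collect_select_fds(threads, stopped_only)
  fds = []
  threads.each do |th|
    next if stopped_only && th.status != :stopped
    fds.concat(th.wait_fds)
  end
  fds.uniq.sort
end

threads = [
  ToyThread.new("t1", :runnable, [3]),  # first qualifying thread, runnable
  ToyThread.new("t2", :stopped,  [4])
]

p collect_select_fds(threads, true)   # => [4]
p collect_select_fds(threads, false)  # => [3, 4]
```

In the first call, t1's descriptor never reaches select(), so in this model t1 could never be reported as ready. The second call includes both threads' descriptors.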


----------------------------------------
http://redmine.ruby-lang.org


* [ruby-core:25116] Re: [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174
  2009-08-25  6:21 [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174 Daniel Azuma
@ 2009-08-25  6:45 ` Tanaka Akira
  2009-08-25 16:53 ` [ruby-core:25121] " Daniel Azuma
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Tanaka Akira @ 2009-08-25  6:45 UTC
  To: ruby-core

In article <4a9382cec12f4_212ed99b56c6225@redmine.ruby-lang.org>,
  Daniel Azuma <redmine@ruby-lang.org> writes:

> IO.select (Kernel#select) fails when run on different sets of IO objects in different threads. This affects release versions 1.8.7p160, 1.8.7p173, and 1.8.7p174. It does NOT seem to affect the recent versions of 1.9.1 that I have tested. It also does NOT affect release version 1.8.7p72. I have not tested any 1.8.6 versions. The repro steps have been tested mostly on Mac OS X 10.5.8 on an Intel-based MacBook Pro. I have, however, seen similar behavior on a recent Fedora Linux i686.

The problem is fixed in the latest 1.8 branch.  
-- 
Tanaka Akira


* [ruby-core:25121] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174
  2009-08-25  6:21 [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174 Daniel Azuma
  2009-08-25  6:45 ` [ruby-core:25116] " Tanaka Akira
@ 2009-08-25 16:53 ` Daniel Azuma
  2009-08-25 23:38 ` [ruby-core:25124] " Akira Tanaka
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Daniel Azuma @ 2009-08-25 16:53 UTC
  To: ruby-core

Issue #1993 has been updated by Daniel Azuma.


One other note: this CAN be difficult to reproduce, because you have to catch both threads with an IO.select scheduled at the same time. I find the repro code above pretty consistent on my setup, which is Ruby 1.8.7p174, Mac OS X 10.5.8, on a MacBook Pro with a 2.5 GHz Core 2 Duo. But as with most threading-related issues, YMMV.

Some related links that have been pointed out to me since last night:
* This report is probably a duplicate of http://redmine.ruby-lang.org/issues/show/1484
* It looks like this is getting seen by Capistrano users. See the Capistrano bug report at https://capistrano.lighthouseapp.com/projects/8716/tickets/79-capistrano-hangs-on-shell-command-for-many-computers-on-ruby-186-p368 for some extended discussion.
* I blogged about this at http://www.daniel-azuma.com/blog/view/z2ysbx0e4c3it9/ruby_1_8_7_io_select_threading_bug with a few details on the pure Ruby workaround that I'm using for now. Essentially, use a mutex to prevent multiple threads from calling IO.select at the same time.
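
A minimal sketch of the mutex idea from the last item follows. It is not the code from the blog post: the constant `SELECT_LOCK` and the wrapper `serialized_select` are hypothetical names chosen for illustration, and the sketch assumes every thread in the program goes through the wrapper instead of calling IO.select directly.

```ruby
require "thread"

# Hypothetical process-wide lock shared by all callers of the wrapper.
SELECT_LOCK = Mutex.new

# Hypothetical wrapper: same arguments and return value as IO.select, but
# only one thread at a time is allowed inside the call.
def serialized_select(reads, writes = nil, errors = nil, timeout = nil)
  SELECT_LOCK.synchronize do
    IO.select(reads, writes, errors, timeout)
  end
end

# Usage: two threads polling different pipes through the wrapper.
r1, w1 = IO.pipe
r2, w2 = IO.pipe
w1.write("x")
w2.write("y")

pollers = [r1, r2].map do |io|
  Thread.new { serialized_select([io], nil, nil, 1) }
end
pollers.each { |t| p t.value }
```

Because the lock is held for the whole call, a long or missing timeout in one thread delays every other caller, so short timeouts are the safer choice with this approach.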
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1993


* [ruby-core:25124] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174
  2009-08-25  6:21 [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174 Daniel Azuma
  2009-08-25  6:45 ` [ruby-core:25116] " Tanaka Akira
  2009-08-25 16:53 ` [ruby-core:25121] " Daniel Azuma
@ 2009-08-25 23:38 ` Akira Tanaka
  2009-08-26  1:59 ` [ruby-core:25126] " Daniel Azuma
  2009-09-07 11:27 ` [ruby-core:25459] [Bug #1993](Closed) " Shyouhei Urabe
  4 siblings, 0 replies; 6+ messages in thread
From: Akira Tanaka @ 2009-08-25 23:38 UTC
  To: ruby-core

Issue #1993 has been updated by Akira Tanaka.

Assigned to set to Shyouhei Urabe

backport r24413, r24416, r24442.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1993


* [ruby-core:25126] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174
  2009-08-25  6:21 [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174 Daniel Azuma
                   ` (2 preceding siblings ...)
  2009-08-25 23:38 ` [ruby-core:25124] " Akira Tanaka
@ 2009-08-26  1:59 ` Daniel Azuma
  2009-09-07 11:27 ` [ruby-core:25459] [Bug #1993](Closed) " Shyouhei Urabe
  4 siblings, 0 replies; 6+ messages in thread
From: Daniel Azuma @ 2009-08-26  1:59 UTC
  To: ruby-core

Issue #1993 has been updated by Daniel Azuma.


I ran my tests against r24647 of the ruby_1_8 branch, and it looks like the problem is solved there. Thanks! Looking forward to seeing a 1.8.7 patch.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1993


* [ruby-core:25459] [Bug #1993](Closed) IO.select fails when called in multiple threads on 1.8.7p174
  2009-08-25  6:21 [ruby-core:25114] [Bug #1993] IO.select fails when called in multiple threads on 1.8.7p174 Daniel Azuma
                   ` (3 preceding siblings ...)
  2009-08-26  1:59 ` [ruby-core:25126] " Daniel Azuma
@ 2009-09-07 11:27 ` Shyouhei Urabe
  4 siblings, 0 replies; 6+ messages in thread
From: Shyouhei Urabe @ 2009-09-07 11:27 UTC
  To: ruby-core

Issue #1993 has been updated by Shyouhei Urabe.

Status changed from Open to Closed

Applied in changeset r24783.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1993
