ruby-core@ruby-lang.org archive (unofficial mirror)
From: Eric Wong <normalperson@yhbt.net>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:81027] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
Date: Mon, 8 May 2017 00:33:15 +0000	[thread overview]
Message-ID: <20170508003315.GA3789@starla> (raw)
In-Reply-To: <20170403044254.GA16328@starla>

Eric Wong <normalperson@yhbt.net> wrote:
> SASADA Koichi <ko1@atdot.net> wrote:
> > Sorry I can't understand the basic of your idea with mixing Threads and
> > Fibers. Maybe you need to define more about the model.
> 
> Sorry if I wasn't clear.  Basically, I see:
> 
> 	green Threads == Fibers + auto scheduling
> 
> So, making Threads a subclass of Fibers makes sense to me.
> Then, existing (native) threads becomes an own internal class
> only accessible to C Ruby developers; new native threads get
> spawned as-needed (after GVL releases).
> 
> > Our plan is not mixing Threads and Fibers, so that (hopefully) there are
> > no problem.
> 
> OK, I will wait for you and see.

I have been thinking about this again; I think M:N green Threads
are a bad idea[1].  Instead, we should improve Fibers to make them
easier to use for cases where non-blocking I/O is _desirable_
(not just _possible_).

Notes for auto-scheduling Fibers:

* no timeslice or timer-based scheduling.
  I think this will simplify use and avoid race conditions
  compared to Threads.  Things like "numeric += 1" can always
  be atomic with respect to other Fibers in the same native
  Thread.
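
A runnable illustration of that atomicity with today's plain
Fibers: since Fibers only switch where they explicitly yield,
"counter += 1" below never needs a Mutex.  (The round-robin
loop standing in for a scheduler is illustrative only.)

```ruby
# Cooperative Fibers only switch at explicit yield points, so a
# read-modify-write like `counter += 1` cannot be interleaved the
# way it can under preemptive native Threads.
counter = 0

fibers = 3.times.map do
  Fiber.new do
    10.times do
      counter += 1 # atomic w.r.t. other Fibers: no preemption here
      Fiber.yield  # the only place another Fiber may run
    end
  end
end

# toy round-robin "scheduler": resume every live Fiber until done
until fibers.none?(&:alive?)
  fibers.each { |f| f.resume if f.alive? }
end

counter # => 30, deterministically, with no locking
```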

* not enabled by default for compatibility, maybe have:
  Fiber.current.auto_schedule = (true|false) # per-fiber
  Fiber.auto_schedule = (true|false) # process-wide
  But I do not do Ruby API design :P
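
To make that concrete, usage might look like the following.
This is only the strawman naming from above; nothing here is
implemented, and "sock" is a hypothetical auto-scheduled socket:

```ruby
# Strawman only -- none of this API exists today:
Fiber.new do
  Fiber.current.auto_schedule = true # opt this Fiber in
  sock.gets # would yield to other runnable Fibers instead of blocking
end.resume

Fiber.auto_schedule = true # or: opt in process-wide
```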

* Existing native-thread code for blocking IO must (MUST!)
  continue blocking w/o GVL as in current 2.4.
  Users relying on blocking accept4() (via BasicSocket#accept)
  still get thundering-herd protection when sharing a listen
  socket across multiple processes.
  Ditto with UNIXSocket#recv_io when sharing a receiver socket.

* documented scheduling points:

  TL;DR: most existing "blocking" APIs become Fiber-aware,
  similar to 1.8 green threads.

  - IO operations on pipe and sockets inside Fibers with
    auto-scheduling enabled automatically become Fiber-aware
    and use non-blocking internal interfaces while presenting
    a synchronous API:

        IO#read/write/syswrite/sysread/readpartial/gets etc..
        IO.copy_stream, IO.select
        Socket#connect/accept/sysaccept
        UNIXSocket#recv_io/#send_io
        IO#wait_*able (in io/wait ext)

  - Ditto for some non-IO things:

        Kernel#sleep
        Process.wait/waitpid/waitpid2 family uses WNOHANG
        Queue/SizedQueue support, maybe new Fiber::Queue and
        Fiber::SizedQueue classes needed?

  - keep Mutex and ConditionVariable as-is for native Thread
    users; I don't believe they are necessary for pure Fiber use.
    Maybe add an option for Mutex locks to prevent Fiber.yield
    and disable auto-scheduling temporarily?

  - File.open and read/write I/O on the filesystem release the
    GVL as usual

  - It will be necessary to use resolv and resolv/replace in
    stdlib for Fiber-aware name resolution.
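
What the above would automate can be sketched by hand today: a
Fiber wraps read_nonblock and yields to its caller whenever the
FD is not ready, while the code inside the Fiber keeps a
synchronous shape.  Here the caller plays scheduler; names and
structure are illustrative, not the proposed internals:

```ruby
r, w = IO.pipe

reader = Fiber.new do
  begin
    r.read_nonblock(64) # non-blocking internally...
  rescue IO::WaitReadable
    Fiber.yield(:wait_readable) # ...but just "blocks" from the Fiber's view
    retry
  end
end

result = reader.resume # pipe is empty, the Fiber reports it must wait
# => :wait_readable
w.write("hello")
IO.select([r])         # a real scheduler would multiplex many FDs here
result = reader.resume # the Fiber retries the read and completes
# => "hello"
```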

* Implementation (steps can be done gradually):

  1. new internal IO scheduler using kqueue/epoll/select.  Native
     kqueue/epoll allow cross-native-thread operation to share
     the event loop, so they only need one new FD per-process.
     I want to avoid libev/libevent since (last I checked) they
     do not allow sharing an event loop across native threads.
     I can write the kqueue/epoll/select parts; I guess win32 can
     use select until someone else implements something better.

     Maybe build the IO scheduler into the current timer thread...

  2. pipes and sockets get O_NONBLOCK flag set automatically
     when created inside Fibers with auto-scheduling set.

  3. rb_wait_for_single_fd() can use the new IO scheduler and
     becomes Fiber-aware; ditto for rb_thread_fd_select...

     Steps 2 and 3 should make most IO changes transparent.

  4. make necessary changes to Process.wait*, IO.select,
     Kernel.sleep
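
A pure-Ruby sketch of the scheduler in step 1, using IO.select in
place of kqueue/epoll: Fibers that would block register their FD
and yield, and the event loop resumes whichever Fiber's FD became
ready.  All names here are illustrative; the real thing would live
in C behind rb_wait_for_single_fd:

```ruby
waiting = {} # IO => the Fiber blocked on it
results = {}

r1, w1 = IO.pipe
r2, w2 = IO.pipe

fibers = { a: r1, b: r2 }.map do |name, io|
  Fiber.new do
    data = begin
      io.read_nonblock(64)
    rescue IO::WaitReadable
      waiting[io] = Fiber.current # register interest in this FD...
      Fiber.yield                 # ...and let the scheduler run others
      retry
    end
    results[name] = data
  end
end

fibers.each(&:resume) # both Fibers run until they must wait

w2.write("two")
w1.write("one")

# The event loop: wait for readiness, resume the owning Fiber.
until waiting.empty?
  ready, = IO.select(waiting.keys)
  ready.each { |io| waiting.delete(io).resume }
end

results # => {:a=>"one", :b=>"two"}
```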



Side note: I consider making Fibers migratable across native
Threads out-of-scope for this.  We currently use
makecontext/swapcontext (FIBER_USE_NATIVE) for speed (which,
according to comments in cont.c, is significant).  I am not
sure we can keep FIBER_USE_NATIVE if we allow Fibers to
migrate across native Threads.


[1] The general problem with Threads: timeslice scheduling
    leads to unpredictability, so things like Mutex and
    ConditionVariable become necessary.

    M:N will be problematic: it will be difficult for users
    to know when it is safe to use heavy native threads for
    blocking operations and when their threads will be
    lightweight, making it difficult to design apps to use
    each appropriately.

    However, native 1:1 Threads will always be useful for cases
    where users can take advantage of blocking I/O
    (#recv_io/#accept/File.open/...) as well as releasing GVL
    for CPU-intensive operations independent of Ruby VM.

Thanks for reading; I wrote most of this while waiting for
tests for r58604 to run before committing.

