ruby-core@ruby-lang.org archive (unofficial mirror)
From: Eric Wong <normalperson@yhbt.net>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:81030] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
Date: Mon, 8 May 2017 02:56:14 +0000
Message-ID: <20170508025614.GA24763@starla>
In-Reply-To: <38090d10-c6a1-5097-66af-130275d773ea@atdot.net>

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/08 9:33, Eric Wong wrote:
> > I have been thinking about this again; think M:N green Thread is
> > a bad idea[1].  Instead we should improve Fibers to make them
> > easier-to-use for cases where non-blocking I/O is _desirable_
> > (not just _possible_).
> 
> Great. That is exactly what we are thinking. For discussion, let's call
> such auto-scheduling fibers "auto-fiber".

OK.

> > * not enabled by default for compatibility, maybe have:
> >   Fiber.current.auto_schedule = (true|false) # per-fiber
> >   Fiber.auto_schedule = (true|false) # process-wide
> >   But I do not do Ruby API design :P
> 
> Yes. I'm thinking to introduce new constructor like ScheduledFiber.new
> or something like that (maybe that name is not suitable, too). I believe
> we shouldn't change the behavior just after creation.
> 
> BTW, we need to define the behavior of Fiber#resume, Fiber.yield and
> Fiber#transfer for auto-fibers.

I think they should be usable as-is.  Maybe users will want to
manually switch after a certain time period.
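
For reference, a minimal sketch of manual switching with the
existing Fiber API.  Nothing here is auto-fiber specific, and
`run_manually` is just a made-up name for illustration:

```ruby
# Plain Fiber API: the caller decides when the fiber runs.
def run_manually
  log = []
  worker = Fiber.new do
    log << :step1
    Fiber.yield :paused   # hand control back to the caller
    log << :step2
    :done                 # block value is returned by the last resume
  end

  first  = worker.resume  # runs until Fiber.yield
  second = worker.resume  # runs to the end of the block
  [log, first, second, worker.alive?]
end
```

If auto-fibers keep these methods usable as-is, code like this
should continue to behave the same way.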

> * Permit to use them.
> 
> > * Existing native-thread code for blocking IO must (MUST!)
> >   continue blocking w/o GVL as in current 2.4.
> >   Users relying on blocking accept4() (via BasicSocket#accept)
> >   still gets thundering herd protection when sharing listen
> >   socket across multiple processes.
> >   Ditto with UNIXSocket#recv_io when sharing a receiver socket.
> 
> Not sure about this. We need to define I/O blocking operation again for
> this auto-switching fiber (maybe the following your documents define them).

I think it's important users continue to have options and be
able to decide if blocking or non-blocking is better for
their use case.

> > * documented scheduling points:
> > 
> >   TL;DR: most existing "blocking" APIs become Fiber-aware,
> >   similar to 1.8 green threads.
> > 
> >   - IO operations on pipe and sockets inside Fibers with
> >     auto-scheduling enabled automatically become Fiber-aware
> >     and use non-blocking internal interfaces while presenting
> >     a synchronous API:
> 
> Only pipe and sockets?

Maybe some character devices/tty (I have never coded for them).
There's no standard non-blocking I/O for regular files on POSIX.
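
A rough sketch of what "non-blocking inside, synchronous outside"
could look like in pure Ruby today.  `fiber_read` and `run_fibers`
are hypothetical names, and IO.select stands in for the
epoll/kqueue scheduler; this is not the proposed implementation:

```ruby
# Hypothetical helper: looks synchronous to the caller, but yields
# the current Fiber instead of blocking when no data is ready.
def fiber_read(io, maxlen)
  loop do
    res = io.read_nonblock(maxlen, exception: false)
    return res unless res == :wait_readable  # String, or nil at EOF
    Fiber.yield io   # tell the scheduler which IO we are waiting on
  end
end

# Toy scheduler: start every fiber once, then resume a fiber
# whenever the IO it yielded becomes readable.
# (Assumes at most one fiber waits on any given IO.)
def run_fibers(fibers)
  waiting = {}
  fibers.each do |f|
    io = f.resume
    waiting[io] = f if f.alive?
  end
  until waiting.empty?
    ready, = IO.select(waiting.keys)
    ready.each do |io|
      f = waiting.delete(io)
      nxt = f.resume
      waiting[nxt] = f if f.alive?
    end
  end
end
```

The fiber body only ever calls fiber_read(io, n), so the user
code reads like ordinary blocking code.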

> >         IO#read/write/syswrite/sysread/readpartial/gets etc..
> >         IO.copy_stream, IO.select
> >         Socket#connect/accept/sysaccept
> >         UNIXSocket#recv_io/#send_io
> >         IO#wait_*able (in io/wait ext)
> > 
> >   - Ditto for some non-IO things:
> > 
> >         Kernel#sleep
> >         Process.wait/waitpid/waitpid2 family uses WNOHANG
> >         Queue/SizedQueue support, maybe new Fiber::Queue and
> >         Fiber::SizedQueue classes needed?
> 
> Just now, Fiber::Queue is not good idea I think because it is difficult
> to switch Thread::Queue and Fiber::Queue. However, I agree that we need
> to introduce scheduling primitives and Queue is good to use. We need to
> introduce it carefully.

Of course, I don't want to introduce too many new user-visible APIs.
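
As an illustration of the WNOHANG idea for the Process.wait
family, a small sketch.  `fiber_waitpid` and `wait_in_fiber` are
hypothetical helpers, and the sleep-based loop is only a stand-in
for a real scheduler:

```ruby
require 'rbconfig'

# Hypothetical helper: reap +pid+ without blocking the thread;
# yields the current Fiber while the child is still running.
def fiber_waitpid(pid)
  loop do
    reaped, status = Process.waitpid2(pid, Process::WNOHANG)
    return status if reaped        # nil means "not exited yet"
    Fiber.yield :child_running
  end
end

# Stand-in driver: a real scheduler would run other fibers here
# instead of sleeping between polls.
def wait_in_fiber(pid)
  f = Fiber.new { fiber_waitpid(pid) }
  loop do
    res = f.resume
    return res unless f.alive?     # fiber finished: res is the status
    sleep 0.01
  end
end
```

Usage would be something like
wait_in_fiber(spawn(RbConfig.ruby, "-e", "exit 7")), which
returns the child's Process::Status.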

> >   - keep Mutex and ConditionVariable as-is for native Thread
> >     user, I don't believe they are necessary for pure Fiber use.
> >     Maybe add an option for Mutex locks to prevent Fiber.yield
> >     and disable auto-scheduling temporarily?
> 
> I can't understand that. Mutex (and so on) are for Threads. Do they
> need to care about Fibers?

It might make it easier to port existing code and libraries
written for Threads to use Fibers if something like Mutex
can temporarily disable auto-switch.

> >   - IO#open, read-write I/O on filesystem release GVL as usual
> 
> Not sure why they do.

Non-blocking I/O is not possible for regular files on POSIX platforms.
Existing AIO interfaces are always incomplete and inconsistent.
We can try making our own AIO for File, but there will be more
latency than current FS access with native threads.
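
To illustrate the "release GVL as usual" path for regular files,
a sketch that pushes the blocking read onto a native Thread.
`read_file_in_thread` and `demo_file_read` are made-up names, and
the Tempfile is only there to keep the example self-contained:

```ruby
require 'tempfile'

# Regular-file reads stay blocking; run them in a native Thread so
# other Ruby code can keep running while the read is in flight.
def read_file_in_thread(path)
  Thread.new { File.binread(path) }
end

def demo_file_read
  Tempfile.create('auto-fiber') do |tmp|
    tmp.write('x' * 65_536)
    tmp.flush
    reader = read_file_in_thread(tmp.path)
    other  = (1..1000).sum        # unrelated work done meanwhile
    [reader.value.bytesize, other] # Thread#value joins the reader
  end
end
```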

> >   - It will be necessary to use resolv and resolv/replace in
> >     stdlib for Fiber-aware name resolution.
> 
> It seems difficult...

I'm not sure; I guess they have become less maintained
since native threads came in 1.9 for getaddrinfo.
But we can improve them.

> > * Implementation (steps can be done gradually):
> > 
> >   1. new internal IO scheduler using kqueue/epoll/select.  Native
> >      kqueue/epoll allow cross-native-thread operation to share
> >      the event loop, so they only need one new FD per-process.
> >      I want to avoid libev/libevent since (last I checked) they
> >      do not allow sharing an event loop across native threads.
> >      I can write kqueue/epoll/select parts; I guess win32 can use
> >      select until someone else implements something
> > 
> >      Maybe build IO scheduler into current timer thread....
> 
> I planned to run per-thread Fiber scheduler and to use epoll (and so on)
> on the thread because of overhead of cross-thread communication. I think
> we need to compare them (I didn't try it yet).

I don't think there is a benefit to using per-thread epoll/kqueue,
especially with GVL.  epoll and kqueue have internal locking
anyway, so we can avoid adding our own locks around them.

Sharing a single epoll/kqueue FD between dozens/hundreds of
pthreads in cmogstored(*) is no problem at all (but cmogstored
is designed for high-latency rotational disks).

I believe GHC (Glasgow Haskell Compiler) uses a similar design
based on one-shot epoll/kqueue notifications.

> >   2. pipes and sockets get O_NONBLOCK flag set automatically
> >      when created inside Fibers with auto-scheduling set.
> 
> not sure about it.

Maybe there is a compatibility problem when sharing FDs with other
processes.  But we had the reverse change from 1.8 -> 1.9, so
I think it is minor.
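
For anyone wanting to check this by hand, a small sketch using
fcntl on a POSIX system.  `nonblock?` and `set_nonblock` are
hypothetical helper names, not part of any proposal:

```ruby
require 'fcntl'

# True if O_NONBLOCK is set on the open file description.
def nonblock?(io)
  (io.fcntl(Fcntl::F_GETFL) & Fcntl::O_NONBLOCK) != 0
end

# Set O_NONBLOCK while preserving the other status flags.
def set_nonblock(io)
  flags = io.fcntl(Fcntl::F_GETFL)
  io.fcntl(Fcntl::F_SETFL, flags | Fcntl::O_NONBLOCK)
  io
end
```

O_NONBLOCK lives on the open file description, not the
descriptor, so any process sharing the FD sees the flag too;
that is where the compatibility concern comes from.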


(*) git clone git://bogomips.org/cmogstored
    All C99 + Ragel at runtime; the only Ruby is in the test suite
    https://bogomips.org/cmogstored/queues.txt
    https://bogomips.org/cmogstored/design.txt

