ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:80531] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
       [not found] ` <8a2b82e3-dc07-1945-55f9-5a474e89130b@ruby-lang.org>
@ 2017-04-02  2:35   ` Eric Wong
  2017-04-02  3:05     ` [ruby-core:80532] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-04-02  2:35 UTC (permalink / raw)
  To: SASADA Koichi; +Cc: Ruby developers

SASADA Koichi <ko1@ruby-lang.org> wrote:
> Hi Eric,
> 
> I agree it is possible to use an M:N model.
> 
> There are several problems, mainly because of C extensions (and the
> libraries which C extensions use).
> 
> 1. We can't move an execution context across native threads in
>    userland, because some libraries may use thread-local variables
>    tied to the native thread system, and some libraries may depend
>    on the C stack layout.

Hi ko1, thank you for the response.

Correct, I'm not sure if this can be changed while maintaining
compatibility.  Anyway, I think I am fine with this limitation
where a Fiber is always tied to a particular native thread.

> 2. Some libraries can stop threads because of uncontrollable
>    I/O operations (waiting for I/O (network) in system calls),
>    uncontrollable system synchronization (mutexes, semaphores, ...),
>    or big computations (like mathematical computation).
> 
> The current 1:1 thread model does not have such issues (if C extensions
> release the GVL correctly), so we employ it.
> However, the overhead of thread creation and thread switching is high,
> as you say.
> 
> However, issues 1 and 2 are *possible* issues. We don't know which
> libraries have a problem. If we only use managed C extensions, there is
> no problem using M:N mode.

2 is tricky.  I think in the worst case (no modifications to existing C
exts or the API), M:N will degrade to the current 1:1 model, which
retains 100% compatibility with current Ruby 1.9/2.x code.

> For Ruby 3, Matz wants to encourage such fine-grained context
> switching. We discussed this before and planned to introduce automatic
> Fiber switching at specific I/O operations. It is a small version of
> your proposal and one possibility (and it is easier than a complete
> M:N model). I'll change Fiber context management to lightweight
> switching for Ruby 2.5 and try it in later versions.

Cool!  I was thinking along the same lines.  I think the Ruby Thread
class can become a subclass of Fiber with automatic switching.

However, to spawn native threads:

If a Thread uses the existing GVL release C API, then the _next_
Thread.new call will create a native thread (and future Thread.new
calls will create Fiber-based threads in the new native thread).

So, in pseudo code:

  class Thread < Fiber
    def self.new
      case Thread.current[:gvl_state]
      when :none
        # default
        super # M += 1
      when :released
        # :released is set by the BLOCKING_REGION GVL release;
        # only allow a user-level thread to spawn one new native thread
        Thread.current[:gvl_state] = :spawned
        NativeThread.new { Thread.new } # N += 1
      when :spawned
        # We already spawned one native thread from this user-level
        # thread, so only spawn a new user-level thread for now.
        super # M += 1
      end
    end
  end

Current GVL release operations will change
Thread.current[:gvl_state] from :none -> :released


* [ruby-core:80532] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-04-02  2:35   ` [ruby-core:80531] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip] Eric Wong
@ 2017-04-02  3:05     ` SASADA Koichi
  2017-04-03  4:42       ` [ruby-core:80540] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-04-02  3:05 UTC (permalink / raw)
  To: Ruby developers

On 2017/04/02 11:35, Eric Wong wrote:
> However, to spawn native threads:
> 
> If a Thread uses the existing GVL release C API, then the _next_
> Thread.new call will create a native thread (and future Thread.new
> calls will create Fiber-based threads in the new native thread).
> 
> So, in pseudo code:
> 
>   class Thread < Fiber
>     def self.new
>       case Thread.current[:gvl_state]
>       when :none
>         # default
>         super # M += 1
>       when :released
>         # :released is set by the BLOCKING_REGION GVL release;
>         # only allow a user-level thread to spawn one new native thread
>         Thread.current[:gvl_state] = :spawned
>         NativeThread.new { Thread.new } # N += 1
>       when :spawned
>         # We already spawned one native thread from this user-level
>         # thread, so only spawn a new user-level thread for now.
>         super # M += 1
>       end
>     end
>   end
> 
> Current GVL release operations will change
> Thread.current[:gvl_state] from :none -> :released

Sorry, I can't understand the basics of your idea of mixing Threads and
Fibers. Maybe you need to define the model in more detail.

Our plan does not mix Threads and Fibers, so (hopefully) there is
no problem.

Thanks,
Koichi

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:80540] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-04-02  3:05     ` [ruby-core:80532] " SASADA Koichi
@ 2017-04-03  4:42       ` Eric Wong
  2017-05-08  0:33         ` [ruby-core:81027] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-04-03  4:42 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/04/02 11:35, Eric Wong wrote:
> > However, to spawn native threads:
> > 
> > If a Thread uses the existing GVL release C API, then the _next_
> > Thread.new call will create a native thread (and future Thread.new
> > calls will create Fiber-based threads in the new native thread).
> > 
> > So, in pseudo code:
> > 
> >   class Thread < Fiber
> >     def self.new
> >       case Thread.current[:gvl_state]
> >       when :none
> >         # default
> >         super # M += 1
> >       when :released
> >         # :released is set by the BLOCKING_REGION GVL release;
> >         # only allow a user-level thread to spawn one new native thread
> >         Thread.current[:gvl_state] = :spawned
> >         NativeThread.new { Thread.new } # N += 1
> >       when :spawned
> >         # We already spawned one native thread from this user-level
> >         # thread, so only spawn a new user-level thread for now.
> >         super # M += 1
> >       end
> >     end
> >   end
> > 
> > Current GVL release operations will change
> > Thread.current[:gvl_state] from :none -> :released
> 
> Sorry, I can't understand the basics of your idea of mixing Threads and
> Fibers. Maybe you need to define the model in more detail.

Sorry if I wasn't clear.  Basically, I see:

	green Threads == Fibers + auto scheduling

So, making Threads a subclass of Fibers makes sense to me.
Then, existing (native) threads become an internal class
only accessible to CRuby developers; new native threads get
spawned as needed (after GVL releases).
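
To illustrate with a toy model (not the real internals; these
fibers yield unconditionally where real ones would only yield
when an operation blocks):

  ready = []
  3.times do
    ready << Fiber.new do
      5.times { Fiber.yield } # stand-in for blocking operations
    end
  end
  # the "auto scheduling" part: resume runnable fibers until done
  ready.delete_if { |f| f.resume; !f.alive? } until ready.empty?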

> Our plan does not mix Threads and Fibers, so (hopefully) there is
> no problem.

OK, I will wait for you and see.


* [ruby-core:81027] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-04-03  4:42       ` [ruby-core:80540] " Eric Wong
@ 2017-05-08  0:33         ` Eric Wong
  2017-05-08  1:53           ` [ruby-core:81028] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-08  0:33 UTC (permalink / raw)
  To: ruby-core

Eric Wong <normalperson@yhbt.net> wrote:
> SASADA Koichi <ko1@atdot.net> wrote:
> > Sorry, I can't understand the basics of your idea of mixing Threads and
> > Fibers. Maybe you need to define the model in more detail.
> 
> Sorry if I wasn't clear.  Basically, I see:
> 
> 	green Threads == Fibers + auto scheduling
> 
> So, making Threads a subclass of Fibers makes sense to me.
> Then, existing (native) threads become an internal class
> only accessible to CRuby developers; new native threads get
> spawned as needed (after GVL releases).
> 
> > Our plan does not mix Threads and Fibers, so (hopefully) there is
> > no problem.
> 
> OK, I will wait for you and see.

I have been thinking about this again; I think M:N green Threads are
a bad idea [1].  Instead we should improve Fibers to make them
easier to use for cases where non-blocking I/O is _desirable_
(not just _possible_).

Notes for auto-scheduling Fibers:

* no timeslice or timer-based scheduling.
  I think this will simplify use and avoid race conditions
  compared to Threads.  Things like "numeric += 1" can always
  be atomic with respect to other Fibers in the same native
  Thread.
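
  For example, this needs no Mutex between Fibers in one native
  Thread, since switches can only happen at Fiber.yield or (with
  auto-scheduling) at blocking I/O:

    counter = 0
    fibers = 10.times.map do
      Fiber.new { 1000.times { counter += 1 } } # no I/O, no yield
    end
    fibers.each(&:resume)
    counter # => 10000, always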

* not enabled by default for compatibility, maybe have:
  Fiber.current.auto_schedule = (true|false) # per-fiber
  Fiber.auto_schedule = (true|false) # process-wide
  But I do not do Ruby API design :P

* Existing native-thread code for blocking IO must (MUST!)
  continue blocking w/o GVL as in current 2.4.
  Users relying on blocking accept4() (via BasicSocket#accept)
  still get thundering herd protection when sharing a listen
  socket across multiple processes.
  Ditto with UNIXSocket#recv_io when sharing a receiver socket.

* documented scheduling points:

  TL;DR: most existing "blocking" APIs become Fiber-aware,
  similar to 1.8 green threads.

  - IO operations on pipe and sockets inside Fibers with
    auto-scheduling enabled automatically become Fiber-aware
    and use non-blocking internal interfaces while presenting
    a synchronous API:

        IO#read/write/syswrite/sysread/readpartial/gets etc..
        IO.copy_stream, IO.select
        Socket#connect/accept/sysaccept
        UNIXSocket#recv_io/#send_io
        IO#wait_*able (in io/wait ext)

  - Ditto for some non-IO things:

        Kernel#sleep
        Process.wait/waitpid/waitpid2 family uses WNOHANG
        Queue/SizedQueue support, maybe new Fiber::Queue and
        Fiber::SizedQueue classes needed?

  - keep Mutex and ConditionVariable as-is for native Thread
    user, I don't believe they are necessary for pure Fiber use.
    Maybe add an option for Mutex locks to prevent Fiber.yield
    and disable auto-scheduling temporarily?

  - IO#open, read-write I/O on filesystem release GVL as usual

  - It will be necessary to use resolv and resolv/replace in
    stdlib for Fiber-aware name resolution.
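
  In other words, the internal read path becomes something like
  this (Scheduler is a hypothetical name for the internal IO
  scheduler described below):

    def fiber_aware_read(io, maxlen)
      io.read_nonblock(maxlen)
    rescue IO::WaitReadable
      Scheduler.current.wait_readable(io) # parks us via Fiber.yield
      retry
    end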

* Implementation (steps can be done gradually):

  1. new internal IO scheduler using kqueue/epoll/select.  Native
     kqueue/epoll allow cross-native-thread operation to share
     the event loop, so they only need one new FD per-process.
     I want to avoid libev/libevent since (last I checked) they
     do not allow sharing an event loop across native threads.
     I can write kqueue/epoll/select parts; I guess win32 can use
     select until someone else implements something

     Maybe build IO scheduler into current timer thread....

  2. pipes and sockets get O_NONBLOCK flag set automatically
     when created inside Fibers with auto-scheduling set.

  3. rb_wait_single_fd can use new IO scheduler and becomes
     Fiber-aware, ditto with rb_thread_fd_select...

     Steps 2 and 3 should make most IO changes transparent.

  4. make necessary changes to Process.wait*, IO.select,
     Kernel.sleep
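
  Going back to step 1, a minimal Ruby model of the select()
  fallback (the real scheduler would live in C inside the VM;
  this only shows the shape of the event loop):

    class IOScheduler
      def initialize
        @readers = {} # IO => Fiber waiting for readability
      end

      def wait_readable(io)
        @readers[io] = Fiber.current
        Fiber.yield # resumed by #run when io is readable
      end

      def run
        until @readers.empty?
          ready, = IO.select(@readers.keys)
          ready.each { |io| @readers.delete(io).resume }
        end
      end
    end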



Side note: I consider making Fibers migratable across native
Threads out-of-scope for this.  We currently use
makecontext/swapcontext (FIBER_USE_NATIVE) for speed (which
according to cont.c comments is significant).  I am not
sure if we can keep FIBER_USE_NATIVE if allowing Fibers
to migrate across native threads.


[1] general problem with threads:
    timeslice scheduling leads to unpredictability,
    so things like Mutex/ConditionVariable become necessary.

    M:N will be problematic, as it will be difficult for
    users to know when it is safe to use heavy native threads
    for blocking operations and when their threads will be
    lightweight; making it difficult to design apps to use
    each appropriately.

    However, native 1:1 Threads will always be useful for cases
    where users can take advantage of blocking I/O
    (#recv_io/#accept/File.open/...) as well as releasing GVL
    for CPU-intensive operations independent of Ruby VM.

Thanks for reading; I wrote most of this while waiting for
tests for r58604 to run before committing.


* [ruby-core:81028] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  0:33         ` [ruby-core:81027] " Eric Wong
@ 2017-05-08  1:53           ` SASADA Koichi
  2017-05-08  2:16             ` [ruby-core:81029] " SASADA Koichi
  2017-05-08  2:56             ` [ruby-core:81030] " Eric Wong
  0 siblings, 2 replies; 24+ messages in thread
From: SASADA Koichi @ 2017-05-08  1:53 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/08 9:33, Eric Wong wrote:
> I have been thinking about this again; I think M:N green Threads are
> a bad idea [1].  Instead we should improve Fibers to make them
> easier to use for cases where non-blocking I/O is _desirable_
> (not just _possible_).

Great. That is exactly what we are thinking. For discussion, let's call
such auto-scheduled fibers "auto-fibers".

> * not enabled by default for compatibility, maybe have:
>   Fiber.current.auto_schedule = (true|false) # per-fiber
>   Fiber.auto_schedule = (true|false) # process-wide
>   But I do not do Ruby API design :P

Yes. I'm thinking to introduce new constructor like ScheduledFiber.new
or something like that (maybe that name is not suitable, too). I believe
we shouldn't change the behavior just after creation.

BTW, we need to define the behavior of Fiber#resume, Fiber.yield and
Fiber#transfer for auto-fibers.

* Permit using them.

> * Existing native-thread code for blocking IO must (MUST!)
>   continue blocking w/o GVL as in current 2.4.
>   Users relying on blocking accept4() (via BasicSocket#accept)
>   still get thundering herd protection when sharing a listen
>   socket across multiple processes.
>   Ditto with UNIXSocket#recv_io when sharing a receiver socket.

Not sure about this. We need to define I/O blocking operations again for
these auto-switching fibers (maybe the following parts of your document
define them).

> * documented scheduling points:
> 
>   TL;DR: most existing "blocking" APIs become Fiber-aware,
>   similar to 1.8 green threads.
> 
>   - IO operations on pipe and sockets inside Fibers with
>     auto-scheduling enabled automatically become Fiber-aware
>     and use non-blocking internal interfaces while presenting
>     a synchronous API:

Only pipes and sockets?

>         IO#read/write/syswrite/sysread/readpartial/gets etc..
>         IO.copy_stream, IO.select
>         Socket#connect/accept/sysaccept
>         UNIXSocket#recv_io/#send_io
>         IO#wait_*able (in io/wait ext)
> 
>   - Ditto for some non-IO things:
> 
>         Kernel#sleep
>         Process.wait/waitpid/waitpid2 family uses WNOHANG
>         Queue/SizedQueue support, maybe new Fiber::Queue and
>         Fiber::SizedQueue classes needed?

Right now, I think Fiber::Queue is not a good idea because it is
difficult to switch between Thread::Queue and Fiber::Queue. However, I
agree that we need to introduce scheduling primitives, and Queue is good
to use. We need to introduce it carefully.

>   - keep Mutex and ConditionVariable as-is for native Thread
>     user, I don't believe they are necessary for pure Fiber use.
>     Maybe add an option for Mutex locks to prevent Fiber.yield
>     and disable auto-scheduling temporarily?

I can't understand that. Mutex (and so on) are for Threads. Do they
need to care about Fibers?

>   - IO#open, read-write I/O on filesystem release GVL as usual

Not sure why they do.

>   - It will be necessary to use resolv and resolv/replace in
>     stdlib for Fiber-aware name resolution.

It seems difficult...

> * Implementation (steps can be done gradually):
> 
>   1. new internal IO scheduler using kqueue/epoll/select.  Native
>      kqueue/epoll allow cross-native-thread operation to share
>      the event loop, so they only need one new FD per-process.
>      I want to avoid libev/libevent since (last I checked) they
>      do not allow sharing an event loop across native threads.
>      I can write kqueue/epoll/select parts; I guess win32 can use
>      select until someone else implements something
> 
>      Maybe build IO scheduler into current timer thread....

I planned to run a per-thread Fiber scheduler and to use epoll (and so
on) on each thread because of the overhead of cross-thread
communication. I think we need to compare the two approaches (I haven't
tried it yet).

>   2. pipes and sockets get O_NONBLOCK flag set automatically
>      when created inside Fibers with auto-scheduling set.

not sure about it.

>   3. rb_wait_single_fd can use new IO scheduler and becomes
>      Fiber-aware, ditto with rb_thread_fd_select...
> 
>      Steps 2 and 3 should make most IO changes transparent.
> 
>   4. make necessary changes to Process.wait*, IO.select,
>      Kernel.sleep
> 
> 
> 
> Side note: I consider making Fibers migratable across native
> Threads out-of-scope for this.  We currently use
> makecontext/swapcontext (FIBER_USE_NATIVE) for speed (which
> according to cont.c comments is significant).  I am not
> sure if we can keep FIBER_USE_NATIVE if allowing Fibers
> to migrate across native threads.
> 
> 
> [1] general problem with threads:
>     timeslice scheduling leads to unpredictability,
>     so things like Mutex/ConditionVariable become necessary.
> 
>     M:N will be problematic, as it will be difficult for
>     users to know when it is safe to use heavy native threads
>     for blocking operations and when their threads will be
>     lightweight; making it difficult to design apps to use
>     each appropriately.
> 
>     However, native 1:1 Threads will always be useful for cases
>     where users can take advantage of blocking I/O
>     (#recv_io/#accept/File.open/...) as well as releasing GVL
>     for CPU-intensive operations independent of Ruby VM.
> 
> Thanks for reading; I wrote most of this while waiting for
> tests for r58604 to run before committing.


-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81029] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  1:53           ` [ruby-core:81028] " SASADA Koichi
@ 2017-05-08  2:16             ` SASADA Koichi
  2017-05-08  3:01               ` [ruby-core:81031] " Eric Wong
  2017-05-08  2:56             ` [ruby-core:81030] " Eric Wong
  1 sibling, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-08  2:16 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/08 10:53, SASADA Koichi wrote:
> Yes. I'm thinking to introduce new constructor like ScheduledFiber.new
> or something like that (maybe that name is not suitable, too). I believe
> we shouldn't change the behavior just after creation.

`Fiber.scheduler.add_auto_fiber{ ... }` (naming is not fixed) (operation
for scheduler) is my first idea.

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81030] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  1:53           ` [ruby-core:81028] " SASADA Koichi
  2017-05-08  2:16             ` [ruby-core:81029] " SASADA Koichi
@ 2017-05-08  2:56             ` Eric Wong
  1 sibling, 0 replies; 24+ messages in thread
From: Eric Wong @ 2017-05-08  2:56 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/08 9:33, Eric Wong wrote:
> > I have been thinking about this again; I think M:N green Threads are
> > a bad idea [1].  Instead we should improve Fibers to make them
> > easier to use for cases where non-blocking I/O is _desirable_
> > (not just _possible_).
> 
> Great. That is exactly what we are thinking. For discussion, let's call
> such auto-scheduled fibers "auto-fibers".

OK.

> > * not enabled by default for compatibility, maybe have:
> >   Fiber.current.auto_schedule = (true|false) # per-fiber
> >   Fiber.auto_schedule = (true|false) # process-wide
> >   But I do not do Ruby API design :P
> 
> Yes. I'm thinking to introduce new constructor like ScheduledFiber.new
> or something like that (maybe that name is not suitable, too). I believe
> we shouldn't change the behavior just after creation.
> 
> BTW, we need to define the behavior of Fiber#resume, Fiber.yield and
> Fiber#transfer for auto-fibers.

I think they should be usable as-is.  Maybe users will want to
manually switch after a certain time period.

> * Permit using them.
> 
> > * Existing native-thread code for blocking IO must (MUST!)
> >   continue blocking w/o GVL as in current 2.4.
> >   Users relying on blocking accept4() (via BasicSocket#accept)
> >   still get thundering herd protection when sharing a listen
> >   socket across multiple processes.
> >   Ditto with UNIXSocket#recv_io when sharing a receiver socket.
> 
> Not sure about this. We need to define I/O blocking operations again for
> these auto-switching fibers (maybe the following parts of your document
> define them).

I think it's important users continue to have options and be
able to decide if blocking or non-blocking is better for
their use case.

> > * documented scheduling points:
> > 
> >   TL;DR: most existing "blocking" APIs become Fiber-aware,
> >   similar to 1.8 green threads.
> > 
> >   - IO operations on pipe and sockets inside Fibers with
> >     auto-scheduling enabled automatically become Fiber-aware
> >     and use non-blocking internal interfaces while presenting
> >     a synchronous API:
> 
> Only pipes and sockets?

Maybe some character devices/tty (I have never coded for them).
There's no standard non-blocking I/O for regular files on POSIX.

> >         IO#read/write/syswrite/sysread/readpartial/gets etc..
> >         IO.copy_stream, IO.select
> >         Socket#connect/accept/sysaccept
> >         UNIXSocket#recv_io/#send_io
> >         IO#wait_*able (in io/wait ext)
> > 
> >   - Ditto for some non-IO things:
> > 
> >         Kernel#sleep
> >         Process.wait/waitpid/waitpid2 family uses WNOHANG
> >         Queue/SizedQueue support, maybe new Fiber::Queue and
> >         Fiber::SizedQueue classes needed?
> 
> Right now, I think Fiber::Queue is not a good idea because it is
> difficult to switch between Thread::Queue and Fiber::Queue. However, I
> agree that we need to introduce scheduling primitives, and Queue is good
> to use. We need to introduce it carefully.

Of course, I don't want to introduce too many new user-visible APIs.

> >   - keep Mutex and ConditionVariable as-is for native Thread
> >     user, I don't believe they are necessary for pure Fiber use.
> >     Maybe add an option for Mutex locks to prevent Fiber.yield
> >     and disable auto-scheduling temporarily?
> 
> I can't understand that. Mutex (and so on) are for Threads. Do they
> need to care about Fibers?

It might make it easier to port existing code and libraries
written for Threads to use Fibers if something like Mutex
can temporarily disable auto-switch.

> >   - IO#open, read-write I/O on filesystem release GVL as usual
> 
> Not sure why they do.

Non-blocking I/O on regular files is not possible on POSIX platforms.
Existing AIO interfaces are always incomplete and inconsistent.
We can try making our own AIO for File, but there would be more
latency than current FS access with native threads.

> >   - It will be necessary to use resolv and resolv/replace in
> >     stdlib for Fiber-aware name resolution.
> 
> It seems difficult...

I'm not sure; I guess they have become less maintained nowadays,
since native threads came in 1.9 for getaddrinfo.
But we can improve them.

> > * Implementation (steps can be done gradually):
> > 
> >   1. new internal IO scheduler using kqueue/epoll/select.  Native
> >      kqueue/epoll allow cross-native-thread operation to share
> >      the event loop, so they only need one new FD per-process.
> >      I want to avoid libev/libevent since (last I checked) they
> >      do not allow sharing an event loop across native threads.
> >      I can write kqueue/epoll/select parts; I guess win32 can use
> >      select until someone else implements something
> > 
> >      Maybe build IO scheduler into current timer thread....
> 
> I planned to run a per-thread Fiber scheduler and to use epoll (and so
> on) on each thread because of the overhead of cross-thread
> communication. I think we need to compare the two approaches (I haven't
> tried it yet).

I don't think using per-thread epoll/kqueue will be beneficial,
especially with the GVL.  epoll and kqueue have internal locking
anyway, so we can avoid adding our own locks around them.

Sharing a single epoll/kqueue FD between dozens/hundreds of
pthreads in cmogstored(*) is no problem at all (but cmogstored
is designed for high-latency rotational disks).

I believe GHC (Glasgow Haskell Compiler) uses a similar design
based on one-shot epoll/kqueue notifications.
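
Rough shape of that design (Epoll here is a made-up wrapper for
illustration; the real thing is epoll_ctl/epoll_wait with
EPOLLONESHOT in C):

  EP = Epoll.new # one instance for the whole process

  # register interest in io on behalf of fiber; ONESHOT disarms
  # the FD after one event, so exactly one thread sees it until
  # the handler re-arms it:
  def watch(io, fiber)
    EP.add(io, Epoll::IN | Epoll::ONESHOT, fiber)
  end

  # any native thread may run this loop safely:
  def event_loop
    loop do
      EP.wait.each { |io, fiber| fiber.resume } # handlers re-arm via watch
    end
  end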

> >   2. pipes and sockets get O_NONBLOCK flag set automatically
> >      when created inside Fibers with auto-scheduling set.
> 
> not sure about it.

Maybe there is a compatibility problem when sharing FDs with other
processes.  But we had the reverse change from 1.8 -> 1.9;
I think it is minor.


(*) git clone git://bogomips.org/cmogstored
    All C99 + Ragel at runtime; only Ruby is in the test suite
    https://bogomips.org/cmogstored/queues.txt
    https://bogomips.org/cmogstored/design.txt


* [ruby-core:81031] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  2:16             ` [ruby-core:81029] " SASADA Koichi
@ 2017-05-08  3:01               ` Eric Wong
  2017-05-08  3:42                 ` [ruby-core:81033] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-08  3:01 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/08 10:53, SASADA Koichi wrote:
> > Yes. I'm thinking to introduce new constructor like ScheduledFiber.new
> > or something like that (maybe that name is not suitable, too). I believe
> > we shouldn't change the behavior just after creation.
> 
> `Fiber.scheduler.add_auto_fiber{ ... }` (naming is not fixed) (operation
> for scheduler) is my first idea.

Too verbose, I think.  If I want to type more, I would not be
using Ruby :)   How about adding a kwarg to Fiber.new?

	Fiber.new(auto: true) { ... }


* [ruby-core:81033] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  3:01               ` [ruby-core:81031] " Eric Wong
@ 2017-05-08  3:42                 ` SASADA Koichi
  2017-05-08  6:36                   ` [ruby-core:81035] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-08  3:42 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/08 12:01, Eric Wong wrote:
>>> .
>> `Fiber.scheduler.add_auto_fiber{ ... }` (naming is not fixed) (operation
>> for scheduler) is my first idea.
> Too verbose, I think.  If I want to type more, I would not be
> using Ruby :)   How about adding a kwarg to Fiber.new?

This design introduces a new aspect: can we make schedulers per-thread?

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81035] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  3:42                 ` [ruby-core:81033] " SASADA Koichi
@ 2017-05-08  6:36                   ` Eric Wong
  2017-05-09  2:18                     ` [ruby-core:81042] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-08  6:36 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/08 12:01, Eric Wong wrote:
> >>> .
> >> `Fiber.scheduler.add_auto_fiber{ ... }` (naming is not fixed) (operation
> >> for scheduler) is my first idea.
> > Too verbose, I think.  If I want to type more, I would not be
> > using Ruby :)   How about adding a kwarg to Fiber.new?
> 
> This design introduces a new aspect: can we make schedulers per-thread?

Maybe; if we can avoid GVL and introduce more parallelism.

However, I think having one epoll/kqueue FD is better for a
whole process; maybe one epoll/kqueue per-core (not per-thread)
at maximum.

I can easily imagine Ruby doing 100 native threads in one process
(8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.


Side note: First, I would like to make fibers smaller.
Right now rb_fiber_t stores all of the rb_thread_t
struct, but not all fields get used.  I started to work on
splitting out to a new struct rb_thread_context_t earlier:

	https://80x24.org/spew/20170508040753.24975-1-e@80x24.org/raw
	(incomplete, I will work on it some more tomorrow)

The end goal is to avoid storing all of rb_thread_t inside
rb_context_t/rb_fiber_t; and only store rb_thread_context_t.
That should reduce memory overhead and maybe make switching
faster.

Also, I think we can use uint32_t (or even a uint16_t count of
4096-byte pages, which still allows stacks up to 256 MiB) to store
stack sizes.  Using a 64-bit size_t to represent a stack size is
excessive.


* [ruby-core:81042] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-08  6:36                   ` [ruby-core:81035] " Eric Wong
@ 2017-05-09  2:18                     ` SASADA Koichi
  2017-05-09  3:38                       ` [ruby-core:81044] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-09  2:18 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/08 15:36, Eric Wong wrote:
> Maybe; if we can avoid GVL and introduce more parallelism.
>
> However, I think having one epoll/kqueue FD is better for a
> whole process; maybe one epoll/kqueue per-core (not per-thread)
> at maximum.
>
> I can easily imagine Ruby doing 100 native threads in one process
> (8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.

Could you elaborate more? Are 100 epoll threads not effective?
Honestly, I have no experience using epoll/kqueue.


# context switching to another topic

> Side note: First, I would like to make fibers smaller.
> Right now rb_fiber_t stores all of the rb_thread_t
> struct, but not all fields get used.  I started to work on
> splitting out to a new struct rb_thread_context_t earlier:
...
> The end goal is to avoid storing all of rb_thread_t inside
> rb_context_t/rb_fiber_t; and only store rb_thread_context_t.
> That should reduce memory overhead and maybe make switching
> faster.

This is the Ruby 2.5 (2017) goal I proposed to my company. If you do
it, that's great (and I will have achieved one of my job goals :)).

My plan is almost the same, but I want to introduce something like
`mrb_state`, which is passed to all mruby functions as the first
argument.

Do you want to commit this patch before your final goal (lightweight
fiber switching)?


FYI: my plan.

(1) Make separate execution context and make fiber switching lightweight
  * before 2.5
  * You named it `rb_thread_context_t`, but I don't want the name
    `thread`, so I planned to name it `execution_context` or so
    (a bit longer)
(2-1: extend Fiber) Add Fiber scheduler like you are thinking.
(2-2: toward Guild) Add `execution_context` as first argument to
    all C APIs
  * to keep compatibility, we need to introduce a new prefix `rbX_...`
    for new APIs which receive the first argument.
  * On mruby, `mrb_state` is passed to all APIs. We need to consider
    the passed information carefully.

Thanks,
Koichi

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81044] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  2:18                     ` [ruby-core:81042] " SASADA Koichi
@ 2017-05-09  3:38                       ` Eric Wong
  2017-05-09  4:11                         ` [ruby-core:81045] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-09  3:38 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/08 15:36, Eric Wong wrote:
> > Maybe; if we can avoid GVL and introduce more parallelism.
> >
> > However, I think having one epoll/kqueue FD is better for a
> > whole process; maybe one epoll/kqueue per-core (not per-thread)
> > at maximum.
> >
> > I can easily imagine Ruby doing 100 native threads in one process
> > (8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.
> 
> Could you elaborate more? Are 100 epoll threads not effective?
> Honestly, I have no experience using epoll/kqueue.

100 epoll FDs is a waste of FDs, especially since it is common
to have a 1024 FD limit.  I already feel bad about the timer thread
taking up two FDs, but maybe epoll/kevent can reduce that.

In the kernel, every "struct eventpoll" + "struct file" in
Linux is at least 400 bytes of unswappable kernel memory.

Anyway, I've contributed bugfixes both to epoll in the Linux
kernel and to libkqueue (a userspace emulation lib),
and I use them both in several projects in and outside of Ruby.

> # context switching to another topic
> 
> > Side note: First, I would like to make fibers smaller.
> > Right now rb_fiber_t stores all of the rb_thread_t
> > struct, but not all fields get used.  I started to work on
> > splitting out to a new struct rb_thread_context_t earlier:
> ...
> > The end goal is to avoid storing all of rb_thread_t inside
> > rb_context_t/rb_fiber_t; and only store rb_thread_context_t.
> > That should reduce memory overhead and maybe make switching
> > faster.
> 
> This is the Ruby 2.5 (2017) goal I proposed to my company. If you do
> it, that's great (and I will have achieved one of my job goals :)).
> 
> My plan is almost the same, but I want to introduce something like
> `mrb_state`, which is passed to all mruby functions as the first
> argument.
> 
> Do you want to commit this patch before your final goal (lightweight
> fiber switching)?
> 
> 
> FYI: my plan.
> 
> (1) Make separate execution context and make fiber switching lightweight
>   * before 2.5
>   * You named it `rb_thread_context_t`, but I don't want the name
>     `thread`, so I planned to name it `execution_context` or so
>     (a bit longer)

OK, I can rename my work-in-progress patch with
s/rb_thread_context_t/rb_execution_context_t/ and commit
later tonight.

> (2-1: extend Fiber) Add Fiber scheduler like you are thinking.
> (2-2: toward Guild) Add `execution_context` as first argument to
>     all C APIs
>   * to keep compatibility, we need to introduce a new prefix `rbX_...`
>     for new APIs which receive the first argument.
>   * On mruby, `mrb_state` is passed to all APIs. We need to consider
>     the passed information carefully.

OK, that sounds good.

Also, can you take a look at implementing [Feature #13434]
"better method definition in C API" for rbX_*?
You are more knowledgeable in the VM + method definition area,
while I can work on the epoll/kqueue parts for I/O scheduling.

(but I will need to schedule my own human time to work on Ruby :)


* [ruby-core:81045] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  3:38                       ` [ruby-core:81044] " Eric Wong
@ 2017-05-09  4:11                         ` SASADA Koichi
  2017-05-09  5:12                           ` [ruby-core:81047] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-09  4:11 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/09 12:38, Eric Wong wrote:
> 100 epoll FDs is a waste of FDs, especially since it is common
> to have a 1024 FD limit.  I already feel bad about the timer thread
> taking up two FDs, but maybe epoll/kevent can reduce that.

The 1024 soft limit and 4096 hard limit are an issue. However, if we employ

> I can easily imagine Ruby doing 100 native threads in one process
> (8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.

20000-30000 fibers, it is also a problem if they have corresponding
FDs. So I think people will increase this limit up to 65K, won't they?

> In the kernel, every "struct eventpoll" + "struct file" in
> Linux is at least 400 bytes of unswappable kernel memory.

400B * 100 = 40KB. Is that a problem? I have no knowledge to evaluate
this size (10 pages seems not so small, I guess).


> OK, I can rename my work-in-progress patch with
> s/rb_thread_context_t/rb_execution_context_t/ and commit
> later tonight.

Ah, that was my plan too, and I'm not sure what a suitable name is (I
always spend a long time on naming problems). But if you don't feel
it's weird, please use execution_context (ec).

Do you want to commit your patch into trunk immediately and change them
for "(2-1: extend Fiber)" later?  Another way is to make "(2-1: extend
Fiber)" first (in another branch or git repository) and commit it. The
latter can reduce total patch size.

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81047] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  4:11                         ` [ruby-core:81045] " SASADA Koichi
@ 2017-05-09  5:12                           ` Eric Wong
  2017-05-09  5:47                             ` [ruby-core:81049] " SASADA Koichi
  2017-05-09  5:54                             ` [ruby-core:81050] " SASADA Koichi
  0 siblings, 2 replies; 24+ messages in thread
From: Eric Wong @ 2017-05-09  5:12 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/09 12:38, Eric Wong wrote:
> > 100 epoll FDs is a waste of FDs, especially since it is common
> > to have a 1024 FD limit.  I already feel bad about the timer thread
> > taking up two FDs, but maybe epoll/kevent can reduce that.
> 
> The 1024 soft limit and 4096 hard limit are an issue. However, if we employ
> 
> > I can easily imagine Ruby doing 100 native threads in one process
> > (8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.
> 
> 20000-30000 fibers, it is also a problem if they have corresponding
> FDs. So I think people will increase this limit up to 65K, won't they?

Yes, for people that run 20000-30000 fibers, maybe it is not a
problem to have 100 epoll FDs...

However, for existing apps like puma, webrick and net/http-based
scripts: they can spawn dozens/hundreds of threads and only use
one socket per thread.  It is a waste to use epoll/kqueue to
watch a small number of FDs per thread (ppoll is more appropriate
for watching a single FD).

On the contrary; software like nginx and cmogstored watch
thousands of FDs with a single epoll|kqueue FD.

> > In the kernel, every "struct eventpoll" + "struct file" in
> > Linux is at least 400 bytes of unswappable kernel memory.
> 
> 400B * 100 = 40KB. Is that a problem? I have no knowledge to evaluate
> this size (10 pages seems not so small, I guess).

I'd rather not use that much memory, and save wherever possible.

> > OK, I can rename my work-in-progress patch with
> > s/rb_thread_context_t/rb_execution_context_t/ and commit
> > later tonight.
> 
> Ah, that was my plan too, and I'm not sure what a suitable name is (I
> always spend a long time on naming problems). But if you don't feel
> it's weird, please use execution_context (ec).

OK, I committed as r58614

> Do you want to commit your patch into trunk immediately and change them
> for "(2-1: extend Fiber)" later?  Another way is to make "(2-1: extend
> Fiber)" first (in another branch or git repository) and commit it. The
> latter can reduce total patch size.

OK, I will work on implementing epoll/kqueue support late this
week or weekend.  I will also keep a select() fallback for
portability to systems w/o epoll|kqueue.


* [ruby-core:81049] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  5:12                           ` [ruby-core:81047] " Eric Wong
@ 2017-05-09  5:47                             ` SASADA Koichi
  2017-05-09  6:23                               ` [ruby-core:81053] " Eric Wong
  2017-05-09  5:54                             ` [ruby-core:81050] " SASADA Koichi
  1 sibling, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-09  5:47 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/09 14:12, Eric Wong wrote:
>> Do you want to commit your patch into trunk immediately and change them
>> for "(2-1: extend Fiber)" later?  Another way is to make "(2-1: extend
>> Fiber)" first (in another branch or git repository) and commit it. The
>> latter can reduce total patch size.
> OK, I will work on implementing epoll/kqueue support late this
> week or weekend.  I will also keep a select() fallback for
> portability to systems w/o epoll|kqueue.

I think making lightweight switching has higher priority. But we need to
pre-evaluate the feature and API specifications.

Anyway, "your patch" in my comment means the patch you committed at
r58614, and you already committed it.

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81050] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  5:12                           ` [ruby-core:81047] " Eric Wong
  2017-05-09  5:47                             ` [ruby-core:81049] " SASADA Koichi
@ 2017-05-09  5:54                             ` SASADA Koichi
  2017-05-09  6:15                               ` [ruby-core:81052] " Eric Wong
  1 sibling, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-09  5:54 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/09 14:12, Eric Wong wrote:
> SASADA Koichi <ko1@atdot.net> wrote:
>> On 2017/05/09 12:38, Eric Wong wrote:
>>> 100 epoll FDs is a waste of FDs, especially since it is common
>>> to have a 1024 FD limit.  I already feel bad about the timer thread
>>> taking up two FDs, but maybe epoll/kevent can reduce that.
>> The 1024 soft limit and 4096 hard limit are an issue. However, if we employ
>>
>>> I can easily imagine Ruby doing 100 native threads in one process
>>> (8 cores, 10-20 rotational disks, 2 SSD), but 20000-30000 fibers.
>> 20000-30000 fibers, it is also a problem if they have corresponding
>> FDs. So I think people will increase this limit up to 65K, won't they?
> Yes, for people that run 20000-30000 fibers, maybe it is not a
> problem to have 100 epoll FDs...
> 
> However, for existing apps like puma, webrick and net/http-based
> scripts: they can spawn dozens/hundreds of threads and only use
> one socket per thread.  It is a waste to use epoll/kqueue to
> watch a small number of FDs per thread (ppoll is more appropriate
> for watching a single FD).

I see. 1000 fds -> 500 fds (with per-thread epoll) is bad.

> On the contrary; software like nginx and cmogstored watch
> thousands of FDs with a single epoll|kqueue FD.
> 
>>> In the kernel, every "struct eventpoll" + "struct file" in
>>> Linux is at least 400 bytes of unswappable kernel memory.
>> 400B * 100 = 40KB. Is that a problem? I have no knowledge to evaluate
>> this size (10 pages seems not so small, I guess).
> I'd rather not use that much memory, and save wherever possible.

On the other hand, aggressive I/O requests can conflict in a
multi-threaded app. But current Ruby threads don't run in parallel, so
it seems to be no problem (hopefully). It could cause problems on
parallel-running Guilds (but those are not available now).

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81052] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  5:54                             ` [ruby-core:81050] " SASADA Koichi
@ 2017-05-09  6:15                               ` Eric Wong
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Wong @ 2017-05-09  6:15 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On the other hand, aggressive I/O requests can conflict in a
> multi-threaded app. But current Ruby threads don't run in parallel, so
> it seems to be no problem (hopefully). It could cause problems on
> parallel-running Guilds (but those are not available now).

Yes, it's pointless to have so many epoll|kqueue with GVL.

I think Guilds:(epoll|kqueue):nprocessors can be a 1:1:1
relationship.  I think memcached is similar in that way with
threads:event_loops:nprocessors.

nginx and cmogstored both allow using multiple processes for
multiple event loops.

Guild won't get around FD allocation contention in processes,
right?  In other words, guilds will still be implemented with
separate native threads in the same process, but with separate
objspaces?


* [ruby-core:81053] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  5:47                             ` [ruby-core:81049] " SASADA Koichi
@ 2017-05-09  6:23                               ` Eric Wong
  2017-05-09  6:44                                 ` [ruby-core:81054] " SASADA Koichi
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-09  6:23 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> I think making lightweight switching has higher priority. But we need to
> pre-evaluate the feature and API specifications.

OK.  I also started to work on making GVL switches and the remaining
native mutexes/condvars faster on Linux by using futexes.  However, it
is only faster with multi-core; single-core performance is a little
slower.

	https://80x24.org/spew/20170509062022.4413-1-e@80x24.org/raw
	(I still use my Pentium-M laptop from 2005 :)

> Anyway, "your patch" in my comment means the patch you committed at
> r58614, and you already committed it.

OK, also r58615.  Hopefully I didn't break any other platforms :x


* [ruby-core:81054] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  6:23                               ` [ruby-core:81053] " Eric Wong
@ 2017-05-09  6:44                                 ` SASADA Koichi
  2017-05-09 18:51                                   ` [ruby-core:81078] " Eric Wong
  2017-06-20 19:16                                   ` [ruby-core:81733] " Eric Wong
  0 siblings, 2 replies; 24+ messages in thread
From: SASADA Koichi @ 2017-05-09  6:44 UTC (permalink / raw)
  To: Ruby developers

On 2017/05/09 15:23, Eric Wong wrote:
> OK.  I also started to work on making GVL switches and the remaining
> native mutexes/condvars faster on Linux by using futexes.  However, it
> is only faster with multi-core; single-core performance is a little
> slower.
> 
> 	https://80x24.org/spew/20170509062022.4413-1-e@80x24.org/raw
> 	(I still use my Pentium-M laptop from 2005 :)

Let us clarify your plan.

Maybe we have several tasks.

(1) lightweight fiber switching by pointer-exchange
    (w/o copying context).
(2) auto-fiber switching
   (2-1) implement with epoll/kqueue/select
   (2-2) design APIs to use it
(3) Implement GVL with futex (in your comment)
(4) Re-implement Queue (some days ago you wrote)

(please add your plan if you have others)

Do you have a schedule? (priority?)

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81078] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  6:44                                 ` [ruby-core:81054] " SASADA Koichi
@ 2017-05-09 18:51                                   ` Eric Wong
  2017-05-10  3:24                                     ` [ruby-core:81083] " SASADA Koichi
  2017-06-20 19:16                                   ` [ruby-core:81733] " Eric Wong
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-09 18:51 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> On 2017/05/09 15:23, Eric Wong wrote:
> > OK.  I also started to work on making GVL switches and the remaining
> > native mutexes/condvars faster on Linux by using futexes.  However, it
> > is only faster with multi-core; single-core performance is a little
> > slower.
> > 
> > 	https://80x24.org/spew/20170509062022.4413-1-e@80x24.org/raw
> > 	(I still use my Pentium-M laptop from 2005 :)
> 
> Let us clarify your plan.
> 
> Maybe we have several tasks.
> 
> (1) lightweight fiber switching by pointer-exchange
>     (w/o copying context).

Out of all tasks here, I am least familiar with this (1).
This will be a learning experience for me.

> (2) auto-fiber switching
>    (2-1) implement with epoll/kqueue/select
>    (2-2) design APIs to use it

I think I will start on the select implementation first for
portability, but model our internal API around epoll(*).
I will probably implement epoll support last, since I am
most familiar with it.

(*) with current GVL, I expect our kqueue+kevent implementation
    will be faster than epoll in most cases (the API requires
    fewer syscalls).  select might be fastest with few FDs.

> (3) Implement GVL with futex (in your comment)

Maybe last.  Linux-only, single (but most important)
platform; and the single CPU regression needs to be fixed.

> (4) Re-implement Queue (some days ago you wrote)

I already had some work-in-progress patches I can clean up and
send out to redmine for review later (also ConditionVariable).
Last I remember, there was a small performance regression for
small Queue/Condvar waiter lists due to better locality on embed
structs.  However, I think avoiding O(n) rb_ary_delete behavior
is more important for busy queues.

> (please add your plan if you have others)

I might break out thread.c and io.c into smaller files
(select/epoll/kqueue/timer_thread/copy_stream/...)
to make code organization easier.

> Do you have a schedule? (priority?)

I don't know how long (1) will take, (4) is almost done.
(2) maybe 1-3 weeks.  (3) not sure how long it will take
to fix single CPU performance.

Also, I will try not to break platforms I don't use.  As you
know I only use Free Software, so I would appreciate if you or
platform maintainers can help fix portability bugs on non-Free
systems.

Do you have any deadlines or priorities?


* [ruby-core:81083] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09 18:51                                   ` [ruby-core:81078] " Eric Wong
@ 2017-05-10  3:24                                     ` SASADA Koichi
  2017-05-10 10:04                                       ` [ruby-core:81089] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: SASADA Koichi @ 2017-05-10  3:24 UTC (permalink / raw)
  To: Ruby developers

Thank you.

(quote it first)
> Do you have any deadlines or priorities?

No. I want to make your priorities clear, to avoid conflicts with mine.

On 2017/05/10 3:51, Eric Wong wrote:
>> (1) lightweight fiber switching by pointer-exchange
>>     (w/o copying context).
> 
> Out of all tasks here, I am least familiar with this (1).
> This will be a learning experience for me.
> 
>> (2) auto-fiber switching
>>    (2-1) implement with epoll/kqueue/select
>>    (2-2) design APIs to use it
> 
> I think I will start on the select implementation first for
> portability, but model our internal API around epoll(*).
> I will probably implement epoll support last, since I am
> most familiar with it.
> 
> (*) with current GVL, I expect our kqueue+kevent implementation
>     will be faster than epoll in most cases (the API requires
>     fewer syscalls).  select might be fastest with few FDs.

(1) and (2) are independent, so we can do them in parallel.

Do you want to try (1) first, or can I try (1)? Yes, doing (1) is a
good way to learn the core internals, but (1) may affect many places in
the VM. So I want to try it. Anyway, we should make a new ticket and
discuss it there.


I guess it is not so easy to design APIs for (2).

* We need to survey other languages

* We need to define blocking operations:
  * blocking operations which can switch Fibers automatically
    (I/O reads, ...)
  * blocking operations which cannot switch Fibers automatically
    (operations not manageable by epoll/kqueue/select,
     extra C exts providing blocking operations, ...)

  And how do we convey this difference to users?
  * Idea: Documentation
    * example: POSIX signal-safe functions
    * example: Java's thread-safety
    Generally, it is hard to use because users should check
    them carefully (and usually people don't).
  * Idea: Provide new APIs which support auto-fibers
    (while other blocking operations don't support them)
    * example: EventMachine, ... (other language example? Python?)
    * it is clear for users.
    * it is hard to import existing code
  * Idea: Provide a new TracePoint probe
    to detect blocking operations which do not support auto-fibers.
    * This idea is for advanced users to check their scheduling
    * I think it is enough because
      * advanced users should be production makers;
        automatic tools are preferable.
      * non-advanced users don't care which operations can stop
        forever w/o auto-fiber switching
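
    A sketch of such a probe (the :blocking_operation event is
    hypothetical; nothing like it exists today):

      tp = TracePoint.new(:blocking_operation) do |t|
        warn "would stall auto-fibers: #{t.method_id} at #{t.path}:#{t.lineno}"
      end
      tp.enable { run_app } # run_app stands for the user's code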

* We need to define auto-Fiber constructor

* ...

Ah, I remember that we also have (2'): providing an epoll/kqueue-like
Ruby interface. (2) uses them in the scheduler internally, and only
auto-fibers use it. However, someone may want to use them directly and
write their own scheduler (like the nodejs culture). I'm not sure we
should expose such an interface, but it is worth considering. If we
decide to provide such APIs, we need to share the implementation (or
shouldn't we?). Furthermore, it is easier to provide such APIs compared
with providing auto-fibers.



>> (3) Implement GVL with futex (in your comment)
> 
> Maybe last.  Linux-only, single (but most important)
> platform; and the single CPU regression needs to be fixed.

Cool.

>> (4) Re-implement Queue (some days ago you wrote)
> 
> I already had some work-in-progress patches I can clean up and
> send out to redmine for review later (also ConditionVariable).
> Last I remember, there was a small performance regression for
> small Queue/Condvar waiter lists due to better locality on embed
> structs.  However, I think avoiding O(n) rb_ary_delete behavior
> is more important for busy queues.

OK.

> 
>> (please add your plan if you have others)
> 
> I might break out thread.c and io.c into smaller files
> (select/epoll/kqueue/timer_thread/copy_stream/...)
> to make code organization easier.

Not sure we can do it for io.c.
Please ask someone else.

> Also, I will try not to break platforms I don't use.  As you
> know I only use Free Software, so I would appreciate if you or
> platform maintainers can help fix portability bugs on non-Free
> systems.

Absolutely.

-- 
// SASADA Koichi at atdot dot net


* [ruby-core:81089] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-10  3:24                                     ` [ruby-core:81083] " SASADA Koichi
@ 2017-05-10 10:04                                       ` Eric Wong
  2017-05-19  4:34                                         ` [ruby-core:81244] " Eric Wong
  0 siblings, 1 reply; 24+ messages in thread
From: Eric Wong @ 2017-05-10 10:04 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> Thank you.
> 
> (quote it first)
> > Do you have any deadlines or priorities?
> 
> No. I want to make your priorities clear, to avoid conflicts with mine.

OK, good to know.

> On 2017/05/10 3:51, Eric Wong wrote:
> >> (1) lightweight fiber switching by pointer-exchange
> >>     (w/o copying context).
> > 
> > Out of all tasks here, I am least familiar with this (1).
> > This will be a learning experience for me.
> > 
> >> (2) auto-fiber switching
> >>    (2-1) implement with epoll/kqueue/select
> >>    (2-2) design APIs to use it
> > 
> > I think I will start on the select implementation first for
> > portability, but model our internal API around epoll(*).
> > I will probably implement epoll support last, since I am
> > most familiar with it.
> > 
> > (*) with current GVL, I expect our kqueue+kevent implementation
> >     will be faster than epoll in most cases (the API requires
> >     fewer syscalls).  select might be fastest with few FDs.
> 
> (1) and (2) are independent, so we can work on them in parallel.
> 
> Do you want to try (1) first, or can I try it? Yes, doing (1) would
> be a good way to learn the core internals, but (1) may affect many
> places in the VM, so I want to try it myself. Anyway, we should make
> a new ticket and discuss it there.

You should do (1) first, you are the VM expert :)

> I guess it is not so easy to design the APIs for (2).

Let's keep changes internal to the C API and experiment first.
Start with modifying rb_wait_for_single_fd() and
rb_waitpid() to be auto-Fiber-aware.  They will register an event
watcher and call Fiber.yield instead of releasing the GVL to sleep
while waiting.

Later, we can modify rb_thread_fd_select() and rb_thread_sleep*()
and maybe others.
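As a Ruby-level model of that flow (the real change is in C; the
WAITING hash and wait_readable below are stand-ins for the internal
watcher list and rb_wait_for_single_fd):

    WAITING = {}   # IO => Fiber parked on it

    def wait_readable(io)   # models an auto-Fiber-aware wait
      WAITING[io] = Fiber.current
      Fiber.yield           # switch away instead of sleeping in the kernel
    end

    r, w = IO.pipe
    f = Fiber.new { wait_readable(r); puts r.gets }
    f.resume                # runs until the fiber parks itself
    w.puts "hello"; w.close

    ready, = IO.select(WAITING.keys)   # the scheduler's job
    ready.each { |io| WAITING.delete(io).resume }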

> * We need to survey other languages

I will study the GHC IO manager; I think it is similar to my
vision of using EV_ONESHOT/EPOLLONESHOT with multi-core support:
http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf
(and also similar to what I used for cmogstored)

I do not know the Haskell language, so I will need to study it some.

> * We need to define blocking operations:
>   * blocking operations that can switch Fibers automatically
>     (I/O read, ...)
>   * blocking operations that can NOT switch Fibers automatically
>     (operations that epoll/kqueue/select cannot manage,
>      blocking operations provided by external C exts, ...)

Basically, I want auto-Fiber to behave like 1.8 green threads,
but without timer-based switching.   A Fiber switch should only
happen when an operation cannot proceed (I/O, waitpid, sleep,
etc.), or when the user calls Fiber.yield.
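To make the rule concrete with today's plain Fibers (auto-Fiber would
only add implicit switch points at blocking calls):

    f = Fiber.new do
      100_000.times { }   # pure computation: never preempted
      Fiber.yield         # the only switch point in this fiber
      puts "resumed"
    end
    f.resume              # runs the whole computation, then parks
    f.resume              # => "resumed"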

>   And how do we convey this difference to users?
>   * Idea: Documentation
>     * example: POSIX signal-safe functions
>     * example: Java's thread-safety
>     Generally, this is hard to use because users must read the
>     documentation carefully (and usually people don't).
>   * Idea: Provide new APIs which support auto-fibers
>     (while other blocking operations remain unsupported)
>     * example: EventMachine, ... (examples from other languages? Python?)
>     * it is clear for users.
>     * it is hard to port existing code.

Exactly, new APIs will take more time to adopt.  I don't think
it is necessary to introduce new IO APIs.  Currently, users
expect a Thread switch when doing blocking IO (GVL release);
it should be easy to understand an auto-Fiber switch when IO
would block (like 1.8 Threads).

Also, there is a NeverBlock RubyGem which made Fibers automatic
(like 1.8 threads), but development stopped years ago.

Ideally, I want existing code to be able to use net/* in stdlib
(and similar) with minimal modification: s/Thread.new/auto-Fiber.new/
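Something like the following, where Fiber.start stands for the
auto-Fiber constructor proposed later in this thread (hypothetical;
not in any released Ruby):

    require 'net/http'
    require 'uri'

    uri = URI("http://example.com/")

    # today: one native thread per request
    t = Thread.new { Net::HTTP.get(uri) }
    body = t.value

    # hoped-for: same shape, auto-Fiber instead of Thread
    f = Fiber.start { Net::HTTP.get(uri) }   # hypothetical
    body = f.value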

>   * Idea: Provide a new TracePoint probe
>     to detect blocking operations which do not support auto-fibers.
>     * This idea lets advanced users check their scheduling.
>     * I think it is enough because
>       * advanced users are likely building products;
>         automatic tools are preferable for them.
>       * non-advanced users don't care which operations can block
>         forever w/o auto-fiber switching.

Yes, we can add this once the auto-switch is implemented :)
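For concreteness, such a probe might look like this (the
:blocking_operation event name is invented; no such event exists):

    # hypothetical: fires when an operation blocks without auto-switching
    tp = TracePoint.new(:blocking_operation) do |t|
      warn "blocking call at #{t.path}:#{t.lineno}"
    end
    tp.enable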

> * We need to define the auto-Fiber constructor

Perhaps that is a job for matz :)

> * ...
> 
> Ah, I remember we also have (2'): providing an epoll/kqueue-like Ruby
> interface. In (2), the scheduler uses epoll/kqueue internally and only
> auto-fibers use it. However, some people will want to use these
> primitives to write their own scheduler (like the nodejs culture). I'm
> not sure we should expose such an interface, but it is worth
> considering. If we decide to provide such APIs, we need to share the
> implementation (or shouldn't we?). Furthermore, it is easier to provide
> such APIs than to provide auto-fibers.

I don't think exposing a new API is necessary yet.  I prefer we
focus on internal implementation changes first, and expose
user-visible changes later.

<snip>
 
> >> (4) Re-implement Queue (some days ago you wrote)
> > 
> > I already had some work-in-progress patches I can clean up and
> > send out to redmine for review later (also ConditionVariable).
> > Last I remember, there was a small performance regression for
> > small Queue/Condvar waiter lists, since embedded array structs
> > have better locality.  However, I think avoiding O(n)
> > rb_ary_delete behavior is more important for busy queues.
> 
> OK.

Btw, that is [Feature #13552] - it might be ready.
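The gain, modeled in Ruby: deleting a waiter from an Array is an O(n)
scan, while a doubly-linked node (what ccan/list gives us on the C
side) unlinks in O(1):

    class Waiter
      attr_accessor :prev, :next

      def unlink                # O(1): no scan over the other waiters
        prev.next = self.next if prev
        self.next.prev = prev if self.next
        self.prev = self.next = nil
      end
    end

    a, b, c = Waiter.new, Waiter.new, Waiter.new
    a.next = b; b.prev = a      # a <-> b <-> c
    b.next = c; c.prev = b
    b.unlink                    # e.g. a signaled condvar waiter leaving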

> >> (please add your plan if you have others)
> > 
> > I might break out thread.c and io.c into smaller files
> > (select/epoll/kqueue/timer_thread/copy_stream/...)
> > to make code organization easier.
> 
> Not sure we can do it for io.c.
> Please ask someone else.

akr / nobu?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [ruby-core:81244] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-10 10:04                                       ` [ruby-core:81089] " Eric Wong
@ 2017-05-19  4:34                                         ` Eric Wong
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Wong @ 2017-05-19  4:34 UTC (permalink / raw)
  To: ruby-core

Work-in-progress patch:

   https://80x24.org/spew/20170519042738.7174-1-e@80x24.org/raw

Currently, IO scheduling seems to work; waitpid/sleep/other scheduling
is not done yet, but we do not need to support everything at
once during development.

main API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto

    Fiber#run (in prelude.rb)
    Fiber.start (ditto)
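A usage sketch of the above; this only runs with the WIP patch applied
(Fiber.start and Fiber#value as defined there):

    r, w = IO.pipe
    f = Fiber.start { r.gets }   # auto-yields while the pipe is empty
    w.puts "hi"; w.close
    p f.value                    # resumes f when r is readable => "hi\n"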

I think we can iron out the internal APIs and behavior first,
and gradually add support for auto-Fiber.yield points.

Right now, it takes over the rb_wait_for_single_fd() function
if the running Fiber is auto-enabled (cont.c::rb_fiber_auto_sched_p).

Changes to existing functions are minimal.

New files (important structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_common.h - common stuff internal to iom_*.h
    iom_select.h - select()-specific pieces

Changes to existing data structures:

    rb_thread_t.afhead   - list of fibers to auto-resume
    rb_fiber_t.afnode    - link to th->afhead
    rb_vm_t.iom          - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and
rely extensively on ccan/list for O(1) insert/delete.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_select.h.

TODO:

    Hijack other blocking functions (waitpid, IO.select, ...)
    iom_epoll.h + iom_kqueue.h (easy once iom.h definitions are done)

I am using "double" for timeouts since it is more convenient for
arithmetic, as in parts of thread.c.   Most platforms have good FP
support, I think.  Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.
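The convenience in Ruby terms (the C side would do the same arithmetic
on a double instead of juggling struct timespec):

    r, _w = IO.pipe   # never becomes readable in this sketch
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.25

    loop do
      left = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      break if left <= 0                       # deadline expired
      break if IO.select([r], nil, nil, left)  # plain float arithmetic
    end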

git repo info:

The following changes since commit c26a9a733848a0696976bb98abfe623e15ba2979:

  Fix strange indentation (2017-05-18 15:13:30 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git iom_select

for you to fetch changes up to 8ee92fbc908fe67f52372443a1492fc490431de0:

  auto fiber scheduling and friends (VERY LIGHTLY TESTED) (2017-05-19 03:51:28 +0000)

----------------------------------------------------------------
Eric Wong (1):
      auto fiber scheduling and friends (VERY LIGHTLY TESTED)

 common.mk         |   3 +
 configure.in      |   2 +-
 cont.c            | 156 +++++++++++++++++++-
 include/ruby/io.h |   2 +
 iom.h             |  82 +++++++++++
 iom_common.h      |  93 ++++++++++++
 iom_select.h      | 419 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 prelude.rb        |  12 ++
 thread.c          |  22 +++
 vm.c              |   7 +-
 vm_core.h         |   5 +-
 11 files changed, 795 insertions(+), 8 deletions(-)
 create mode 100644 iom.h
 create mode 100644 iom_common.h
 create mode 100644 iom_select.h
 (I will revert the -O0 change in configure.in, of course :)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [ruby-core:81733] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]
  2017-05-09  6:44                                 ` [ruby-core:81054] " SASADA Koichi
  2017-05-09 18:51                                   ` [ruby-core:81078] " Eric Wong
@ 2017-06-20 19:16                                   ` Eric Wong
  1 sibling, 0 replies; 24+ messages in thread
From: Eric Wong @ 2017-06-20 19:16 UTC (permalink / raw)
  To: ruby-core

SASADA Koichi <ko1@atdot.net> wrote:
> (1) lightweight fiber switching by pointer-exchange
>     (w/o copying context).

Hi, do you have a plan or timeline for when this will be ready?

Just curious; I'm not in a big rush since I have plenty of work
to do on projects outside of Ruby.

I assume this will require no public API changes, only internal
changes, right?  Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-06-20 19:17 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20170402011414.AEA9B64CEE@svn.ruby-lang.org>
     [not found] ` <8a2b82e3-dc07-1945-55f9-5a474e89130b@ruby-lang.org>
2017-04-02  2:35   ` [ruby-core:80531] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip] Eric Wong
2017-04-02  3:05     ` [ruby-core:80532] " SASADA Koichi
2017-04-03  4:42       ` [ruby-core:80540] " Eric Wong
2017-05-08  0:33         ` [ruby-core:81027] " Eric Wong
2017-05-08  1:53           ` [ruby-core:81028] " SASADA Koichi
2017-05-08  2:16             ` [ruby-core:81029] " SASADA Koichi
2017-05-08  3:01               ` [ruby-core:81031] " Eric Wong
2017-05-08  3:42                 ` [ruby-core:81033] " SASADA Koichi
2017-05-08  6:36                   ` [ruby-core:81035] " Eric Wong
2017-05-09  2:18                     ` [ruby-core:81042] " SASADA Koichi
2017-05-09  3:38                       ` [ruby-core:81044] " Eric Wong
2017-05-09  4:11                         ` [ruby-core:81045] " SASADA Koichi
2017-05-09  5:12                           ` [ruby-core:81047] " Eric Wong
2017-05-09  5:47                             ` [ruby-core:81049] " SASADA Koichi
2017-05-09  6:23                               ` [ruby-core:81053] " Eric Wong
2017-05-09  6:44                                 ` [ruby-core:81054] " SASADA Koichi
2017-05-09 18:51                                   ` [ruby-core:81078] " Eric Wong
2017-05-10  3:24                                     ` [ruby-core:81083] " SASADA Koichi
2017-05-10 10:04                                       ` [ruby-core:81089] " Eric Wong
2017-05-19  4:34                                         ` [ruby-core:81244] " Eric Wong
2017-06-20 19:16                                   ` [ruby-core:81733] " Eric Wong
2017-05-09  5:54                             ` [ruby-core:81050] " SASADA Koichi
2017-05-09  6:15                               ` [ruby-core:81052] " Eric Wong
2017-05-08  2:56             ` [ruby-core:81030] " Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).