From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 06 Jul 2018 18:10:45 +0000 (UTC)
From: funny.falcon@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:87839] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
List-Id: Ruby developers
Content-Type: text/plain; charset="us-ascii"

Issue #13618 has been updated by funny_falcon (Yura Sokolov).

> Yes, they are great, but it's probably impossible to implement in Ruby.

It is impossible to implement Thread migration between native threads. Everything else is possible.

>> Bunch of hybrid threads scheduled on one native thread will be as fast as Fiber's, they will scale.
> Yes they will, but they will not be as pleasant to program. The memory and execution model is very complex for normal people to grok.
> The memory model and execution model of fibers is very simple. I've had feedback from people who have used Async, and it's all been really great.

Don't get me wrong: most code is linear and thread-safe. Synchronization is needed only by framework authors. You (as the Async author) see the need for synchronization. Most ordinary programmers will never touch it.

Moreover, if the language provides suitable synchronization primitives, it is not difficult to explain to an average programmer how to use them.

Go (excuse me again for mentioning it) has a rich story on that: 99.9% of the code written daily doesn't bother with concurrency at all. Every day I see hundreds of lines committed without any call to "Mutex" or even a channel. That is ordinary code, and it will work regardless of whether it runs on a Thread or a Fiber.
But when a programmer wants to dig into concurrency, it is better to have a tool that always works, i.e. it is better to have a built-in scheduler and a single set of synchronization primitives.

>> Fibers are also exclusive relative to each other.
> Yes, but by design, not by limitation of the interpreter (ala Threads/GVL).

Again, I really think there is no need to rely on "exclusive by design" in 99% of cases. And Ruby is not a language where that 1% will make much difference.

Moreover, my experience has taught me that every time I rely on "exclusive by design", it bites me. Today the code has no yield point; tomorrow someone adds "log.write()", which writes to a network log collector, and BAM, it yields and breaks everything.

If something should be protected, it has to be protected with a primitive the language gives me; otherwise it is a sleeping bug that will fire in the future.

> You cannot use primitives designed for thread synchronization because it will block the entire thread, and it won't allow other fibers to execute.

Unless you have a green Thread that is scheduled on the same native thread, and the synchronization primitive is aware of that. Go's sync.Mutex works very well regardless of the number of native threads it runs over. And I don't see any reason why Ruby's Mutex would be worse.

> Async doesn't have Mutex, since all fibers in a thread/reactor is naturally mutually exclusive.

What is Async::Semaphore.new(1)? It is Async::Mutex, just without a separate name.

> The implementation of the Async primitives leverages the concurrency model of fibers to make them simple, deterministic and robust.

An implementation of the standard library's primitives will leverage the concurrency model of hybrid Threads to make them simple, deterministic and robust.

> In my mind Thread.scheduler doesn't require built in primitives

It will require primitives provided by the same library that provides the scheduler.

> Anyone can build "primitives" like semaphore, queue, condition, etc.
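
To make the `Semaphore.new(1)` remark concrete, here is a minimal sketch, assuming plain threads rather than Async fibers, of a counting semaphore built only from the standard library's Mutex and ConditionVariable. `CountingSemaphore` is a hypothetical name used for illustration; it is not Async's implementation. With a limit of 1 it serializes the critical section the way a mutex would.

```ruby
# Hypothetical counting semaphore (illustrative only, not Async::Semaphore).
class CountingSemaphore
  def initialize(limit)
    @limit = limit
    @count = 0
    @mutex = Mutex.new
    @cond  = ConditionVariable.new
  end

  # Run the block while holding one of the @limit permits.
  def acquire
    @mutex.synchronize do
      @cond.wait(@mutex) while @count >= @limit
      @count += 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @count -= 1
        @cond.signal
      end
    end
  end
end

lock = CountingSemaphore.new(1) # limit 1: behaves like a mutex
counter = 0
inside = 0
max_inside = 0

threads = 4.times.map do
  Thread.new do
    100.times do
      lock.acquire do
        inside += 1
        max_inside = inside if inside > max_inside
        value = counter
        Thread.pass # invite a context switch inside the critical section
        counter = value + 1
        inside -= 1
      end
    end
  end
end
threads.each(&:join)

puts "counter=#{counter} max_inside=#{max_inside}"
```

Raising the limit above 1 would let several threads into the block at once, which is the only difference between this and a mutex.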
Anyone will have to use the primitives shipped with the library that provides the scheduler. You talked about "simplicity for the user", but building these primitives is not easy.

Worse: any generic library that wants to use primitives, but doesn't want to rely on a single "scheduler library", will have to have a way to find the correct primitive for each possible scheduler library.

> The same can not be said for Threads.

With Threads, there will be Mutex, Queue and ConditionVariable in the standard library. The programmer doesn't have to reinvent the wheel, especially if it is in the standard library.

That is my point:
- you say "it will be easy to reimplement wheels",
- I say "but I don't want to reimplement wheels".

I really did implement wheels. On top of EventMachine and EM::Synchrony I made a lot of things many years ago. I really do not wish the average programmer to go through that.

I want the average programmer to take a base Ruby installation, write a program with the standard library, and have that program run smoothly and fast. I want them to be able to combine any gems, without those gems fighting each other because they want different schedulers.

Do you know why Go is great (excuse me again)? Because you don't have much choice, but the default choice is already great. People get the standard library, they get the standard (and only possible) runtime, and they can already do great things.

People don't want to make a choice. People want to make a product.

That is what people got from Ruby in the past, too. It was before "asynchronous" programming became mainstream. But now there are too many choices (for asynchronous programming), and they are dying as fast as they are born.

I wish it would be: "Use standard `Thread.create`. It is fast, and scheduled asynchronously using a hidden event loop by default.
But if you really need to deal heavily with disk, or to do CPU calculation implemented in C, then spawn `@pool=Thread::NativePool.new(10)`, and pass jobs to that pool with `result = @pool.do{ mytask }` or `future = @pool.push{ mytask }; future.get`".

--------

Looks like I'm too wordy and too emotional. Excuse me :-(

----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-72858

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: normalperson (Eric Wong)
* Target version:
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and
rb_waitpid() function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq - list of fibers to auto-resume
    rb_vm_t.iom - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and relies
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c. Most platforms have good FP,
I think. Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

    libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

    Bugfixes to libkqueue may be necessary to support all corner cases.
    Supporting these corner cases for native kqueue was challenging,
    even. See comments on iom_kqueue.h and iom_epoll.h for
    nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```

---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

--
https://bugs.ruby-lang.org/
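
The test script above depends on `Fiber.start`, `Fiber#join`, `Fiber#value` and `Fiber#run` from the proposed patch, which the description says are named after existing Thread methods. As a rough point of comparison, the same start / join(timeout) / value pattern can be sketched with Thread, which runs without the patch. The `sleep` calls are an assumption standing in for the blocking network reads, and the worker count is arbitrary.

```ruby
# Thread-based analogue of the start / join(timeout) / value pattern.
# sleep is a stand-in for blocking I/O; no network access is performed.
workers = 3.times.map do |i|
  Thread.start do
    sleep(0.05 * (i + 1)) # simulated blocking read
    :done
  end
end

# Thread#join(limit) returns nil if the thread is still running
# when the time limit expires.
first = workers[-1].join(0.01)
warn "timed out waiting for the last worker" if first.nil?

pending = workers.dup
until pending.empty?
  pending.each(&:join)
  pending.keep_if(&:alive?)
end

# Thread#value joins the thread and returns the block's result.
results = workers.map(&:value)
p results
```

The difference the patch aims for is in what happens underneath: here each worker is a separate Thread, while in the patch the joins drive an internal scheduler that resumes fibers as their file descriptors become ready.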