From: samuel@oriontransfer.org
To: ruby-core@ruby-lang.org
Date: Wed, 02 May 2018 05:20:56 +0000 (UTC)
Subject: [ruby-core:86821] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Issue #13618 has been updated by ioquatix (Samuel Williams).

Thanks for the detailed information.

So, it seems your design has unavoidable contention (and therefore latency) because you need to send events between threads, which is what I expected. However, you argue this overhead should be small; I'd like to see actual numbers, to be honest.

And, as you state, it's not possible (nor desirable, IMHO) to move fibers between threads. Yes, head-of-line blocking might be an issue, but moving stacks between CPU cores is not without its own set of overheads. If you have serious issues with head-of-line blocking, it's more likely to be a problem with your code (I've experienced this directly, and the result was: https://github.com/socketry/async-http/blob/ca655aa190ed7a89b601e267906359793271ec8a/lib/async/http/protocol/http11.rb#L93).

It would be interesting to see exactly how much overhead is incurred using a shared epoll.
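(For concreteness, here is a minimal sketch of how that cross-thread wakeup cost through a shared epoll could be measured. It is not code from either project: plain C + pthreads, Linux-only, and the ping/pong pipe pair, iteration count, and helper names are invented for illustration. Build with `cc -pthread`.)

```c
/* Hypothetical micro-benchmark: one thread makes an fd readable, another
 * thread blocked in epoll_wait() on a shared epoll instance wakes up and
 * reports the handoff latency.  A rough sketch, not a rigorous benchmark. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <time.h>
#include <sys/epoll.h>

#define ITERATIONS 10000

static int epfd;     /* epoll instance shared by both threads */
static int ping[2];  /* main -> waiter: carries the send timestamp */
static int pong[2];  /* waiter -> main: ack so rounds never overlap */

static void *waiter(void *arg)
{
    uint64_t total = 0;
    (void)arg;

    for (int i = 0; i < ITERATIONS; i++) {
        struct epoll_event ev;
        struct timespec sent, woke;

        /* block on the shared epoll until the main thread pings us */
        if (epoll_wait(epfd, &ev, 1, -1) != 1)
            abort();
        clock_gettime(CLOCK_MONOTONIC, &woke);
        if (read(ping[0], &sent, sizeof(sent)) != (ssize_t)sizeof(sent))
            abort();
        total += (woke.tv_sec - sent.tv_sec) * 1000000000ULL
               + (woke.tv_nsec - sent.tv_nsec);
        write(pong[1], "", 1);  /* tell the main thread this round is done */
    }
    printf("avg cross-thread wakeup: %llu ns\n",
           (unsigned long long)(total / ITERATIONS));
    return NULL;
}

int main(void)
{
    pthread_t t;
    struct epoll_event ev = { .events = EPOLLIN };

    pipe(ping);
    pipe(pong);
    epfd = epoll_create1(0);
    ev.data.fd = ping[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, ping[0], &ev);  /* level-triggered */

    pthread_create(&t, NULL, waiter, NULL);
    for (int i = 0; i < ITERATIONS; i++) {
        struct timespec sent;
        char ack;

        clock_gettime(CLOCK_MONOTONIC, &sent);
        write(ping[1], &sent, sizeof(sent));  /* wakes the waiter via the shared epoll */
        read(pong[0], &ack, 1);               /* wait for the round to finish */
    }
    pthread_join(t, NULL);
    return 0;
}
```

Actual numbers would depend heavily on scheduling and CPU affinity; the point is only that the handoff cost is measurable rather than assumed.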
I remember from my tests that the latency of yahns was a lot higher than async-http:

```
async-http
koyoko% wrk -c 16 -t 16 -d 10 http://localhost:9292/wiki/index
Running 10s test @ http://localhost:9292/wiki/index
  16 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.18ms    3.23ms  86.46ms   98.51%
    Req/Sec   109.47     17.70   121.00     95.49%
  8954 requests in 10.02s, 29.99MB read
  Socket errors: connect 0, read 0, write 0, timeout 4
Requests/sec:    893.68
Transfer/sec:      2.99MB

yahns
koyoko% wrk -c 16 -t 16 -d 10 http://localhost:9292/wiki/index
Running 10s test @ http://localhost:9292/wiki/index
  16 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    20.51ms   16.04ms 190.57ms   85.79%
    Req/Sec    54.43     32.59   191.00     65.56%
  8702 requests in 10.10s, 29.68MB read
Requests/sec:    861.61
Transfer/sec:      2.94MB
```

This was a long time ago; async-http performance has improved since then, and the issue regarding timeouts was resolved. When I have some time I can repeat these tests.

Tangentially related: in my own IO scheduler/reactor, I chose to use `EPOLLET` and to add/remove the handlers manually. The basic implementation looks something like this:

Implementation of Readable, which is what manages adding events to epoll/kqueue: https://github.com/kurocha/async/blob/master/source/Async/Readable.cpp

It's prepared but not added to the reactor here (it's lazy): https://github.com/kurocha/async-http/blob/eff77f61f7a85a3ac21f7a8f51ba07f069063cbe/source/Async/HTTP/V1/Protocol.cpp#L34

By calling wait, the fd is inserted into the reactor/selector: https://github.com/kurocha/async/blob/2edef4d6990259cc60cc307b6de2ab35b97560f1/source/Async/Protocol/Buffer.cpp#L254

The cost of adding/removing FDs is effectively constant time, given an arbitrary number of reads or writes. We shouldn't preclude implementing this model in Ruby if it makes sense. As you say, the overhead of the system call is pretty minimal.

Now that you mention it, I'd like to compare `EPOLLET` vs `EPOLLONESHOT`. It's an interesting design choice, and it makes a lot of sense if you are doing only one read in the context of a blocking operation.
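(For readers unfamiliar with the distinction, a minimal sketch of the two arming styles follows. This is plain C against the epoll API, not code from either project; the helper names `arm_oneshot`, `arm_edge_triggered`, and `drain_until_eagain` are invented for illustration, and `fd` is assumed to be non-blocking.)

```c
#include <errno.h>
#include <unistd.h>
#include <sys/epoll.h>

/* EPOLLONESHOT: the kernel disarms the fd after every wakeup, so each
 * blocking operation does one wait + one read, then must re-arm the fd
 * (EPOLL_CTL_MOD) before it can wake anyone again. */
static int arm_oneshot(int epfd, int fd, int already_added)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT, .data.fd = fd };
    return epoll_ctl(epfd, already_added ? EPOLL_CTL_MOD : EPOLL_CTL_ADD, fd, &ev);
}

/* EPOLLET: the fd is registered once and never re-armed; the kernel only
 * reports readiness edges, so the caller must consume input until EAGAIN
 * or risk never being woken for that fd again. */
static int arm_edge_triggered(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Drain loop the edge-triggered style requires after each wakeup:
 * keep reading until the kernel reports EAGAIN. */
static ssize_t drain_until_eagain(int fd, char *buf, size_t len)
{
    ssize_t n, total = 0;

    while ((n = read(fd, buf, len)) > 0)
        total += n;          /* buf is reused; a real reader would consume each chunk */
    if (n < 0 && errno != EAGAIN)
        return -1;           /* real error */
    return total;            /* hit EAGAIN (or EOF): nothing more until the next edge */
}
```

Roughly, the one-shot style pays an extra epoll_ctl per blocking operation, while the edge-triggered style pays registration once per fd but requires every wakeup to read until EAGAIN, typically one extra short or empty read per cycle.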
----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-71784

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: normalperson (Eric Wong)
* Target version:
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join  - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run   - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and rb_waitpid()
function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq - list of fibers to auto-resume
    rb_vm_t.iom        - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and relies
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help you
understand the code.

Right now, I reuse some static functions in thread.c, so thread.c
includes iom_(select|epoll|kqueue).h

TODO: Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c.  Most platforms have good FP,
I think.  Also, all "blocking" functions (rb_iom_wait*) will have
timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

  libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

  Bugfixes to libkqueue may be necessary to support all corner cases.
  Supporting these corner cases for native kqueue was challenging, even.
  See comments on iom_kqueue.h and iom_epoll.h for nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'
uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end
p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```

---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

-- 
https://bugs.ruby-lang.org/