Date: Fri, 9 Jun 2017 20:32:37 +0000
From: Eric Wong
To: ruby-core@ruby-lang.org
Message-ID: <20170609203237.GA19887@dcvr>
Subject: [ruby-core:81643] Re: [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

samuel@oriontransfer.org wrote:
> To a certain extent, things discussed here are already implemented in
>
> https://github.com/socketry/async
>
> and
>
> https://github.com/socketry/async-io
>
> What are the benefits of having this implemented in core Ruby as opposed to a gem which can be versioned
> independently and works with all Rubies 2.x, including JRuby and (in theory) Rubinius?

Neverblock basically tried the same thing with EM and never took off. I don't know much about getting software adopted or popularized, but maybe being in core has a better chance of gaining adoption and being sustainable.

Being in core provides greater compatibility with external libraries which are not aware of existing event loops. So 3rd-party DB adapters (e.g. mysql2) will be able to take advantage of these changes transparently if they use rb_wait_for_single_fd (and I will add a hook for rb_thread_fd_select, too).

It will also be easily possible to get existing primitives like Queue/SizedQueue to work with Fibers out-of-the-box. Maybe even Mutex+ConditionVariable, if approved.

One current example is being able to hook rb_waitpid: any existing code using trap(:CHLD) continues to work transparently even if using auto-Fiber for I/O; but auto-Fiber users can also rely on "blocking" Process.waitpid if they desire.

Anyways, accepting any of this into core is not my decision to make. I will only provide implementation and advice/hints.

A small rant about existing event loops:

Most existing event loop implementations (libev, libevent, EM) seem stuck in single-thread mentality from legacy select/poll APIs. They handle MT by having one event loop per-thread; instead of taking advantage of the fact that modern primitives like kqueue and epoll are both MT-friendly queues which are populated by threads running inside the kernel.

In a world where memory and CPU are your only constraints, you can run one (native thread|process) per-core and thus one event loop per-core. This is perfectly fine for things like memcached which are only memory+CPU bound.

That falls down once you have other constraints, such as physical disks to deal with. I maintain software which reads and writes simultaneously to dozens, if not hundreds of rotational disks (JBOD) in a single process.
With current APIs on GNU/Linux and FreeBSD, the only way I've found(*) to deal with this effectively is to use >=1 pthread per disk.

(*) Various AIO implementations are lacking, too. They pessimize the hot cache case, lack open/unlink/rename/stat equivalents, and userland implementations tend to not be mountpoint/device-aware. Native AIO requires O_DIRECT in Linux, so no page cache at all :<

> Why not focus on making core part of Ruby fast, and providing the appropriate hooks, rather than expanding her scope and complexity, in a way which has a proven track record for frustration (poorly designed stdlib which can't be fixed or improved due to breaking backwards compatibility).

I think core and stdlib can evolve best if done together. Fiber has been in production Ruby for nearly a decade now, with only minor improvements, and seems largely ignored in the wider scheme of things. I guess they're not that useful in practice.

And just because we're adding new features does not mean we're not also finding places to optimize our code. Mutex/Queue/SizedQueue/ConditionVariable are already faster in trunk because of preparation work to make them auto-Fiber aware:

https://bugs.ruby-lang.org/issues/13517
https://bugs.ruby-lang.org/issues/13552

Why can't stdlib be fixed? Just because we need to support old behaviors and APIs does not mean we cannot improve things. Having a solid stdlib is a great way to improve core and vice-versa, and helps us bridge the gap for end user code.

Finally, keep in mind there are Rubyists who are not enthusiastic users willing to explore, they're the "distro users". It'll be easier for them to pick up Ruby and use Ruby apps if stdlib were better.

Despite using Perl more than Ruby, I'm a conservative "distro user" myself with Perl. So I'm hesitant to use or depend on stuff which isn't packaged by distros, especially when it comes to end user convenience (some who do not even know or care about what a programming language is).
So yes, I still write Perl 5.8-compatible code, and still support legacy CentOS 5.x and 6.x systems.
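
For readers who want something concrete for the ">=1 pthread per disk" point above, here is a rough sketch in plain Ruby threads. It is only an illustration under stated assumptions, not the software mentioned in this message: the names PerDevicePool, submit, and shutdown are hypothetical, and it assumes that routing jobs by File.stat(path).dev is a good enough stand-in for "one queue per physical disk". The idea is that each device gets its own Queue and worker Thread, so a slow disk only stalls its own queue.

```ruby
# Hypothetical sketch: one job queue and one worker thread per device.
# Not taken from any real library; all names are made up for illustration.
class PerDevicePool
  def initialize
    @lock = Mutex.new
    @workers = {} # device number => [Queue, Thread]
  end

  # Schedule +job+ on the worker that owns the device holding +path+.
  # Returns a Queue; pop it to get the job's result (or the exception
  # object if the job raised a StandardError).
  def submit(path, &job)
    dev = File.stat(path).dev # device-aware routing
    queue = @lock.synchronize { @workers[dev] ||= start_worker }.first
    reply = Queue.new
    queue << [job, path, reply]
    reply
  end

  # Ask every worker to stop after draining its queue, then wait for them.
  def shutdown
    workers = @lock.synchronize do
      all = @workers.values
      @workers = {}
      all
    end
    workers.each { |queue, _thread| queue << nil } # nil means "stop"
    workers.each { |_queue, thread| thread.join }
  end

  private

  # Start a worker which runs jobs one at a time, in submission order.
  def start_worker
    queue = Queue.new
    thread = Thread.new do
      while (msg = queue.pop)
        job, path, reply = msg
        result = begin
          job.call(path)
        rescue StandardError => e
          e
        end
        reply << result
      end
    end
    [queue, thread]
  end
end
```

Usage would look something like `pool.submit("/mnt/disk07/blob") { |p| File.size(p) }.pop` (the path is made up). Two paths on the same device share one worker and are served in order; paths on different devices proceed independently.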