From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=3.0 tests=AWL,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
 RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY
 shortcircuit=no autolearn=no autolearn_force=no version=3.4.2
Received: from neon.ruby-lang.org (neon.ruby-lang.org [221.186.184.75])
 by dcvr.yhbt.net (Postfix) with ESMTP id 0CDB91F66F
 for ; Tue, 17 Nov 2020 15:02:26 +0000 (UTC)
Received: from neon.ruby-lang.org (localhost [IPv6:::1])
 by neon.ruby-lang.org (Postfix) with ESMTP id 35A28120A6C;
 Wed, 18 Nov 2020 00:01:41 +0900 (JST)
Received: from xtrwkhkc.outbound-mail.sendgrid.net (xtrwkhkc.outbound-mail.sendgrid.net [167.89.16.28])
 by neon.ruby-lang.org (Postfix) with ESMTPS id 805C7120A69
 for ; Wed, 18 Nov 2020 00:01:39 +0900 (JST)
Received: by filterdrecv-p3iad2-5dc87598f5-n8wll with SMTP id filterdrecv-p3iad2-5dc87598f5-n8wll-20-5FB3E5FB-52
 2020-11-17 15:02:19.159693378 +0000 UTC m=+66695.613672259
Received: from herokuapp.com (unknown)
 by geopod-ismtpd-2-6 (SG) with ESMTP id alFHFKwxT-GLCqvKdGyUtg
 for ; Tue, 17 Nov 2020 15:02:19.078 +0000 (UTC)
Date: Tue, 17 Nov 2020 15:02:19 +0000 (UTC)
From: nicholas.evans@gmail.com
Message-ID:
References:
Mime-Version: 1.0
X-Redmine-MailingListIntegration-Message-Ids: 76775
X-Redmine-Project: ruby-master
X-Redmine-Issue-Tracker: Feature
X-Redmine-Issue-Id: 17325
X-Redmine-Issue-Author: nevans
X-Redmine-Sender: nevans
X-Mailer: Redmine
X-Redmine-Host: bugs.ruby-lang.org
X-Redmine-Site: Ruby Issue Tracking System
X-Auto-Response-Suppress: All
Auto-Submitted: auto-generated
X-SG-EID: =?us-ascii?Q?8M9XtQFPepB0Vyl+xsp5GpqSbmdIIu5RpRwAjm7cAEXHP6X3IHIpuOwxx5WnsD?=
 =?us-ascii?Q?jzICEPS2ROXsg4YtBxgRuqZT=2FSe5WpRkRlWKnNR?=
 =?us-ascii?Q?+0JDTwPv7FO+dNukHG84O4nweuLlmSH+AV1pP1w?=
 =?us-ascii?Q?tn9VZfOqCn3RwVIPn8lp9mKdjHKx1bwBO8RyCfA?=
 =?us-ascii?Q?U87twZI9q2XJ4wHAYRYcX6+zLE+kgWnf2DOp8xM?=
 =?us-ascii?Q?98DCXljBJCpotsPZ8=3D?=
To: ruby-core@ruby-lang.org
X-Entity-ID: b/2+PoftWZ6GuOu3b0IycA==
X-ML-Name: ruby-core
X-Mail-Count: 100901
Subject: [ruby-core:100901] [Ruby master Feature#17325] Adds Fiber#cancel, which forces a Fiber to break/return
X-BeenThere: ruby-core@ruby-lang.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: Ruby developers
List-Id: Ruby developers
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ruby-core-bounces@ruby-lang.org
Sender: "ruby-core"

Issue #17325 has been updated by nevans (Nicholas Evans).

Thanks for taking a look at this, Benoit. I agree it's not obvious why this is necessary with Fiber#raise, so I'll try to explain my reasoning in more detail:

Yes, a library (e.g. `async`) could write a "suspend" function that wraps resume, yield, and transfer. That library could implement cancel _(and raise)_ by tunneling resume types & values through a struct. That library can't directly control fibers that are created outside its wrapper (e.g. `Enumerator`), but our "suspend" wrapper could implement a "trampoline" fiber to interoperate with most simple cases like `Enumerable`. I know all of this is possible, because I've been using a suspend wrapper like this on one of my work projects for a few years now! :) Still, it's not hard to imagine incompatible fiber-creating libraries that circumvent our wrapper in unpredictable ways. Putting these into `Fiber` gives all fiber-using applications and libraries a shared baseline they can rely upon.

I disagree on "not a common operation". I think that relatively short-lived fibers could become as common-place as relatively short-lived `goroutines` are in `go` (and I hope they do).
And I think non-exceptional cancellation is a very important concept for asynchronous code to handle explicitly. Other languages with coroutines special-case "cancel" (even if some of them tunnel cancellation via error types). E.g. Go's `Context` type uses `Done()`, and the `go vet` tool checks that `CancelFuncs` are used on all control-flow paths. Kotlin coroutines' `Job` class has cancel methods but *not* raise methods.

### Propagation

The most important difference from `Fiber#raise` isn't that it is unrescuable or uses fewer resources, but how propagation works:

* You can cancel *any* living fiber except the root-fiber.
  * I've disallowed canceling the root to avoid accidentally shutting the thread down.
  * I'm currently raising `FiberError` for terminated fibers, but I think I'd like to change that. I think canceling dead fibers should simply return `nil` or `false`. That way we can safely call cancel on any fiber, and the only reason it would raise an exception is if the child fiber terminates with an exception during cancellation.
* Cancel propagates _down_ through the linked list of resumed fibers.
  * Execution of `ensure` still runs from bottom up.
  * It does _not_ propagate an error or cancel _up_ from the canceled fiber.

E.g. if an async task fiber is resuming into a lazy enumerator which is resuming into another fiber (and so on), and your thread scheduler wants to cancel your task's fiber but doesn't know about those other fibers, `Fiber#raise` won't work. If your fiber is transferring, `Fiber#raise` won't work. `Fiber#cancel` will work on any living fiber while still following the rules (#17221) for transferring control.

Just as `break` doesn't generally need to know or care about the implementation of intervening library code, canceling a fiber shouldn't need to know or care about the implementation of any sub-fibers that may have been resumed.

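The `break` comparison can be checked with today's Ruby, without `Fiber#cancel`. The sketch below is only an illustration: `library_each` is a hypothetical stand-in for intervening library code with a blanket `rescue Exception`, and the two helper methods are made-up names. It contrasts `break`, which unwinds through that frame and only triggers `ensure`, with `Fiber#raise` into a suspended fiber, which the same `rescue` intercepts.

```ruby
# Hypothetical stand-in for intervening library code that rescues everything.
def library_each(log)
  yield
  :block_finished
rescue Exception => e
  log << [:rescued, e.class]
  :swallowed
ensure
  log << :ensure_ran
end

# `break` unwinds through the library frame: `ensure` runs, `rescue` never fires.
def break_through_library
  log = []
  result = library_each(log) { break :broke_out }
  [result, log]
end

# Fiber#raise delivers an exception at the suspension point, so the same
# library-style rescue intercepts it and the fiber simply carries on.
def raise_into_fiber
  log = []
  fiber = Fiber.new do
    library_each(log) { Fiber.yield :suspended }
    Fiber.yield :still_running
    :done
  end
  first = fiber.resume
  after_raise = fiber.raise(RuntimeError, "please stop")
  [first, after_raise, log]
end

p break_through_library # prints [:broke_out, [:ensure_ran]]
p raise_into_fiber      # prints [:suspended, :still_running, [[:rescued, RuntimeError], :ensure_ran]]
```

In the second case the fiber is still alive after the "stop" request, which is the swallowed-exception situation that a break-like cancel is meant to avoid.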
### Semantics of raise/rescue vs unexceptional break/return

I'm not against temporarily or explicitly blocking cancellation. Avoiding swallowed exceptions is not the most important feature, but it's still a useful one I think. And any small performance gain would be desirable, but not the primary driver. Aside from either of those, `raise` has different semantics from `break` or `return` (or `throw`). (As currently written) this is only for semantically *unexceptional* flow-control.

And this isn't a matter of "application code can handle it" because applications can't control their intervening library code, nor can they control fibers created by intervening library code.

It's quite common to see code like the following:

```ruby
def foo
  some_library_code_runs_this_block_many_stack_layers_deep do
    result = etc_etc_etc
    return result.retval if result.finished?
  end
end
```

We *could* wrap this in an exception handler, but it would be more confusing to the casual reader than simply using `return` or `break` (or maybe `catch` and `throw`). For jumping to the `ensure` block of a particular method we use `return`. For block scope, `break`. For fiber-scope: `Fiber#cancel`. I expect that `return` statement to non-exceptionally rewind the stack without being caught by any `catch` or `rescue`. I don't want a library's hidden `rescue Exception` to subvert my `break` or `return` (libraries shouldn't do this, but sometimes they do).

It's not as simple as an "application problem". Task cancellation could be triggered by application code or by library code. A task-scheduler library might call `Fiber#cancel`, and the fibers being canceled might be in application code or in library code or might be suspended by resuming into fibers that are uncontrolled by or even unknown to task-scheduler. None of that should matter.

Wrapping a task with `catch {|tag| ... }` would be conceptually better than exception handling...
but `throw tag` from an `Enumerator` won't propagate up to the return fiber. (I don't want to change this behavior.)

```
ruby -e 'f = Fiber.new { throw :foo }; p((catch(:foo) { f.resume } rescue nil))'
#
```

### Examples

To be clear, these are toy examples and I'd want most of the following to be handled for me by a fiber-task-scheduler library (like `async`). But that library itself should have a mechanism for canceling resuming tasks, even when it doesn't (or can't) know about the resumed child fibers of those tasks. `Fiber#raise` (as currently written) can't do that.

```ruby
def run_server
  server = MyFiberyTCPServer.open
  # Do stuff. e.g. accept clients, assign connections to fibers, etc.
  # Those connections can create their own sub-fibers.
  # The server may know nothing about those sub-fibers. It shouldn't need to.
  # Those subfibers might even use an entirely different scheduler. That's okay.
  # Connection fibers might be un-resumable because they are resuming. No prob.
  wait_for_shutdown_signal # => transfers to some sort of fiber scheduler
ensure
  # cancels all connection-handler fibers
  server.connections.each do |c|
    # Are those connection fibers resuming other sub-fibers tasks?
    # Do we even know about those sub-tasks?
    # Can we even know about them from here?
    # Who cares? Those need to be canceled too!
    c.cancel :closing if c.alive? # I'd like to make dead_fiber.cancel unexceptional too
  end
end

# fetching a resource may depend on fetching several other resources first
def resource_client
  a = schedule_future { http.get("/a") }
  b = schedule_future { http.get("/b") }
  # each item fetch is its own future, so it can be canceled in the ensure below
  ary = a.value.item_ids.map {|id| schedule_future { http.get("/items/#{id}") } }
  combine_results(b, ary)
ensure
  # if any of the above raises an exception
  # or if *this* fiber is canceled
  # or if combine_results completed successfully before all subtasks complete
  a&.cancel rescue nil # is it resuming another fiber? don't know, don't care.
  b&.cancel rescue nil # is it resuming another fiber? don't know, don't care.
  ary&.each do |item| item.cancel rescue nil end # ditto
end

# yes, task library code would normally provide a better pattern for this
def with_timeout(seconds)
  task = nil # declared first so the timer block's closure can see it
  timer = Task.schedule do
    sleep seconds
  ensure
    task.cancel :timeout
  end
  task = Task.schedule do
    yield # does this resume into sub-tasks? we shouldn't need to know.
  ensure
    timer.cancel
  end
  task.value
end
```

### No guarantees

And yes, we can always have misbehaving code and I'm not trying to guarantee against every case. We can't guard against certain categories of bugs nor infinite loops. It's always possible someone's written:

```ruby
def foo
  bar
ensure
  while true
    begin
      while true
        Fiber.yield :misbehaving
      end
    rescue Exception
      # evil code being evil
    end
  end
end
```

But that's entirely outside the scope of this. :) We can have bugs here just like any code can have bugs. But in my experience, `ensure` code is usually *much* shorter and simpler than other code. Shut down, clean up, release, and reset.

----------------------------------------
Feature #17325: Adds Fiber#cancel, which forces a Fiber to break/return
https://bugs.ruby-lang.org/issues/17325#change-88553

* Author: nevans (Nicholas Evans)
* Status: Open
* Priority: Normal
----------------------------------------
Calling `Fiber#cancel` will force a fiber to return, skipping rescue and catch blocks but running all ensure blocks. It behaves as if a `break` or `return` were used to jump from the last suspension point to the top frame of the fiber. Control will be transferred to the canceled fiber so it can run its ensure blocks.

## Propagation from resuming to resumed fibers

Any non-root living fiber can be canceled and cancellation will propagate to child (resumed) fibers. In this way, a suspended task can be canceled even if it is e.g. resuming into an enumerator, and the enumerator will be canceled as well. Transfer of control should match #17221's *(much improved)* transfer/resume semantics.
After the cancellation propagates all the way to the bottom of the fiber resume stack, the last fiber in the chain will then be resumed. Resuming fibers will not run until they are yielded back into.

## Suspension of canceled fibers

Canceled fibers can still transfer control with `resume`, `yield`, and `transfer`, which may be necessary in order to release resources from `ensure` blocks. For simplicity, subsequent cancels will behave similarly to calling `break` or `return` inside an `ensure` block, and the last cancellation reason will overwrite earlier reasons.

## Alternatives

`Fiber#raise` could be used, but:

* Exceptions are bigger and slower than `break`.
* `#raise` can't (and shouldn't) be sent to resuming fibers. (It can't propagate.)
* Exceptions can be caught. This might be desirable, but that should be at the discretion of the calling fiber.

Catch/Throw could be used (with an anonymous `Object.new`), but:

* `catch` adds an extra stack frame.
* It would need to add `Fiber#throw` (or wrap/intercept `Fiber.yield`).
* A hypothetical `Fiber#throw` should probably only be allowed on yielding fibers (like `Fiber#resume`). (It wouldn't propagate.)

Implementation: https://github.com/ruby/ruby/pull/3766

--
https://bugs.ruby-lang.org/