From: samuel@oriontransfer.net
Date: Fri, 16 Oct 2020 01:58:10 +0000 (UTC)
To: ruby-core@ruby-lang.org
Subject: [ruby-core:100412] [Ruby master Bug#17263] Fiber context switch degrades with number of fibers, limit on number of fibers

Issue #17263 has been updated by ioquatix (Samuel Williams).

On my computer, I found the following. I changed your script to run with the given number of fibers as an argument.
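That modification isn't shown here, but a minimal sketch of it, assuming the `run` method from the script quoted below, is to replace the hard-coded `run(100)` ... `run(100000)` calls with a single call driven by `ARGV`:

```ruby
# Hypothetical change to the benchmark script: take the fiber count from the
# command line (e.g. `./ruby ../test.rb 10000`) instead of the fixed calls
# run(100), run(1000), run(10000), run(100000).
run(Integer(ARGV[0]))
```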
```
> perf stat -e page-faults,cpu-cycles ./ruby ../test.rb 10000
fibers: 10000 rss: 143324 count: 1000000 rate: 1351343.9828743637

 Performance counter stats for './ruby ../test.rb 10000':

            34,693      page-faults:u
     6,565,342,771      cpu-cycles:u

       1.873819017 seconds time elapsed

       1.779125000 seconds user
       0.069029000 seconds sys

> perf stat -e page-faults,cpu-cycles ./ruby ../test.rb 100000
fibers: 100000 rss: 1302460 count: 1000000 rate: 1143930.98944375

 Performance counter stats for './ruby ../test.rb 100000':

           325,269      page-faults:u
     6,896,098,554      cpu-cycles:u

       2.506782962 seconds time elapsed

       1.846807000 seconds user
       0.610185000 seconds sys
```

Even though the cpu-cycles (and user time) are roughly the same, we can see that the system time varies by an order of magnitude. I had to run this several times to get a clear picture, but I believe the overhead is coming from page faults as the stacks are swapped: the more stacks you have, the more page faults you take and the more pressure you put on the L1/L2 cache. I'm not sure if we can verify this further, but one way might be to change the default stack size. I'll play around with it a bit more. Certainly, a CPU with a bigger L1/L2 cache should perform better if this theory is true.

----------------------------------------
Bug #17263: Fiber context switch degrades with number of fibers, limit on number of fibers
https://bugs.ruby-lang.org/issues/17263#change-88027

* Author: ciconia (Sharon Rosner)
* Status: Open
* Priority: Normal
* ruby -v: 2.7.1
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
I'm working on developing [Polyphony](https://github.com/digital-fabric/polyphony), a Ruby gem for writing highly-concurrent Ruby programs with fibers. In the course of my work I have come up against two problems with Ruby fibers:

1. Fiber context switching performance seems to degrade as the number of fibers is increased. This happens both with `Fiber#transfer` and with `Fiber#resume`/`Fiber.yield`.
2. The number of concurrent fibers that can exist at any time seems to be limited. Once a certain number is reached (on my system this seems to be 31744 fibers), calling `Fiber#transfer` will raise a `FiberError` with the message `can't set a guard page: Cannot allocate memory`. This is not due to RAM being saturated: with 10000 fibers, my test program hovers at around 150MB RSS (on Ruby 2.7.1).
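A minimal sketch that reproduces this limit in isolation (each fiber is resumed once so that its stack and guard page are actually allocated; the exact ceiling and the error raised will depend on the system):

```ruby
# frozen_string_literal: true

# Keep creating fibers until stack allocation fails. Each fiber is resumed
# once so that its stack (and guard page) is actually set up, then kept
# alive in the array so nothing is garbage collected.
fibers = []
begin
  loop do
    fiber = Fiber.new { Fiber.yield }
    fiber.resume
    fibers << fiber
  end
rescue FiberError, NoMemoryError => e
  puts "stopped after #{fibers.size} fibers: #{e.class}: #{e.message}"
end
```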
Here's a program for testing the performance of `Fiber#transfer`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

def run(num_fibers)
  count = 0
  GC.start
  GC.disable

  first = nil
  last = nil
  supervisor = Fiber.current
  num_fibers.times do
    fiber = Fiber.new do
      loop do
        count += 1
        if count == 1_000_000
          supervisor.transfer
        else
          Fiber.current.next.transfer
        end
      end
    end

    first ||= fiber
    last.next = fiber if last
    last = fiber
  end

  last.next = first

  t0 = Time.now
  first.transfer
  elapsed = Time.now - t0

  rss = `ps -o rss= -p #{Process.pid}`.to_i

  puts "fibers: #{num_fibers} rss: #{rss} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.6.5 I'm getting:

```
fibers: 100 rss: 23212 count: 1000000 rate: 3357675.1688139187
fibers: 1000 rss: 31292 count: 1000000 rate: 2455537.056439736
fibers: 10000 rss: 127388 count: 1000000 rate: 954251.1674325482
Stopped at 22718 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With Ruby 2.7.1 I'm getting:

```
fibers: 100 rss: 23324 count: 1000000 rate: 3443916.967616508
fibers: 1000 rss: 34676 count: 1000000 rate: 2333315.3862491543
fibers: 10000 rss: 151364 count: 1000000 rate: 916772.1008060966
Stopped at 31744 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With ruby-head I get an almost identical result to that of 2.7.1.

As you can see, the performance degradation is similar in all three versions of Ruby, going from ~3.4M context switches per second for 100 fibers to less than 1M context switches per second for 10000 fibers. Running with 100000 fibers fails to complete.

Here's a program for testing the performance of `Fiber#resume`/`Fiber.yield`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

# This program shows how the performance of Fiber#resume/Fiber.yield degrades
# as the fiber count increases
def run(num_fibers)
  count = 0
  GC.start
  GC.disable

  fibers = []
  num_fibers.times do
    fibers << Fiber.new { loop { Fiber.yield } }
  end

  t0 = Time.now

  while count < 1000000
    fibers.each do |f|
      count += 1
      f.resume
    end
  end

  elapsed = Time.now - t0

  puts "fibers: #{num_fibers} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.7.1 I'm getting the following output:

```
fibers: 100 count: 1000000 rate: 3048230.049946255
fibers: 1000 count: 1000000 rate: 2362235.6455160403
fibers: 10000 count: 1000000 rate: 950251.7621725246
Stopped at 21745 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

As I understand it, at least theoretically, switching between fibers should have a constant cost in terms of CPU cycles, irrespective of the number of fibers currently existing in memory. I am completely ignorant of the implementation details of Ruby fibers, so at least for now I don't have any idea where this problem is coming from.

-- 
https://bugs.ruby-lang.org/