From: samuel@oriontransfer.net
Date: Fri, 16 Oct 2020 01:58:10 +0000 (UTC)
To: ruby-core@ruby-lang.org
Subject: [ruby-core:100412] [Ruby master Bug#17263] Fiber context switch degrades with number of fibers, limit on number of fibers

Issue #17263 has been updated by ioquatix (Samuel Williams).

On my computer, I found the following. I changed your script to run with the given number of fibers as an argument.
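That modification isn't shown here, but a minimal sketch of it, assuming the `run` method from the script quoted below, is to replace the hard-coded `run(100)` ... `run(100000)` calls with a single call driven by `ARGV`:

```ruby
# Hypothetical change to the benchmark script: take the fiber count from the
# command line (e.g. `./ruby ../test.rb 10000`) instead of the fixed calls
# run(100), run(1000), run(10000), run(100000).
run(Integer(ARGV[0]))
```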
```
> perf stat -e page-faults,cpu-cycles ./ruby ../test.rb 10000
fibers: 10000 rss: 143324 count: 1000000 rate: 1351343.9828743637

 Performance counter stats for './ruby ../test.rb 10000':

            34,693      page-faults:u
     6,565,342,771      cpu-cycles:u

       1.873819017 seconds time elapsed

       1.779125000 seconds user
       0.069029000 seconds sys

> perf stat -e page-faults,cpu-cycles ./ruby ../test.rb 100000
fibers: 100000 rss: 1302460 count: 1000000 rate: 1143930.98944375

 Performance counter stats for './ruby ../test.rb 100000':

           325,269      page-faults:u
     6,896,098,554      cpu-cycles:u

       2.506782962 seconds time elapsed

       1.846807000 seconds user
       0.610185000 seconds sys
```

Even though the cpu-cycles (and user time) are roughly the same, we can see that the system time varies by an order of magnitude. I had to run this several times to get a clear picture, but I believe the overhead is coming from page faults as the stacks are swapped: the more stacks you have, the more page faults you take and the more pressure you put on the L1/L2 cache. I'm not sure if we can verify this further, but one way might be to change the default stack size. I'll play around with it a bit more. Certainly, a CPU with a bigger L1/L2 cache should perform better if this theory is true.

----------------------------------------
Bug #17263: Fiber context switch degrades with number of fibers, limit on number of fibers
https://bugs.ruby-lang.org/issues/17263#change-88027

* Author: ciconia (Sharon Rosner)
* Status: Open
* Priority: Normal
* ruby -v: 2.7.1
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
I'm working on developing [Polyphony](https://github.com/digital-fabric/polyphony), a Ruby gem for writing highly-concurrent Ruby programs with fibers. In the course of my work I have come up against two problems with Ruby fibers:

1. Fiber context switching performance seems to degrade as the number of fibers is increased. This happens both with `Fiber#transfer` and with `Fiber#resume`/`Fiber.yield`.
2. The number of concurrent fibers that can exist at any time seems to be limited. Once a certain number is reached (on my system this seems to be 31744 fibers), calling `Fiber#transfer` will raise a `FiberError` with the message `can't set a guard page: Cannot allocate memory`. This is not due to RAM being saturated: with 10000 fibers, my test program hovers at around 150MB RSS (on Ruby 2.7.1).
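A minimal sketch that reproduces this limit in isolation (each fiber is resumed once so that its stack and guard page are actually allocated; the exact ceiling and the error raised will depend on the system):

```ruby
# frozen_string_literal: true

# Keep creating fibers until stack allocation fails. Each fiber is resumed
# once so that its stack (and guard page) is actually set up, then kept
# alive in the array so nothing is garbage collected.
fibers = []
begin
  loop do
    fiber = Fiber.new { Fiber.yield }
    fiber.resume
    fibers << fiber
  end
rescue FiberError, NoMemoryError => e
  puts "stopped after #{fibers.size} fibers: #{e.class}: #{e.message}"
end
```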
Here's a program for testing the performance of `Fiber#transfer`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

def run(num_fibers)
  count = 0
  GC.start
  GC.disable

  first = nil
  last = nil
  supervisor = Fiber.current
  num_fibers.times do
    fiber = Fiber.new do
      loop do
        count += 1
        if count == 1_000_000
          supervisor.transfer
        else
          Fiber.current.next.transfer
        end
      end
    end

    first ||= fiber
    last.next = fiber if last
    last = fiber
  end

  last.next = first

  t0 = Time.now
  first.transfer
  elapsed = Time.now - t0

  rss = `ps -o rss= -p #{Process.pid}`.to_i

  puts "fibers: #{num_fibers} rss: #{rss} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.6.5 I'm getting:

```
fibers: 100 rss: 23212 count: 1000000 rate: 3357675.1688139187
fibers: 1000 rss: 31292 count: 1000000 rate: 2455537.056439736
fibers: 10000 rss: 127388 count: 1000000 rate: 954251.1674325482
Stopped at 22718 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With Ruby 2.7.1 I'm getting:

```
fibers: 100 rss: 23324 count: 1000000 rate: 3443916.967616508
fibers: 1000 rss: 34676 count: 1000000 rate: 2333315.3862491543
fibers: 10000 rss: 151364 count: 1000000 rate: 916772.1008060966
Stopped at 31744 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With ruby-head I get an almost identical result to that of 2.7.1.

As you can see, the performance degradation is similar in all three versions of Ruby, going from ~3.4M context switches per second for 100 fibers to less than 1M context switches per second for 10000 fibers. Running with 100000 fibers fails to complete.

Here's a program for testing the performance of `Fiber#resume`/`Fiber.yield`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

# This program shows how the performance of Fiber#resume/Fiber.yield degrades
# as the fiber count increases
def run(num_fibers)
  count = 0
  GC.start
  GC.disable

  fibers = []
  num_fibers.times do
    fibers << Fiber.new { loop { Fiber.yield } }
  end

  t0 = Time.now

  while count < 1000000
    fibers.each do |f|
      count += 1
      f.resume
    end
  end

  elapsed = Time.now - t0

  puts "fibers: #{num_fibers} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.7.1 I'm getting the following output:

```
fibers: 100 count: 1000000 rate: 3048230.049946255
fibers: 1000 count: 1000000 rate: 2362235.6455160403
fibers: 10000 count: 1000000 rate: 950251.7621725246
Stopped at 21745 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

As I understand it, at least theoretically, switching between fibers should have a constant cost in terms of CPU cycles, irrespective of the number of fibers currently existing in memory. I am completely ignorant of the implementation details of Ruby fibers, so at least for now I don't have any idea where this problem is coming from.

-- 
https://bugs.ruby-lang.org/