ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: "ioquatix (Samuel Williams) via ruby-core" <ruby-core@ml.ruby-lang.org>
To: ruby-core@ml.ruby-lang.org
Cc: "ioquatix (Samuel Williams)" <noreply@ruby-lang.org>
Subject: [ruby-core:114519] [Ruby master Bug#17263] Fiber context switch degrades with number of fibers, limit on number of fibers
Date: Fri, 25 Aug 2023 00:13:12 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-104315.20230825001312.15231@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-17263.20201015100437.15231@ruby-lang.org

Issue #17263 has been updated by ioquatix (Samuel Williams).


The difference is negligible but there did appear to be some improvement. We obviously need better benchmark tools, because this is total eye-ball statistics, but it's expected that less memory dependency between instructions should be better.

https://github.com/ruby/ruby/pull/8284

----------------------------------------
Bug #17263: Fiber context switch degrades with number of fibers, limit on number of fibers
https://bugs.ruby-lang.org/issues/17263#change-104315

* Author: ciconia (Sharon Rosner)
* Status: Open
* Priority: Normal
* ruby -v: 2.7.1
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
I'm working on developing [Polyphony](https://github.com/digital-fabric/polyphony), a Ruby gem for writing
highly-concurrent Ruby programs with fibers. In the course of my work I have
come up against two problems using Ruby fibers:

1. Fiber context switching performance seem to degrade as the number of fibers
   is increased. This is both with `Fiber#transfer` and
   `Fiber#resume/Fiber.yield`.
2. The number of concurrent fibers that can exist at any time seems to be
   limited. Once a certain number is reached (on my system this seems to be
   31744 fibers), calling `Fiber#transfer` will raise a `FiberError` with the
   message `can't set a guard page: Cannot allocate memory`. This is not due to
   RAM being saturated. With 10000 fibers, my test program hovers at around 150MB
   RSS (on Ruby 2.7.1).

Here's a program for testing the performance of `Fiber#transfer`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

def run(num_fibers)
  count = 0

  GC.start
  GC.disable

  first = nil
  last = nil
  supervisor = Fiber.current
  num_fibers.times do
    fiber = Fiber.new do
      loop do
        count += 1
        if count == 1_000_000
          supervisor.transfer
        else
          Fiber.current.next.transfer
        end
      end
    end
    first ||= fiber
    last.next = fiber if last
    last = fiber
  end

  last.next = first
  
  t0 = Time.now
  first.transfer
  elapsed = Time.now - t0

  rss = `ps -o rss= -p #{Process.pid}`.to_i

  puts "fibers: #{num_fibers} rss: #{rss} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.6.5 I'm getting:

```
fibers: 100 rss: 23212 count: 1000000 rate: 3357675.1688139187
fibers: 1000 rss: 31292 count: 1000000 rate: 2455537.056439736
fibers: 10000 rss: 127388 count: 1000000 rate: 954251.1674325482
Stopped at 22718 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With Ruby 2.7.1 I'm getting:

```
fibers: 100 rss: 23324 count: 1000000 rate: 3443916.967616508
fibers: 1000 rss: 34676 count: 1000000 rate: 2333315.3862491543
fibers: 10000 rss: 151364 count: 1000000 rate: 916772.1008060966
Stopped at 31744 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

With ruby-head I get an almost identical result to that of 2.7.1.

As you can see, the performance degradation is similar in all the three versions
of Ruby, going from ~3.4M context switches per second for 100 fibers to less
then 1M context switches per second for 10000 fibers. Running with 100000 fibers
fails to complete.

Here's a program for testing the performance of `Fiber#resume/Fiber.yield`:

```ruby
# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

# This program shows how the performance of Fiber.transfer degrades as the fiber
# count increases

def run(num_fibers)
  count = 0

  GC.start
  GC.disable

  fibers = []
  num_fibers.times do
    fibers << Fiber.new { loop { Fiber.yield } }
  end

  t0 = Time.now

  while count < 1000000
    fibers.each do |f|
      count += 1
      f.resume
    end
  end

  elapsed = Time.now - t0

  puts "fibers: #{num_fibers} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)
```

With Ruby 2.7.1 I'm getting the following output:

```
fibers: 100 count: 1000000 rate: 3048230.049946255
fibers: 1000 count: 1000000 rate: 2362235.6455160403
fibers: 10000 count: 1000000 rate: 950251.7621725246
Stopped at 21745 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>
```

As I understand it, theoretically at least switching between fibers should have
a constant cost in terms of CPU cycles, irrespective of the number of fibers
currently existing in memory. I am completely ignorant the implementation
details of Ruby fibers, so at least for now I don't have any idea where this
problem is coming from.



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

  parent reply	other threads:[~2023-08-25  0:13 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15 10:04 [ruby-core:100401] [Ruby master Bug#17263] Fiber context switch degrades with number of fibers, limit on number of fibers ciconia
2020-10-15 11:13 ` [ruby-core:100402] " samuel
2020-10-15 11:25 ` [ruby-core:100403] " samuel
2020-10-16  1:58 ` [ruby-core:100412] " samuel
2020-10-16  6:19 ` [ruby-core:100418] " samuel
2020-10-20 20:40 ` [ruby-core:100453] " ciconia
2020-10-22 10:43 ` [ruby-core:100499] " samuel
2022-01-31 14:47 ` [ruby-core:107390] " rmosolgo (Robert Mosolgo)
2023-08-25  0:13 ` ioquatix (Samuel Williams) via ruby-core [this message]
2023-08-25  0:13 ` [ruby-core:114520] " ioquatix (Samuel Williams) via ruby-core
2023-08-25  3:16 ` [ruby-core:114523] " ioquatix (Samuel Williams) via ruby-core
2023-08-25  3:39 ` [ruby-core:114524] " ioquatix (Samuel Williams) via ruby-core
2023-08-25  4:28 ` [ruby-core:114525] " ioquatix (Samuel Williams) via ruby-core
2023-09-18  8:21 ` [ruby-core:114794] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-104315.20230825001312.15231@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    --cc=noreply@ruby-lang.org \
    --cc=ruby-core@ml.ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).