ruby-core@ruby-lang.org archive (unofficial mirror)
From: "mame (Yusuke Endoh)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:109409] [Ruby master Feature#18885] Long lived fork advisory API (potential Copy on Write optimizations)
Date: Tue, 02 Aug 2022 08:56:42 +0000 (UTC)
Message-ID: <redmine.journal-98558.20220802085642.7941@ruby-lang.org>
In-Reply-To: redmine.issue-18885.20220628132154.7941@ruby-lang.org

Issue #18885 has been updated by mame (Yusuke Endoh).


We discussed this issue at the dev meeting. We did not reach any conclusion, but I'd like to share some comments.

### What and how efficient is this proposal?

Some attendees wanted to see a quantitative evaluation of the benefits this proposal would bring.
@ko1 said that he created nakayoshi_fork as a joke gem. He didn't expect people to use it seriously, and he didn't have serious quantitative measurements.

(I've heard people say that nakayoshi_fork reduced their memory usage, but it would be nice to properly confirm this advantage before introducing the feature.)

### How is it integrated with `Process._fork`?

`Process._fork` was introduced as a zero-argument API. This API is meant to be overridden, so we cannot easily add an argument to it.
If we keep `Process._fork` as is, we need to do some GC processes like nakayoshi_fork *before* the hook of `Process._fork`. Is it OK?
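
For readers unfamiliar with the hook, here is a minimal sketch of how `Process._fork` is usually overridden (Ruby 3.1 or later). The `ForkTrace` module and its `EVENTS` array are hypothetical names used only for illustration. The point is that anything the VM does for a long-lived fork would have to happen before the `:before_fork` step if `_fork` keeps its current zero-argument signature.

```ruby
# Hypothetical tracing hook; only Process._fork itself is a real API.
module ForkTrace
  EVENTS = []

  def _fork
    EVENTS << :before_fork            # runs in the parent, before the real fork
    pid = super                       # performs the actual fork
    EVENTS << (pid.zero? ? :child : :parent)
    pid
  end
end

# Kernel#fork and Process.fork both go through Process._fork.
Process.singleton_class.prepend(ForkTrace)

pid = fork { exit!(0) }
Process.wait(pid)
```

Because the hook takes no arguments, it has no way to learn whether the caller asked for a long-lived fork, which is the integration problem raised above.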

### Are "short-lived" forks needed?

How often are "short-lived" forks used nowadays? The major use case, where `Process.exec` is called shortly after `Process.fork`, is covered by `Process.spawn`.
If there are few use cases for "short-lived" forks, we may change the default behavior to "long-lived".
However, we sometimes use fork in tests, to start a temporary web server, for example. Running GC on every fork might be too heavy.
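
As an illustration of the fork-then-exec case mentioned above, the following sketch runs the same command both ways. The helper names `run_with_fork_exec` and `run_with_spawn` are made up for this example; `fork`, `exec`, `Process.spawn`, and `Process.wait2` are existing APIs.

```ruby
require "rbconfig"

# Classic short-lived fork: the child immediately replaces itself via exec.
def run_with_fork_exec(cmd)
  pid = fork { exec(*cmd) }
  _, status = Process.wait2(pid)
  status.exitstatus
end

# The same effect through a single Process.spawn call.
def run_with_spawn(cmd)
  pid = Process.spawn(*cmd)
  _, status = Process.wait2(pid)
  status.exitstatus
end

cmd = [RbConfig.ruby, "-e", "exit 7"]  # run the current interpreter
run_with_fork_exec(cmd)
run_with_spawn(cmd)
```

In the first form the child lives only long enough to call `exec`, so any GC work done before the fork would be wasted.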

### Is GC called whenever `fork(long_lived: true)` is called?

Here is a typical server code that uses fork:

```
loop do
  sock = servsock.accept
  if fork(long_lived: true)
    ...
  end
end
```

The parent process creates only a socket object in each iteration. It seems wasteful to run a full GC in the parent process every time `fork(long_lived: true)` is called. A more intelligent strategy may be preferable here.
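
One possible shape for such a strategy, purely as an illustration: skip the GC when the parent has allocated little since the last preparation. The `ForkPreparer` class and its threshold are hypothetical and were not discussed at the meeting; `GC.stat(:total_allocated_objects)` is an existing counter.

```ruby
# Hypothetical helper: runs a full GC only if enough objects were
# allocated since the previous preparation.
class ForkPreparer
  def initialize(threshold: 10_000)
    @threshold = threshold
    @last_allocated = nil
  end

  # Returns true if a GC was run, false if it was skipped.
  def prepare
    allocated = GC.stat(:total_allocated_objects)
    if @last_allocated && allocated - @last_allocated < @threshold
      return false
    end

    GC.start
    @last_allocated = GC.stat(:total_allocated_objects)
    true
  end
end

preparer = ForkPreparer.new
preparer.prepare   # first call always collects
preparer.prepare   # skipped unless many objects were allocated in between
```

In the accept loop above, the parent would call `preparer.prepare` before each fork, and most iterations would skip the collection.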

----------------------------------------
Feature #18885: Long lived fork advisory API (potential Copy on Write optimizations)
https://bugs.ruby-lang.org/issues/18885#change-98558

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

It is rather common to deploy Ruby with forking servers. A process first loads the code and data of the application, and then forks a number of workers to handle an incoming workload.
The advantage is that each child has its own GVL and its own GC, so they don't impact each other's latency. The downside, however, is that it uses more memory than using threads or fibers.
That increased memory usage is largely mitigated by Copy on Write, but it's far from perfect. Over time various memory regions will be written into and unshared.

The classic example is object generations: young objects must be promoted to the old generation before forking, otherwise they'll get invalidated on the next GC run. That's what https://github.com/ko1/nakayoshi_fork addresses.

But there are other sources of CoW invalidation that could be addressed by MRI if it had a clear notification when it needs to be done.

### Proposal

MRI could assume that any `fork` may be long lived and perform all the optimizations it can at that point, but it may be preferable to have a dedicated API for that, e.g.

  - `Process.fork(long_lived: true)`
  - `Process.long_lived_fork`
  - `RubyVM.prepare_for_long_lived_fork`

### Potential optimizations

`nakayoshi_fork` already does the following:

  - Do a major GC run to get rid of as many dangling objects as possible.
  - Promote all surviving objects to the highest generation.
  - Compact the heap.

But it would be much simpler to do this from inside the VM rather than do cryptic things such as `4.times { GC.start }` from the Ruby side.
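
A rough Ruby-side sketch of those three steps, to show what the "cryptic" version looks like. This is a simplification and not the gem's actual code; `prepare_heap_for_fork` is a hypothetical name. It relies on the assumption that objects are promoted to the old generation after surviving a few collections, which is why the GC is run several times.

```ruby
# Hypothetical helper approximating what nakayoshi_fork does before forking.
def prepare_heap_for_fork(rounds: 4)
  # Repeated full GCs free dangling objects and let survivors age
  # until they are promoted to the old generation.
  rounds.times { GC.start }

  begin
    # Compaction is not available on every platform or Ruby version.
    GC.compact if GC.respond_to?(:compact)
  rescue NotImplementedError
    # Skip compaction where the platform does not support it.
  end

  GC.stat(:old_objects)  # number of objects now in the old generation
end

prepare_heap_for_fork
pid = fork { exit!(0) }  # stand-in for a long-lived worker
Process.wait(pid)
```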

Also after discussing with @jhawthorn, @tenderlovemaking and @alanwu, we believe this would open the door to several other CoW optimizations:

#### Precompute inline caches

Even though we don't have hard data to prove it, we are convinced that inline caches are a big source of CoW invalidation. Most ISeqs are never invoked during initialization, so child processes are forked with mostly cold caches. As a result, the first time a method is executed in the child, many memory pages holding ISeqs are invalidated as caches get updated.

We think MRI could try to precompute these caches before forking children. Constant caches in particular should be resolvable statically (somewhat related: https://github.com/ruby/ruby/pull/6049).

Method caches are harder to resolve statically, but we can probably apply some heuristics to at least reduce the cache misses.
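
Until the VM can do this itself, the closest application-level approximation is to exercise hot code paths once in the parent so that their caches are already populated when the children are forked. This is only a sketch of that workaround, not the precomputation proposed here; `Greeter`, `WARMUP_TASKS`, and `warm_up_before_fork` are hypothetical names.

```ruby
class Greeter
  PREFIX = "Hello"

  def greet(name)
    "#{PREFIX}, #{name}"   # constant lookup and method calls use inline caches
  end
end

# Hypothetical list of representative calls to run before forking.
WARMUP_TASKS = [
  -> { Greeter.new.greet("warmup") }
]

def warm_up_before_fork(tasks)
  tasks.each(&:call)
end

warm_up_before_fork(WARMUP_TASKS)

# The child now runs greet with caches that were filled in the parent.
pid = fork { Greeter.new.greet("child"); exit!(0) }
Process.wait(pid)
```

The obvious limitation is that the application has to know which paths matter, whereas the VM could in principle walk every ISeq.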

#### Copy on Write aware GC

We could also keep some metadata about which memory pages are shared, or even introduce a "permanent" generation. [The Instagram engineering team introduced something like that in Python](https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf) ([ticket](https://bugs.python.org/issue31558), [PR](https://github.com/python/cpython/pull/3705)).

That makes the GC aware of which objects live on a shared page. With this information the GC can decide not to free dangling objects living on these pages, not to compact these pages, etc.

#### Scan the coderange of all strings

Strings have a lazily computed `coderange` attribute in their flags. So if a string is allocated at boot, but only used after fork, its coderange may be computed and the string mutated.

Using https://github.com/ruby/ruby/pull/6076, I noticed that 58% of the strings retained at the end of the boot sequence had an `UNKNOWN` coderange.

So eagerly scanning the coderange of all strings could also improve Copy on Write performance.
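
A minimal sketch of what such an eager scan could look like from the Ruby side, assuming that `String#valid_encoding?` computes and caches the coderange when it is still unknown. The method name `scan_all_coderanges` is hypothetical; doing this inside the VM would avoid the cost of walking the heap from Ruby.

```ruby
# Hypothetical helper: visit every live String once before forking so its
# coderange is computed in the parent rather than in each child.
def scan_all_coderanges
  scanned = 0
  ObjectSpace.each_object(String) do |str|
    str.valid_encoding?   # return value ignored; the scan is the side effect
    scanned += 1
  end
  scanned
end

scan_all_coderanges
```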




-- 
https://bugs.ruby-lang.org/
