ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:103400] [Ruby master Feature#17795] `before_fork` and `after_fork` callback API
@ 2021-04-12  8:09 jean.boussier
  2021-04-12 13:24 ` [ruby-core:103402] " daniel
                   ` (40 more replies)
  0 siblings, 41 replies; 42+ messages in thread
From: jean.boussier @ 2021-04-12  8:09 UTC (permalink / raw)
  To: ruby-core

Issue #17795 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #17795: `before_fork` and `after_fork` callback API
https://bugs.ruby-lang.org/issues/17795

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Replaces: https://bugs.ruby-lang.org/issues/5446

### Context

Ruby code in production is very often running in a forking setup (puma, unicorn, etc), and it is common some types of libraries to need to know when the Ruby process was forked. For instance:

  - Most database clients, ORMs or other libraries keeping a connection pool might need to close connections before the fork happens.
  - Libraries relying on some kind of dispatcher thread might need to restart the thread in the forked children, and clear any internal buffer (e.g. statsd clients, newrelic_rpm).

**This need is only for forking the whole ruby process, extensions doing a `fork(2) + exec(2)` combo etc are not a concern, this aim at only catching `kernel.fork`, `Process.fork` and maybe `Process.daemon`.**.
The use case is for forks that end up executing Ruby code.

### Current solutions

Right now this use case is handled in several ways.

#### Rely on the integrating code to call a `before_fork` or `after_fork` callback.

Some libraries simply rely on documentation and require the user to use the hooks provided by their forking server.

Examples:

  - Sequel: http://sequel.jeremyevans.net/rdoc/files/doc/fork_safety_rdoc.html
  - Rails's Active Record: https://devcenter.heroku.com/articles/concurrency-and-database-connections#multi-process-servers
  - ScoutAPM (it tries to detect popular forking setup and register itself): https://github.com/scoutapp/scout_apm_ruby/blob/fa83793b9e8d2f9a32c920f59b57d7f198f466b8/lib/scout_apm/environment.rb#L142-L146
  - NewRelic RPM (similarly tries to register to popular forking setups): https://www.rubydoc.info/github/newrelic/rpm/NewRelic%2FAgent:after_fork


#### Continuously check `Process.pid`

Some libraries chose to instead keep the process PID in a variable, and to regularly compare it to `Process.pid` to detect forked children.
Unfortunately `Process.pid` is relatively slow on Linux, and these checks tend to be in tight loops, so it's not uncommon when using these libraries
to spend `1` or `2%` of runtime in `Process.pid`.

Examples:

  - Rails's Active Record used to check `Process.pid` https://github.com/Shopify/rails/blob/411ccbdab2608c62aabdb320d52cb02d446bb39c/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L946, it still does but a bit less: https://github.com/rails/rails/pull/41850
  - the `Typhoeus` HTTP client: https://github.com/typhoeus/typhoeus/blob/a345545e5e4ac0522b883fe0cf19e5e2e807b4b0/lib/typhoeus/pool.rb#L34-L42
  - Redis client: https://github.com/redis/redis-rb/blob/6542934f01b9c390ee450bd372209a04bc3a239b/lib/redis/client.rb#L384
  - Some older versions of NewRelic RPM: https://github.com/opendna/scoreranking-api/blob/8fba96d23b4d3e6b64f625079c184f3a292bbc12/vendor/gems/ruby/1.9.1/gems/newrelic_rpm-3.7.3.204/lib/new_relic/agent/harvester.rb#L39-L41

#### Continuously check `Thread#alive?`

Similar to checking `Process.pid`, but for the background thread use case. `Thread#alive?` is regularly checked, and if the thread is dead, it is assumed that the process was forked.
It's much less costly than a `Process.pid`, but also a bit less reliable as the thread could have died for other reasons. It also delays re-creating the thread to the next check rather than immediately upon forking.

Examples:

  - `statsd-instrument`: https://github.com/Shopify/statsd-instrument/blob/0445cca46e29aa48e9f1efec7c72352aff7ec931/lib/statsd/instrument/batched_udp_sink.rb#L63

#### Decorate `Kernel.fork` and `Process.fork`

Another solution is to prepend a module in `Process` and `Kernel`, to decorate the fork method and implement your own callback. It works well, but is made difficult by `Kernel.fork`.


Examples:

  - Active Support: https://github.com/rails/rails/blob/9aed3dcdfea6b64c18035f8e2622c474ba499846/activesupport/lib/active_support/fork_tracker.rb
  - `dd-trace-rb`: https://github.com/DataDog/dd-trace-rb/blob/793946146b4709289cfd459f3b68e8227a9f5fa7/lib/ddtrace/profiling/ext/forking.rb
  - To some extent, `nakayoshi_fork` decorates the `fork` method: https://github.com/ko1/nakayoshi_fork/blob/19ef5efc51e0ae51d7f5f37a0b785309bf16e97f/lib/nakayoshi_fork.rb

### Proposals

I see two possible features to improve this situation:

#### `Process.before_fork` and `Process.after_fork` callbacks

One solution would be for Ruby to expose a callback API for these two events, similar to `Kernel.at_exit`.

#### Make `Kernel.fork` a delegator

A simpler change would be to just make `Kernel.fork` a delegator to `Process.fork`. This would make it much easier to prepend a module on `Process` for each library to implement its own callback.

Proposed patch: https://github.com/ruby/ruby/pull/4361



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-10-25 10:03 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12  8:09 [ruby-core:103400] [Ruby master Feature#17795] `before_fork` and `after_fork` callback API jean.boussier
2021-04-12 13:24 ` [ruby-core:103402] " daniel
2021-04-12 14:12 ` [ruby-core:103403] " jean.boussier
2021-04-12 14:59 ` [ruby-core:103404] " daniel
2021-04-13  1:32 ` [ruby-core:103410] " muraken
2021-04-13  2:32 ` [ruby-core:103411] " merch-redmine
2021-04-13 10:52 ` [ruby-core:103421] " jean.boussier
2021-04-13 11:16 ` [ruby-core:103423] " eregontp
2021-04-13 14:36 ` [ruby-core:103427] " mame
2021-04-13 14:46 ` [ruby-core:103428] " daniel
2021-04-13 15:17 ` [ruby-core:103431] " jean.boussier
2021-04-13 16:32 ` [ruby-core:103432] " mame
2021-04-13 22:22 ` [ruby-core:103443] " jean.boussier
2021-04-14  6:07 ` [ruby-core:103447] " ko1
2021-04-14  7:34 ` [ruby-core:103448] [Ruby master Feature#17795] Around `Process.fork` callbacks API jean.boussier
2021-04-16  8:03 ` [ruby-core:103476] " akr
2021-04-16  8:07 ` [ruby-core:103478] " jean.boussier
2021-04-16 13:28 ` [ruby-core:103482] " ivo.anjo
2021-04-17  7:56 ` [ruby-core:103495] " mame
2021-04-17 11:02 ` [ruby-core:103499] " jean.boussier
2021-04-17 17:43 ` [ruby-core:103501] " mame
2021-04-19  9:55 ` [ruby-core:103512] " jean.boussier
2021-04-21  7:30 ` [ruby-core:103537] " sam.saffron
2021-04-22 13:21 ` [ruby-core:103550] " jean.boussier
2021-04-22 14:34 ` [ruby-core:103552] " daniel
2021-04-22 15:23 ` [ruby-core:103553] " daniel
2021-04-22 18:58 ` [ruby-core:103558] " jean.boussier
2021-04-23 16:26 ` [ruby-core:103575] " eregontp
2021-05-03 11:34 ` [ruby-core:103697] " eregontp
2021-05-21 17:45 ` [ruby-core:103952] " mame
2021-05-21 21:23 ` [ruby-core:103959] " merch-redmine
2021-06-02  7:46 ` [ruby-core:104135] " ivo.anjo
2021-06-02  8:40 ` [ruby-core:104136] " jean.boussier
2021-07-15  7:18 ` [ruby-core:104614] " matz
2021-07-15  7:29 ` [ruby-core:104615] " mame
2021-07-15 20:15 ` [ruby-core:104624] " jean.boussier
2021-07-15 20:40 ` [ruby-core:104625] " jean.boussier
2021-07-16  6:07 ` [ruby-core:104628] " mame
2021-10-17 19:39 ` [ruby-core:105657] " Dan0042 (Daniel DeLorme)
2021-10-25  2:36 ` [ruby-core:105768] " matz (Yukihiro Matsumoto)
2021-10-25  4:18 ` [ruby-core:105771] " mame (Yusuke Endoh)
2021-10-25 10:03 ` [ruby-core:105785] " ioquatix (Samuel Williams)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).