ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:43732] [ruby-trunk - Feature #5446][Feedback] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
@ 2012-03-27 13:45 ` mame (Yusuke Endoh)
  2012-04-09  4:34   ` [ruby-core:44209] " Eric Wong
  2013-11-05 21:48 ` [ruby-core:58175] [ruby-trunk - Feature #5446] " sam.saffron (Sam Saffron)
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 30+ messages in thread
From: mame (Yusuke Endoh) @ 2012-03-27 13:45 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by mame (Yusuke Endoh).

Status changed from Open to Feedback

Eric Wong,

Do you still want this feature?
If so, could you answer kosaki's comment?


OT: We noticed and surprised at your ID (normalperson) at the recent
developers' meeting in Akihabara.  Clearly, you are greatperson :-)

-- 
Yusuke Endoh <mame@tsg.ne.jp>
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-25238

Author: normalperson (Eric Wong)
Status: Feedback
Priority: Low
Assignee: 
Category: 
Target version: 


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:44209] Re: [ruby-trunk - Feature #5446][Feedback] at_fork callback API
  2012-03-27 13:45 ` [ruby-core:43732] [ruby-trunk - Feature #5446][Feedback] at_fork callback API mame (Yusuke Endoh)
@ 2012-04-09  4:34   ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2012-04-09  4:34 UTC (permalink / raw
  To: ruby-core

"mame (Yusuke Endoh)" <mame@tsg.ne.jp> wrote:
> Do you still want this feature?

Yes, but lower priority.

I think default to IO#close_on_exec=true for 2.0.0 makes this
less important.

> If so, could you answer kosaki's comment?

kosaki wrote:
> As you know, we can only call asynchronous-signal-safe function
> between fork and exec when the process is multi threaded. but ruby
> code invocation definitely need to use malloc which not
> async-signal-safe. so, it's pretty hard to implement. Am I missing
> something?

I can't edit the existing ticket, but I think _only_ Kernel#fork should
be touched.  Methods that call exec after fork will already get things
cleaned up from close_on_exec.

If there is a _Ruby_ call to exec, then we already have a chance to use
non-async-signal safe code.

It could be implemented in pure Ruby, even.  This is a prototype (using
xfork name) intead:

	ATFORK = {
	  :prepare => [ lambda { puts ":prepare in #$$" } ],
	  :parent => [ lambda { puts ":parent in #$$" } ],
	  :child => [ lambda { puts ":child in #$$" } ],
	}

	def xfork
	  ATFORK[:prepare].each { |code| code.call }
	  if block_given?
	    pid = fork do
	      ATFORK[:child].each { |code| code.call }
	      yield
	    end
	    ATFORK[:parent].each { |code| code.call }
	  else
	    case pid = fork
	    when nil
	      ATFORK[:child].each { |code| code.call }
	    when Integer
	      ATFORK[:parent].each { |code| code.call }
	    end
	  end

	  pid
	end


I haven't thought of an API to manipulate the ATFORK arrays
with.  I don't want to emulate pthread_atfork() directly, it's
too cumbersome for Ruby.  Perhaps:

	at_fork(:prepare) { ... }
	at_fork(:child) { ... }
	at_fork(:parent) { ... }

> OT: We noticed and surprised at your ID (normalperson) at the recent
> developers' meeting in Akihabara.  Clearly, you are greatperson :-)

I don't think of myself as great.  But if others think I'm great,
they should try to be like me, then we'll all be normal :)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58175] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
  2012-03-27 13:45 ` [ruby-core:43732] [ruby-trunk - Feature #5446][Feedback] at_fork callback API mame (Yusuke Endoh)
@ 2013-11-05 21:48 ` sam.saffron (Sam Saffron)
  2013-11-06  1:38 ` [ruby-core:58195] " jasonrclark (Jason Clark)
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: sam.saffron (Sam Saffron) @ 2013-11-05 21:48 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by sam.saffron (Sam Saffron).


This is a critical feature for Ruby imho, at the moment there are 100 mechanisms for at_fork, we need a clean, supported ordered one. 

I think there should be strong parity with at_exit, so am not particularly fond of the symbol param. I would like. 

at_fork{ } # returns proc, a stack of procs that are called in order just before fork
after_fork {} # returns a proc, a stack of procs that are called in order just after child process launches 

The prepare thing for me seems like overkill. It would be like we are implementing 2 queues, one urgent, one less urgent. I don't really see the point. 

--

There is a question of cancelling forks (eg: what happens when an exception is thrown during these callbacks?) 
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-42762

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58195] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
  2012-03-27 13:45 ` [ruby-core:43732] [ruby-trunk - Feature #5446][Feedback] at_fork callback API mame (Yusuke Endoh)
  2013-11-05 21:48 ` [ruby-core:58175] [ruby-trunk - Feature #5446] " sam.saffron (Sam Saffron)
@ 2013-11-06  1:38 ` jasonrclark (Jason Clark)
  2013-11-30 21:53 ` [ruby-core:58717] " tmm1 (Aman Gupta)
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: jasonrclark (Jason Clark) @ 2013-11-06  1:38 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by jasonrclark (Jason Clark).


I'd love to see this added. Gems using threads (like newrelic_rpm) have a lot of potential for deadlocks when forking happens. This would gives a nice mechanism for dealing with those issues more generally, rather than having to hook things gem-by-gem like we do today.

New Relic + Resque has seen a lot of these types of problems, some of which are documented at https://github.com/resque/resque/issues/1101. While Resque has gem-specific hooks we lean on, having those hooks be at the Ruby level instead would be awesome.

Is there any possibility of this type of after_fork hook could also apply to daemonizing with Process.daemon? While we have fewer deadlocks, we often lose visibility after processes daemonize because we don't know to start our threads back up in the daemonized process. (See https://github.com/puma/puma/issues/335 for an example with the Puma web server).
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-42778

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58717] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2013-11-06  1:38 ` [ruby-core:58195] " jasonrclark (Jason Clark)
@ 2013-11-30 21:53 ` tmm1 (Aman Gupta)
  2013-11-30 21:54 ` [ruby-core:58719] " tmm1 (Aman Gupta)
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-30 21:53 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


Here is another example of a gem which implement its own fork hooks: https://github.com/zk-ruby/zk/blob/master/lib/zk/fork_hook.rb
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43288

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58719] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2013-11-30 21:53 ` [ruby-core:58717] " tmm1 (Aman Gupta)
@ 2013-11-30 21:54 ` tmm1 (Aman Gupta)
  2013-11-30 23:05   ` [ruby-core:58722] " Eric Wong
  2013-12-05 22:37   ` [ruby-core:58882] " KOSAKI Motohiro
  2013-11-30 22:12 ` [ruby-core:58721] " Eregon (Benoit Daloze)
                   ` (14 subsequent siblings)
  19 siblings, 2 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-30 21:54 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


Simple implementation of an after_fork{} hook: https://github.com/tmm1/ruby/commit/711a68b6599d176c5bcb4ef0c90aa195a290d1c0

Any objection?
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43290

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58721] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2013-11-30 21:54 ` [ruby-core:58719] " tmm1 (Aman Gupta)
@ 2013-11-30 22:12 ` Eregon (Benoit Daloze)
  2013-12-01  0:25 ` [ruby-core:58725] " tmm1 (Aman Gupta)
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: Eregon (Benoit Daloze) @ 2013-11-30 22:12 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by Eregon (Benoit Daloze).


tmm1 (Aman Gupta) wrote:
> Simple implementation of an after_fork{} hook: https://github.com/tmm1/ruby/commit/711a68b6599d176c5bcb4ef0c90aa195a290d1c0
> 
> Any objection?

Sounds good and useful to me!
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43292

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58722] Re: [ruby-trunk - Feature #5446] at_fork callback API
  2013-11-30 21:54 ` [ruby-core:58719] " tmm1 (Aman Gupta)
@ 2013-11-30 23:05   ` Eric Wong
  2013-12-06 21:31     ` [ruby-core:58922] " Eric Wong
  2013-12-05 22:37   ` [ruby-core:58882] " KOSAKI Motohiro
  1 sibling, 1 reply; 30+ messages in thread
From: Eric Wong @ 2013-11-30 23:05 UTC (permalink / raw
  To: Ruby developers

"tmm1 (Aman Gupta)" <ruby@tmm1.net> wrote:
> Simple implementation of an after_fork{} hook: https://github.com/tmm1/ruby/commit/711a68b6599d176c5bcb4ef0c90aa195a290d1c0
> 
> Any objection?

I think there needs to be separate hooks for prepare/parent/child
(like pthread_atfork(3)).  The parent may need to release resources
before forking (prepare hook), and perhaps reacquire/reinitialize
them after forking (parent hook).

The prepare hook is important for things like DB connections;
the parent hook might be less useful (especially for apps which
fork repeatedly).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58725] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2013-11-30 22:12 ` [ruby-core:58721] " Eregon (Benoit Daloze)
@ 2013-12-01  0:25 ` tmm1 (Aman Gupta)
  2013-12-06  6:37 ` [ruby-core:58898] " tmm1 (Aman Gupta)
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-12-01  0:25 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


I'd like to add a before_fork{} hook that fires in the parent before the fork. An after_fork hook in the parent seems unnecessary.
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43295

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58882] Re: [ruby-trunk - Feature #5446] at_fork callback API
  2013-11-30 21:54 ` [ruby-core:58719] " tmm1 (Aman Gupta)
  2013-11-30 23:05   ` [ruby-core:58722] " Eric Wong
@ 2013-12-05 22:37   ` KOSAKI Motohiro
  1 sibling, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2013-12-05 22:37 UTC (permalink / raw
  To: Ruby developers

2013/11/30 tmm1 (Aman Gupta) <ruby@tmm1.net>:
>
> Issue #5446 has been updated by tmm1 (Aman Gupta).
>
>
> Simple implementation of an after_fork{} hook: https://github.com/tmm1/ruby/commit/711a68b6599d176c5bcb4ef0c90aa195a290d1c0
>
> Any objection?

??!?!?

1. Why no before_fork?
2. zk has :after_child nad :after_parent hook and your patch don't. Why?
3. Why should the new method return proc?
4. When rb_daemon() is used, some platform call after_fork once and
    the other call twice. It seems useless.
5. Why do hook fire at rb_thread_atfork() make a lot of new array?
6. Your patch doesn't aim to remove the hooks and I'm sure it is required.

You said the new method should be created for killing the gem specific hooks.
But your patch seems not to be able to.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58898] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2013-12-01  0:25 ` [ruby-core:58725] " tmm1 (Aman Gupta)
@ 2013-12-06  6:37 ` tmm1 (Aman Gupta)
  2013-12-06 22:29 ` [ruby-core:58925] " tmm1 (Aman Gupta)
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-12-06  6:37 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


Thanks for your feedback.

>  1. Why no before_fork?

I planned to add this in the same way as after_fork, as long as there are no issues with my patch.
I wasn't sure if rb_vm_t is the best place for the new hooks.
If it seems sane, I can expand the patch.

>  2. zk has :after_child nad :after_parent hook and your patch don't. Why?

With two hooks, before_fork/after_fork methods are enough. If we want to implement three hooks, should we add three new methods?

>  3. Why should the new method return proc?

I guessed the return value can be used to de-register the hook later. But there is no way to do this currently.

> 4. When rb_daemon() is used, some platform call after_fork once and
>     the other call twice. It seems useless.

Ah, good point. Maybe we need a flag to ensure only one invocation.

>  5. Why do hook fire at rb_thread_atfork() make a lot of new array?

No reason. Patch was a simple proof of concept for feedback.

>  6. Your patch doesn't aim to remove the hooks and I'm sure it is required.

Do you have any API ideas for this? I agree we need some way to remove the hooks.

One option is to use tracepoint api:

tp = TracePoint.new(:before_fork, :after_fork_child){ ... }
tp.enable
tp.disable

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43450

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58922] Re: [ruby-trunk - Feature #5446] at_fork callback API
  2013-11-30 23:05   ` [ruby-core:58722] " Eric Wong
@ 2013-12-06 21:31     ` Eric Wong
  2013-12-07  4:41       ` [ruby-core:58934] " Tanaka Akira
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Wong @ 2013-12-06 21:31 UTC (permalink / raw
  To: Ruby developers

Eric Wong <normalperson@yhbt.net> wrote:
> the parent hook might be less useful (especially for apps which
> fork repeatedly).

I take that back.  It can be useful for for setting up pipes/sockets for IPC
between parent and child.

Example without wrapper class and explicit IO#close:

    r, w = IO.pipe
    pid = fork do
      w.close # could be atfork_child
      ...
    end
    r.close # could be atfork_parent
    ...

However, I want to do this via callback, example with Worker class:

    class Worker
      attr_writer :pid

      def initialize
        @r, @w = IO.pipe
        Process.atfork_parent { @r.close unless @r.closed? }
        Process.atfork_child { @w.close unless @w.closed? }
      end
    end

    worker = Worker.new # IO.pipe
    worker.pid = fork { ... }
    ...

    # No need to remember what to close in parent/child

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58925] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2013-12-06  6:37 ` [ruby-core:58898] " tmm1 (Aman Gupta)
@ 2013-12-06 22:29 ` tmm1 (Aman Gupta)
  2013-12-07  4:02 ` [ruby-core:58933] " tmm1 (Aman Gupta)
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-12-06 22:29 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


Ok good point. I agree we should add all three if we're going to do this.

I like the Process.atfork_parent{} API in your example. If we add three methods to Process, the only thing missing is a way to un-register a hook.
----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43474

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58933] [ruby-trunk - Feature #5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (8 preceding siblings ...)
  2013-12-06 22:29 ` [ruby-core:58925] " tmm1 (Aman Gupta)
@ 2013-12-07  4:02 ` tmm1 (Aman Gupta)
  2016-11-02  2:46 ` [ruby-core:77860] [Ruby trunk Feature#5446] " sam.saffron
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-12-07  4:02 UTC (permalink / raw
  To: ruby-core


Issue #5446 has been updated by tmm1 (Aman Gupta).


Another idea:

  Process.at_fork{ |loc| case loc; when :before_fork; ... end } -> proc
  Process.remove_at_fork(proc)

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-43483

Author: normalperson (Eric Wong)
Status: Assigned
Priority: Low
Assignee: kosaki (Motohiro KOSAKI)
Category: 
Target version: next minor


It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58934] Re: [ruby-trunk - Feature #5446] at_fork callback API
  2013-12-06 21:31     ` [ruby-core:58922] " Eric Wong
@ 2013-12-07  4:41       ` Tanaka Akira
  2013-12-08  6:49         ` [ruby-core:58961] " Eric Wong
  0 siblings, 1 reply; 30+ messages in thread
From: Tanaka Akira @ 2013-12-07  4:41 UTC (permalink / raw
  To: Ruby developers

2013/12/7 Eric Wong <normalperson@yhbt.net>:

> However, I want to do this via callback, example with Worker class:
>
>     class Worker
>       attr_writer :pid
>
>       def initialize
>         @r, @w = IO.pipe
>         Process.atfork_parent { @r.close unless @r.closed? }
>         Process.atfork_child { @w.close unless @w.closed? }
>       end
>     end
>
>     worker = Worker.new # IO.pipe
>     worker.pid = fork { ... }
>     ...
>
>     # No need to remember what to close in parent/child

I think it doesn't work with multiple thread.

2.times {
  Thread.new {
    worker = Worker.new # IO.pipe
    worker.pid = fork { ... }
    ...
  }
}

If fork for worker 1 is called between IO.pipe and fork for worker 2,
pipes for worker 2 is leaked for the process for worker 1 and
not inherited to the process for worker 2.

I feel it is not a good example for this issue.
-- 
Tanaka Akira

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:58961] Re: [ruby-trunk - Feature #5446] at_fork callback API
  2013-12-07  4:41       ` [ruby-core:58934] " Tanaka Akira
@ 2013-12-08  6:49         ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2013-12-08  6:49 UTC (permalink / raw
  To: Ruby developers

Tanaka Akira <akr@fsij.org> wrote:
> 2013/12/7 Eric Wong <normalperson@yhbt.net>:
> > However, I want to do this via callback, example with Worker class:
> >
> >     class Worker
> >       attr_writer :pid
> >
> >       def initialize
> >         @r, @w = IO.pipe
> >         Process.atfork_parent { @r.close unless @r.closed? }
> >         Process.atfork_child { @w.close unless @w.closed? }
> >       end
> >     end
> >
> >     worker = Worker.new # IO.pipe
> >     worker.pid = fork { ... }
> >     ...
> >
> >     # No need to remember what to close in parent/child
> 
> I think it doesn't work with multiple thread.

True, but I wasn't intending this example to be used for an MT
application, but a single-threaded, multi-process HTTP server.

Generally, I do not use fork after I've spawned threads (unless
followed immediately with exec).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:77860] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (9 preceding siblings ...)
  2013-12-07  4:02 ` [ruby-core:58933] " tmm1 (Aman Gupta)
@ 2016-11-02  2:46 ` sam.saffron
  2016-11-05 17:06 ` [ruby-core:78004] " email
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: sam.saffron @ 2016-11-02  2:46 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Sam Saffron.


ping, any plans to add these apis, they will help clean up a lot in the Ruby ecosystem, just hit it again today. 

I think what is really needed here is a conceptual Yes or No from Koichi or Matz

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-61175

* Author: Eric Wong
* Status: Assigned
* Priority: Normal
* Assignee: Motohiro KOSAKI
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:78004] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (10 preceding siblings ...)
  2016-11-02  2:46 ` [ruby-core:77860] [Ruby trunk Feature#5446] " sam.saffron
@ 2016-11-05 17:06 ` email
  2016-12-21 15:08 ` [ruby-core:78786] " shyouhei
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: email @ 2016-11-05 17:06 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Petr Chalupa.


We had to get around missing at_fork in concurrent-ruby as well. Since we are avoiding monkey_patching we had to result in lazy-checking pid, when it changes we re-create a resource. The resource has to check on every usage though which is not good.

Having Process.remove_at_fork(proc) is critical, an user of at_fork has to be able to unregister callback when the resource is not used any further.

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-61344

* Author: Eric Wong
* Status: Assigned
* Priority: Normal
* Assignee: Motohiro KOSAKI
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:78786] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (11 preceding siblings ...)
  2016-11-05 17:06 ` [ruby-core:78004] " email
@ 2016-12-21 15:08 ` shyouhei
  2016-12-21 17:42 ` [ruby-core:78792] " eregontp
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: shyouhei @ 2016-12-21 15:08 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Shyouhei Urabe.


We looked at this issue at today's developer meeting.

I was told that async-signal-safety is no longer the issue because in comment #4 the OP already restricted the request to fork method that runs ruby code. That's a good thing to know.

Then, given repeated request of this functionality, why not start this feature as a gem?  Because we focus on fork that runs ruby code, the needed functionality is doable in pure ruby I think.  Is there any reason this should be done by the language core (apart from monkey-paytching ugliness)?

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-62193

* Author: Eric Wong
* Status: Assigned
* Priority: Normal
* Assignee: Motohiro KOSAKI
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:78792] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (12 preceding siblings ...)
  2016-12-21 15:08 ` [ruby-core:78786] " shyouhei
@ 2016-12-21 17:42 ` eregontp
  2018-07-19  6:29 ` [ruby-core:88003] " ashe
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 30+ messages in thread
From: eregontp @ 2016-12-21 17:42 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Benoit Daloze.


> Is there any reason this should be done by the language core (apart from monkey-paytching ugliness)?

Yes, Ruby code cannot change e.g. rb_f_fork().
There is already atfork-related logic and it seems brittle to override something as essential as #fork.
Since libraries which need this feature currently monkey-patch #fork, it means they do not compose well.
Monkey-patching is considered very fragile for this, I guess that's why no gem for it exists or it would not be used.

If monkey-patching is not used (because of above problems),
the cost of checking on every access becomes a problem for maintenance and performance.
Sometimes, it even means a library decides to be thread-safe but not process-safe:
https://github.com/jeremyevans/sequel/issues/691

Another example is Passenger and other web servers which have their own hook for after_fork.
This means the application is now forced to keep the same web server, or change the hook (very easy to forget).

Let's add Process.at_fork and Process.remove_at_fork,
to encourage more responsible programming.
The evidence for the need of a common API for this is obvious.

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-62200

* Author: Eric Wong
* Status: Assigned
* Priority: Normal
* Assignee: Motohiro KOSAKI
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88003] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (13 preceding siblings ...)
  2016-12-21 17:42 ` [ruby-core:78792] " eregontp
@ 2018-07-19  6:29 ` ashe
  2018-07-19 17:37   ` [ruby-core:88008] " Eric Wong
  2018-07-19 20:16 ` [ruby-core:88011] " eregontp
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 30+ messages in thread
From: ashe @ 2018-07-19  6:29 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by kivikakk (Ashe Connor).


Are we still interested in pursuing something like this? I'd be happy to push it forward.

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73019

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88008] Re: [Ruby trunk Feature#5446] at_fork callback API
  2018-07-19  6:29 ` [ruby-core:88003] " ashe
@ 2018-07-19 17:37   ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2018-07-19 17:37 UTC (permalink / raw
  To: ruby-core

ashe@kivikakk.ee wrote:
> Are we still interested in pursuing something like this? I'd
> be happy to push it forward.

I'm not sure it's necessary, anymore.  Most programs and libs
already have workarounds at this point.  pthread_atfork can also
be the source of surprising behaviors (thread/async-signal
safety), so Austin group proposed _Fork function:
http://austingroupbugs.net/view.php?id=62

> https://bugs.ruby-lang.org/issues/5446#change-73019

> These callbacks should be called whenever fork() is used:

> - IO.popen
> - ``
> - Kernel#system

Furthermore, many platforms have vfork (it's possible to support
vfork without MMU) and we favor using that for process spawning,
so the following is unneccessary:

> I also want to consider handlers that only need to be called for plain
> fork() use (without immediate exec() afterwards, like with `` and system()).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88011] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (14 preceding siblings ...)
  2018-07-19  6:29 ` [ruby-core:88003] " ashe
@ 2018-07-19 20:16 ` eregontp
  2018-07-20 10:14   ` [ruby-core:88015] " Eric Wong
  2018-07-20 12:06 ` [ruby-core:88017] " eregontp
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 30+ messages in thread
From: eregontp @ 2018-07-19 20:16 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Eregon (Benoit Daloze).


IMHO, this is is still very much needed for cases using plain fork().
The default close-on-exec behavior is irrelevant for fork() without exec() and does not help.
Inheriting, e.g., a database connection in such a case is extremely harmful (data corruption or leaking sensitive data) and a common pitfall.
Having a nice at_fork hook would allow to safely disconnect the database in each forked process to avoid accidental file descriptor sharing between forks.

Sequel:
https://github.com/jeremyevans/sequel/issues/691

Passenger has PhusionPassenger.on_event(:starting_worker_process).
They give the example of memcached and by default use it to reestablish ActiveRecord connections:
https://www.phusionpassenger.com/library/indepth/ruby/spawn_methods/#unintentional-file-descriptor-sharing

Puma has before_fork:
https://github.com/puma/puma/pull/754/files

Unicorn implements its own before_fork/after_fork:
https://www.rubydoc.info/gems/unicorn/5.0.1/Unicorn%2FConfigurator:after_fork
@normalperson Don't you think this deserves to be in core?

I would guess all of these do not work if rb_fork() is called.
What more motivation do we need?
Wait that a major incident happen due to unintentional fd sharing because database libraries cannot protect themselves from fork()?

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73029

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88015] Re: [Ruby trunk Feature#5446] at_fork callback API
  2018-07-19 20:16 ` [ruby-core:88011] " eregontp
@ 2018-07-20 10:14   ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2018-07-20 10:14 UTC (permalink / raw
  To: ruby-core

eregontp@gmail.com wrote:
> IMHO, this is is still very much needed for cases using plain fork().
> The default close-on-exec behavior is irrelevant for fork() without exec() and does not help.

Agreed.

> Inheriting, e.g., a database connection in such a case is
> extremely harmful (data corruption or leaking sensitive data)
> and a common pitfall.

It's been a known problem for decades, now (at least since the
days of mod_perl + DBI on Apache 1.x); and AFAIK there's no data
leaks from it.  Anybody who makes that mistake would likely
raise/crash/burn long before seeing, much less leaking sensitive
data.

> Having a nice at_fork hook would allow to safely disconnect the database in each forked process to avoid accidental file descriptor sharing between forks.
> 
> Sequel:
> https://github.com/jeremyevans/sequel/issues/691

I agree with Jeremy on this; it will likely cause new problems
and surprises if libraries use it.

> Passenger has PhusionPassenger.on_event(:starting_worker_process).
> They give the example of memcached and by default use it to reestablish ActiveRecord connections:
> https://www.phusionpassenger.com/library/indepth/ruby/spawn_methods/#unintentional-file-descriptor-sharing
> 
> Puma has before_fork:
> https://github.com/puma/puma/pull/754/files

Exactly my point, everything relevant which forks has documented
and supported means to disconnect/restart connections.

> @normalperson Don't you think this deserves to be in core?

Not anymore.

> I would guess all of these do not work if rb_fork() is called.
> What more motivation do we need?
> Wait that a major incident happen due to unintentional fd
> sharing because database libraries cannot protect themselves
> from fork()?

Again, this is a known problem with known workarounds for
decades, now.  I expect nobody has been able to get close to
production or dealing with important data with such a broken
setup.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88017] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (15 preceding siblings ...)
  2018-07-19 20:16 ` [ruby-core:88011] " eregontp
@ 2018-07-20 12:06 ` eregontp
  2018-07-20 20:43   ` [ruby-core:88034] " Eric Wong
  2018-07-23 22:51 ` [ruby-core:88065] " tenderlove
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 30+ messages in thread
From: eregontp @ 2018-07-20 12:06 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by Eregon (Benoit Daloze).


normalperson (Eric Wong) wrote:
>  It's been a known problem for decades, now (at least since the
>  days of mod_perl + DBI on Apache 1.x); and AFAIK there's no data
>  leaks from it.  Anybody who makes that mistake would likely
>  raise/crash/burn long before seeing, much less leaking sensitive data.

Yes, it's not a new problem.
I disagree about no production leaks, because it happened to me on a website running for a national programming contest.
For most of the contest it was fine as one process was able to handle the load,
but at some point the webserver decided to spawn another process by forking,
people starting seeing each's other solution, the scores were corrupted and everyone was puzzled as to what happened.
We had to stop the contest due to this.

I want to help protect future programmers from such bugs, if at all possible.
And I believe it's possible.

>  I agree with Jeremy on this; it will likely cause new problems
>  and surprises if libraries use it.

Let's design it so it doesn't.
What's the harm/surprise to reconnect in at_fork(:after_child) for instance?

The current hooks are webserver-specific and so migrating between unicorn/puma/passenger/etc means
it's quite easy to forget to adapt to the new webserver hook, which would trigger this bug.
Especially if e.g., using ActiveRecord with Passenger, where the hook is automatic and migrating to another forking webserver.
Having standard hooks solves this, also works for rb_f_fork(),
and empowers libraries to solve this issue (not just DB libraries but also concurrent-ruby (see above), etc).

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73036

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88034] Re: [Ruby trunk Feature#5446] at_fork callback API
  2018-07-20 12:06 ` [ruby-core:88017] " eregontp
@ 2018-07-20 20:43   ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2018-07-20 20:43 UTC (permalink / raw
  To: ruby-core

eregontp@gmail.com wrote:
> normalperson (Eric Wong) wrote:
> >  It's been a known problem for decades, now (at least since the
> >  days of mod_perl + DBI on Apache 1.x); and AFAIK there's no data
> >  leaks from it.  Anybody who makes that mistake would likely
> >  raise/crash/burn long before seeing, much less leaking sensitive data.
> 
> Yes, it's not a new problem.
> I disagree about no production leaks, because it happened to me on a website running for a national programming contest.
> For most of the contest it was fine as one process was able to handle the load,
> but at some point the webserver decided to spawn another process by forking,
> people starting seeing each's other solution, the scores were corrupted and everyone was puzzled as to what happened.
> We had to stop the contest due to this.

fork is full of caveats; using atfork hook to work around one
caveat out of many is not a solution.  The solution is knowing
the caveats of the tools you use.

In your case, it seemed like you were not paying attention to
the server setup at all and would not have known to use atfork
hook regardless if it was in the webserver or core.

> I want to help protect future programmers from such bugs, if at all possible.
> And I believe it's possible.
> 
> >  I agree with Jeremy on this; it will likely cause new problems
> >  and surprises if libraries use it.
> 
> Let's design it so it doesn't.
> What's the harm/surprise to reconnect in at_fork(:after_child) for instance?

It's a waste of time and resources when the child has no need for
a connection at all.  Simply put, library authors have no clue
how an application will use/manage processes and would
(rightfully) not bother using such hooks.

Also, consider that pthread_atfork has been around for many
years, it's not adopted by library authors (of C/C++ libraries)
because of problems surrounding it; and POSIX is even
considering deprecating pthread_atfork[1].

How about an alternate proposal?

	Introduce a new object_id-like identifier which changes
	across fork: Thread.current.thread_id

It doesn't penalize platforms without fork, and can work well
with existing thread-aware code.

IMHO, Thread.current.object_id being stable in forked child
isn't good; but I expect compatibility problems if we change it
at this point.  At least some usages of monitor.rb would
break.

> The current hooks are webserver-specific and so migrating
> between unicorn/puma/passenger/etc means it's quite easy to
> forget to adapt to the new webserver hook, which would trigger
> this bug.

I hate the amount of vendor lock-in each webserver has.
But making hooks which library authors can fire unpredictably
on application authors is worse, especially if there's no
"opt-out".


[1] http://austingroupbugs.net/view.php?id=858

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88065] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (16 preceding siblings ...)
  2018-07-20 12:06 ` [ruby-core:88017] " eregontp
@ 2018-07-23 22:51 ` tenderlove
  2018-07-24  4:35   ` [ruby-core:88067] Re: [Feature #5446] identify execution context to avoid inadvertant sharing Eric Wong
  2018-07-24  4:41 ` [ruby-core:88068] [Ruby trunk Feature#5446] at_fork callback API merch-redmine
  2018-07-24  4:57 ` [ruby-core:88071] " merch-redmine
  19 siblings, 1 reply; 30+ messages in thread
From: tenderlove @ 2018-07-23 22:51 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by tenderlovemaking (Aaron Patterson).


normalperson (Eric Wong) wrote:
> eregontp@gmail.com wrote:
>  > normalperson (Eric Wong) wrote:
>  > >  It's been a known problem for decades, now (at least since the
>  > >  days of mod_perl + DBI on Apache 1.x); and AFAIK there's no data
>  > >  leaks from it.  Anybody who makes that mistake would likely
>  > >  raise/crash/burn long before seeing, much less leaking sensitive data.
>  > 
>  > Yes, it's not a new problem.
>  > I disagree about no production leaks, because it happened to me on a website running for a national programming contest.
>  > For most of the contest it was fine as one process was able to handle the load,
>  > but at some point the webserver decided to spawn another process by forking,
>  > people starting seeing each's other solution, the scores were corrupted and everyone was puzzled as to what happened.
>  > We had to stop the contest due to this.

I've experienced a data leak similar to this.  Saying the impact was "catastrophic" would be an understatement. :(

>  fork is full of caveats; using atfork hook to work around one
>  caveat out of many is not a solution.  The solution is knowing
>  the caveats of the tools you use.
>  
>  In your case, it seemed like you were not paying attention to
>  the server setup at all and would not have known to use atfork
>  hook regardless if it was in the webserver or core.
>  
>  > I want to help protect future programmers from such bugs, if at all possible.
>  > And I believe it's possible.
>  > 
>  > >  I agree with Jeremy on this; it will likely cause new problems
>  > >  and surprises if libraries use it.
>  > 
>  > Let's design it so it doesn't.
>  > What's the harm/surprise to reconnect in at_fork(:after_child) for instance?
>  
>  It's a waste of time and resources when the child has no need for
>  a connection at all.  Simply put, library authors have no clue
>  how an application will use/manage processes and would
>  (rightfully) not bother using such hooks.

I think library authors can make things easier though.  Web frameworks, like Rails for example, are expected to handle this situation for the user.  In addition, say a library author provided no such feature like Sequel, how would a user know they need to call `DB.disconnect` after a fork?  Are they responsible for completely understanding the implementation of the library they are using?  Even if an end user called `DB.disconnect` in an after fork hook, what if that wasn't enough?  How would an end user know what needs to be called?

>  Also, consider that pthread_atfork has been around for many
>  years, it's not adopted by library authors (of C/C++ libraries)
>  because of problems surrounding it; and POSIX is even
>  considering deprecating pthread_atfork[1].
>  
>  How about an alternate proposal?
>  
>  	Introduce a new object_id-like identifier which changes
>  	across fork: Thread.current.thread_id
>  
>  It doesn't penalize platforms without fork, and can work well
>  with existing thread-aware code.

I think this is a good idea, but I'm not sure it addresses the communication issue I brought up.  IMO it would be great to have some sort of hook so that library authors can dictate what "the right thing to do" is after a fork (maybe there are other resources or caches that need to be cleaned, and maybe that changes from version to version).

>  IMHO, Thread.current.object_id being stable in forked child
>  isn't good; but I expect compatibility problems if we change it
>  at this point.  At least some usages of monitor.rb would
>  break.
>  
>  > The current hooks are webserver-specific and so migrating
>  > between unicorn/puma/passenger/etc means it's quite easy to
>  > forget to adapt to the new webserver hook, which would trigger
>  > this bug.
>  
>  I hate the amount of vendor lock-in each webserver has.
>  But making hooks which library authors can fire unpredictably
>  on application authors is worse, especially if there's no
>  "opt-out".

I think requiring users to specify a db disconnect after fork causes even more "vendor lock-in".  Lets say I did add the after fork code to deal with Sequel, but now I want to switch to a threaded webserver.  Now I have to do more work to figure out what's required (if anything) in a threaded environment.  It puts the onus on the app dev to figure out what's right for a particular environment, and that means it's harder to change: locking you in by making more work.

Additionally, forking servers all have to provide this type of hook anyway (Unicorn, Resque, Puma, to name a few) but today they have to specify their own API.  I think it would be great if we had a "Rack for fork hooks", if that makes sense.  :)


----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73086

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88067] Re: [Feature #5446] identify execution context to avoid inadvertant sharing
  2018-07-23 22:51 ` [ruby-core:88065] " tenderlove
@ 2018-07-24  4:35   ` Eric Wong
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Wong @ 2018-07-24  4:35 UTC (permalink / raw
  To: ruby-core

```
tenderlove@ruby-lang.org wrote:
> normalperson (Eric Wong) wrote:
> > eregontp@gmail.com wrote:
> >  > I want to help protect future programmers from such bugs, if at all possible.
> >  > And I believe it's possible.
> >  > 
> >  > >  I agree with Jeremy on this; it will likely cause new problems
> >  > >  and surprises if libraries use it.
> >  > 
> >  > Let's design it so it doesn't.
> >  > What's the harm/surprise to reconnect in at_fork(:after_child) for instance?
> >  
> >  It's a waste of time and resources when the child has no need for
> >  a connection at all.  Simply put, library authors have no clue
> >  how an application will use/manage processes and would
> >  (rightfully) not bother using such hooks.
> 
> I think library authors can make things easier though.  Web
> frameworks, like Rails for example, are expected to handle
> this situation for the user.  In addition, say a library
> author provided no such feature like Sequel, how would a user
> know they need to call `DB.disconnect` after a fork?  Are they
> responsible for completely understanding the implementation of
> the library they are using?  Even if an end user called
> `DB.disconnect` in an after fork hook, what if that wasn't
> enough?  How would an end user know what needs to be called?

I don't know how to deal with unknowledgeable users aside from
making them knowledgeable about the tools they use.

However, the issue also relates to allowing things to become
thread-safe and fiber-safe at the moment, and in the future:
Guild-safe.

So maybe ko1 can give his thoughts on this topic since he
is working on Guilds

> >  Also, consider that pthread_atfork has been around for many
> >  years, it's not adopted by library authors (of C/C++ libraries)
> >  because of problems surrounding it; and POSIX is even
> >  considering deprecating pthread_atfork[1].
> >  
> >  How about an alternate proposal?
> >  
> >  	Introduce a new object_id-like identifier which changes
> >  	across fork: Thread.current.thread_id
> >  
> >  It doesn't penalize platforms without fork, and can work well
> >  with existing thread-aware code.
> 
> I think this is a good idea, but I'm not sure it addresses the
> communication issue I brought up.  IMO it would be great to
> have some sort of hook so that library authors can dictate
> what "the right thing to do" is after a fork (maybe there are
> other resources or caches that need to be cleaned, and maybe
> that changes from version to version).

Library authors also do not know if the child process will touch
their objects/code; so making such assumptions can be costly
when child processes with different goals are used.

I think the only thing low-level library authors (e.g. authors
of libpq/sqlite3/libcurl/etc...) can do is document how to deal
with such issues when interacting with threads and processes.

Maybe Sequel and AR can have add optional PID checks
($$ != expected_pid), which is totally generic and usable
in bunch of cases.

One thing I have been annoyed by is END/at_exit hooks firing
unexpectedly in child processes.  I work around them with the
PID check.  So I want to avoid having the same annoyance from
at_exit by avoiding atfork hooks.

> >  IMHO, Thread.current.object_id being stable in forked child
> >  isn't good; but I expect compatibility problems if we change it
> >  at this point.  At least some usages of monitor.rb would
> >  break.
> >  
> >  > The current hooks are webserver-specific and so migrating
> >  > between unicorn/puma/passenger/etc means it's quite easy to
> >  > forget to adapt to the new webserver hook, which would trigger
> >  > this bug.
> >  
> >  I hate the amount of vendor lock-in each webserver has.
> >  But making hooks which library authors can fire unpredictably
> >  on application authors is worse, especially if there's no
> >  "opt-out".
> 
> I think requiring users to specify a db disconnect after fork
> causes even more "vendor lock-in".  Lets say I did add the
> after fork code to deal with Sequel, but now I want to switch
> to a threaded webserver.  Now I have to do more work to figure
> out what's required (if anything) in a threaded environment.
> It puts the onus on the app dev to figure out what's right for
> a particular environment, and that means it's harder to
> change: locking you in by making more work.

AFAIK, Sequel and AR already provides out-of-the-box thread-safe
behavior.  Perhaps Sequel and AR should also have out-of-the-box
thread-AND-fork-safe connection pool which checks the PID.  I
don't expect PID check cost to be expensive, but they can make
that an option for users and it'll work on all forking servers
and maybe Guild servers in the future.

People who don't use forking can use old thread-safe-only or
single-threaded-only code.

The key thing I want to avoid from having atfork hooks is:

        DB.disconnect
        pid = fork do
          # something which never touches DB
          exit!(0) # avoid at_exit hook
        end
        DB.reconnect

It's needlessly expensive in the parent process to pay the
disconnect/reconnect cost when forking a child which never
touches the DB.  And there's already a gotcha for avoiding
at_exit hooks.

> Additionally, forking servers all have to provide this type of
> hook anyway (Unicorn, Resque, Puma, to name a few) but today
> they have to specify their own API.  I think it would be great
> if we had a "Rack for fork hooks", if that makes sense.  :)

I don't think limiting this to fork is necessary; forking may
become obsolete with Guilds.
```

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88068] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (17 preceding siblings ...)
  2018-07-23 22:51 ` [ruby-core:88065] " tenderlove
@ 2018-07-24  4:41 ` merch-redmine
  2018-07-24  4:57 ` [ruby-core:88071] " merch-redmine
  19 siblings, 0 replies; 30+ messages in thread
From: merch-redmine @ 2018-07-24  4:41 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by jeremyevans0 (Jeremy Evans).


tenderlovemaking (Aaron Patterson) wrote:
> I think library authors can make things easier though.  Web frameworks, like Rails for example, are expected to handle this situation for the user.  In addition, say a library author provided no such feature like Sequel, how would a user know they need to call `DB.disconnect` after a fork?  Are they responsible for completely understanding the implementation of the library they are using?  Even if an end user called `DB.disconnect` in an after fork hook, what if that wasn't enough?  How would an end user know what needs to be called?

Preloading before forking is not the default in unicorn or puma, and the documentation for both unicorn and puma mention the related issues in the documentation that describes how to enable preloading before forking.  Additionally, Sequel's documentation mentions the need to disconnect when using preloading before forking.  Users should not be expected to understand the implementation of the libraries they are using, but they should be expected to read the documentation for non-default configuration options they explicitly enable.

Now, passenger enables preloading before forking by default.  Some people may consider that an issue with passenger or others may be OK with them just optimizing for the case where only ActiveRecord is used without any other libraries that would allocate file descriptors.  To be fair, passenger's documentation does discuss the issue in detail.

FWIW, disconnecting (at least with Sequel) should be done before forking, not after forking.  After forking you are already sharing the file descriptors.  As long as the library can reestablish connections as needed (as Sequel can), a single line of code to disconnect before fork is globally much simpler than extensive code that tries (and fails) to handle all possible cases when fork can be used.

I will posit that any attempt to disconnect automatically during forking (either in the parent or child) can break a reasonable (if less common) setup where forking is used correctly without explicit disconnection.

> >  How about an alternate proposal?
> >  
> >  	Introduce a new object_id-like identifier which changes
> >  	across fork: Thread.current.thread_id
> >  
> >  It doesn't penalize platforms without fork, and can work well
> >  with existing thread-aware code.
> 
> I think this is a good idea, but I'm not sure it addresses the communication issue I brought up.  IMO it would be great to have some sort of hook so that library authors can dictate what "the right thing to do" is after a fork (maybe there are other resources or caches that need to be cleaned, and maybe that changes from version to version).

At least for Sequel, I don't think this will matter.  I'm neither for nor against this.

> Additionally, forking servers all have to provide this type of hook anyway (Unicorn, Resque, Puma, to name a few) but today they have to specify their own API.  I think it would be great if we had a "Rack for fork hooks", if that makes sense.  :)

I think the main problem here is that libraries may offer `before_fork` or `after_fork` hooks that are called with arguments that a in-core `at_fork` hook couldn't use.  I know that is the case with unicorn.

Additionally, there is the issue with in-core `at_fork` hooks in terms of inheritance:

~~~ ruby
  fork do # calls at_fork hooks
    fork do # calls at_fork hooks?
    end
  end
~~~

There are arguments for inheriting the hooks, and there are arguments for not doing so.  I suppose you could make it configurable, but that increases complexity.  When fork hooks are implemented in a library, they apply to only the case where the library forks, and not all cases where `fork` is called.

Consider that you are using a preforking webserver, and in your parent process, you do:

~~~ ruby
r, w = IO.pipe
if fork
  # ...
else
  # ...
  exit!
end  
~~~

If an in-core `at_fork` handler is used, the webserver's forking hooks may be called for this fork, when that may not be desired.  That is not the case for library-specific fork hooks that wrap the library's usage of `fork`.

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73088

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [ruby-core:88071] [Ruby trunk Feature#5446] at_fork callback API
       [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
                   ` (18 preceding siblings ...)
  2018-07-24  4:41 ` [ruby-core:88068] [Ruby trunk Feature#5446] at_fork callback API merch-redmine
@ 2018-07-24  4:57 ` merch-redmine
  19 siblings, 0 replies; 30+ messages in thread
From: merch-redmine @ 2018-07-24  4:57 UTC (permalink / raw
  To: ruby-core

Issue #5446 has been updated by jeremyevans0 (Jeremy Evans).


normalperson (Eric Wong) wrote:
>  Maybe Sequel and AR can have add optional PID checks
>  ($$ != expected_pid), which is totally generic and usable
>  in bunch of cases.

My problem with doing this by default is that it penalizes knowledgeable users just to avoid problems for users that don't read the documentation.  Some libraries may consider that an acceptable tradeoff, but I do not.  I guess it depends at least partly on the expected audience for the library.

I'd be OK with this as an optional behavior, but in general anyone who would use it would be better off just disconnecting before forking, so I haven't implemented it yet.

>  AFAIK, Sequel and AR already provides out-of-the-box thread-safe
>  behavior.  Perhaps Sequel and AR should also have out-of-the-box
>  thread-AND-fork-safe connection pool which checks the PID.  I
>  don't expect PID check cost to be expensive, but they can make
>  that an option for users and it'll work on all forking servers
>  and maybe Guild servers in the future.

I'm guessing that Guild servers will operate differently than forking servers (not having different PIDs for different Guilds), but ko1 would know more about that.

>  I don't think limiting this to fork is necessary; forking may
>  become obsolete with Guilds.

I do not think that will be the case.  For one thing, you are always going to want a forking approach if you want a parent process that forks children and reaps them when they crash/exit.  This is especially true when the child process uses different permissions than the parent via `setuid`.  It could definitely be the case that a single-process multi-Guild approach will become more popular than current forking multi-process servers.

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73091

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-07-24  4:57 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <redmine.issue-5446.20111014111058@ruby-lang.org>
2012-03-27 13:45 ` [ruby-core:43732] [ruby-trunk - Feature #5446][Feedback] at_fork callback API mame (Yusuke Endoh)
2012-04-09  4:34   ` [ruby-core:44209] " Eric Wong
2013-11-05 21:48 ` [ruby-core:58175] [ruby-trunk - Feature #5446] " sam.saffron (Sam Saffron)
2013-11-06  1:38 ` [ruby-core:58195] " jasonrclark (Jason Clark)
2013-11-30 21:53 ` [ruby-core:58717] " tmm1 (Aman Gupta)
2013-11-30 21:54 ` [ruby-core:58719] " tmm1 (Aman Gupta)
2013-11-30 23:05   ` [ruby-core:58722] " Eric Wong
2013-12-06 21:31     ` [ruby-core:58922] " Eric Wong
2013-12-07  4:41       ` [ruby-core:58934] " Tanaka Akira
2013-12-08  6:49         ` [ruby-core:58961] " Eric Wong
2013-12-05 22:37   ` [ruby-core:58882] " KOSAKI Motohiro
2013-11-30 22:12 ` [ruby-core:58721] " Eregon (Benoit Daloze)
2013-12-01  0:25 ` [ruby-core:58725] " tmm1 (Aman Gupta)
2013-12-06  6:37 ` [ruby-core:58898] " tmm1 (Aman Gupta)
2013-12-06 22:29 ` [ruby-core:58925] " tmm1 (Aman Gupta)
2013-12-07  4:02 ` [ruby-core:58933] " tmm1 (Aman Gupta)
2016-11-02  2:46 ` [ruby-core:77860] [Ruby trunk Feature#5446] " sam.saffron
2016-11-05 17:06 ` [ruby-core:78004] " email
2016-12-21 15:08 ` [ruby-core:78786] " shyouhei
2016-12-21 17:42 ` [ruby-core:78792] " eregontp
2018-07-19  6:29 ` [ruby-core:88003] " ashe
2018-07-19 17:37   ` [ruby-core:88008] " Eric Wong
2018-07-19 20:16 ` [ruby-core:88011] " eregontp
2018-07-20 10:14   ` [ruby-core:88015] " Eric Wong
2018-07-20 12:06 ` [ruby-core:88017] " eregontp
2018-07-20 20:43   ` [ruby-core:88034] " Eric Wong
2018-07-23 22:51 ` [ruby-core:88065] " tenderlove
2018-07-24  4:35   ` [ruby-core:88067] Re: [Feature #5446] identify execution context to avoid inadvertant sharing Eric Wong
2018-07-24  4:41 ` [ruby-core:88068] [Ruby trunk Feature#5446] at_fork callback API merch-redmine
2018-07-24  4:57 ` [ruby-core:88071] " merch-redmine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).