ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
@ 2020-09-16 18:39 tenderlove
  2020-09-16 21:08 ` [ruby-core:100026] " eregontp
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: tenderlove @ 2020-09-16 18:39 UTC
  To: ruby-core

Issue #17176 has been reported by tenderlovemaking (Aaron Patterson).

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------
Hi,

I'd like to make compaction automatic eventually.  As a first step, I would like to introduce two functions:

* GC.enable_autocompact
* GC.disable_autocompact

One function enables auto compaction, the other one disables it.  Automatic compaction is *disabled* by default.  When it is enabled it will happen only on every major GC.

I've made a pull request here: https://github.com/ruby/ruby/pull/3547

This patch makes _object movement_ happen at the same time as page sweep.  When one page finishes sweeping, that page is filled.

## Sweep + Move Phase

During sweep, we keep a pointer to the current sweeping page.  This pointer is kept in [`heap->sweeping_page`](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4817).  At the beginning of sweep, this is the *first* element of the heap's linked list.

At the same time, the compaction process points at the *last* page in the heap, and that is stored in `heap->compact_cursor` [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L5023).

Incremental sweeping sweeps one page at a time in the [`gc_page_sweep` function](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4624).  At the end of that function, we call [`gc_fill_swept_page`](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4738-L4742).  `gc_fill_swept_page` fills the page that was just swept and moves the movement cursor towards the sweeping cursor.

When the sweeping cursor and the movement cursor meet, sweeping is paused and references are updated.  This can happen in two ways: the sweeping cursor runs into the moving cursor, which is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4634-L4644), or the moving cursor runs into the sweeping cursor, which happens [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4425-L4430).

Either way, the sweep step is paused and references are updated.
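
To make the two-cursor idea concrete, here is a small simulation in plain Ruby. It is only a sketch under stated assumptions, not the code in `gc.c`: a page is modelled as an Array of slots, a live object is a Symbol, garbage is `:dead`, a free slot is `nil`, and every method name (`sweep_page`, `fill_swept_page`, `sweep_and_compact`, `live?`) is made up for illustration.

```ruby
# Toy model only: a page is an Array of slots, a live object is a Symbol,
# garbage is :dead and a free slot is nil. None of these names come from gc.c.

def live?(slot)
  !slot.nil? && slot != :dead
end

# Sweep one page: garbage slots become free slots.
def sweep_page(page)
  page.map! { |slot| slot == :dead ? nil : slot }
end

# Fill the free slots of the page that was just swept with live objects taken
# from the page under the compact cursor. Returns the new compact cursor.
def fill_swept_page(heap, sweep, compact, moves)
  dest = heap[sweep]
  dest.each_index do |i|
    next unless dest[i].nil?
    # Walk the compact cursor towards the sweep cursor until it finds a page
    # that still holds a live object.
    compact -= 1 while compact > sweep && heap[compact].none? { |s| live?(s) }
    return compact if compact <= sweep # the moving cursor ran into the sweep cursor
    src = heap[compact]
    j = src.rindex { |s| live?(s) }
    dest[i] = src[j]
    moves[src[j]] = [sweep, i] # forwarding information for reference updating
    src[j] = nil
  end
  compact
end

def sweep_and_compact(heap)
  sweep = 0
  compact = heap.size - 1
  moves = {}
  while sweep < compact # stops once the sweep cursor runs into the moving cursor
    sweep_page(heap[sweep])
    compact = fill_swept_page(heap, sweep, compact, moves)
    sweep += 1
  end
  # The cursors have met. The real GC would update references here using the
  # forwarding information, then resume sweeping the remaining pages.
  heap[sweep..].each { |page| sweep_page(page) }
  moves
end

heap = [
  [:a, :dead, :b, :dead],
  [:dead, :c, :dead, :dead],
  [:d, :dead, :e, :dead],
  [:f, :g, :dead, :dead],
]
moves = sweep_and_compact(heap)
```

After the call, the live objects are packed into the first two pages, the last two pages are entirely free, and `moves` maps each relocated object to its new `[page, slot]` position.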

## Reference Updating

Reference updating hasn't changed, but since reference updating happens before the GC finishes a cycle, it must take into account garbage objects [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L8971-L8977).

## Read Barrier

During the sweep phase, some objects may touch other objects.  For example, `T_CLASS` [must remove itself from a parent class](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L2769-L2770).

```ruby
class A; end
class B < A; end

# Drop the only reference to class B (Ruby warns about the reassignment).
Object.const_set(:B, nil)
```

When `B` is freed, it must remove itself from `A`'s subclasses.  But what if `A` moved?  To fix this, I've introduced a read barrier.  The read barrier protects `heap_page_body` using `mprotect`.  If something tries to read from the page, an exception will occur and we can move all objects back to the page (invalidate the movement).

The lock function is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4321-L4335).
The unlock function is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4337-L4351).

It uses `sigaction` to catch the exception [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4514-L4530).
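
The invalidation idea can be sketched in plain Ruby. This is a toy under loud assumptions: the `mprotect` page protection is replaced by a boolean flag, the fault signal by a Ruby exception, and `ToyPage`, `PageFault` and `read_with_barrier` are hypothetical names that do not exist in the patch.

```ruby
# Toy read barrier: a locked page raises on any read, and the "fault handler"
# undoes the object movement before the read is retried.
class PageFault < StandardError; end

class ToyPage
  attr_reader :slots

  def initialize(slots)
    @slots = slots
    @locked = false
  end

  def lock!
    @locked = true
  end

  def unlock!
    @locked = false
  end

  def read(index)
    raise PageFault if @locked # stands in for the fault mprotect would cause
    @slots[index]
  end
end

# moves is a list of [from_index, destination_page, destination_index].
def read_with_barrier(page, index, moves, stats)
  page.read(index)
rescue PageFault
  stats[:read_barrier_faults] += 1
  page.unlock!
  # Invalidate the movement: put every moved object back where it came from.
  moves.each do |from, dest, to|
    page.slots[from] = dest.slots[to]
    dest.slots[to] = nil
  end
  moves.clear
  retry # the page is unlocked now, so the read succeeds
end

source = ToyPage.new([nil, :other])   # :A used to live in slot 0
target = ToyPage.new([:x, :A])        # ... and was moved to slot 1 here
moves  = [[0, target, 1]]
stats  = { read_barrier_faults: 0 }

source.lock!
value = read_with_barrier(source, 0, moves, stats)
```

Here the read of slot 0 faults once, `:A` is moved back to `source`, and the retried read returns it.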

## Cross Platform

`mprotect` and `sigaction` are not cross platform; they don't work on Windows.  On Windows the read barrier uses exception handlers that are built into Windows.  I implemented them [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4471-L4503).

The read barrier seems to work on all platforms we're testing.

## Statistics

`GC.stat(:compact_count)` contains the number of times compaction has happened, so we can write things like this:

```ruby
GC.enable_autocompact

cc = GC.stat(:compact_count)
list = []
loop do
  500.times { list << Object.new }
  break if cc < GC.stat(:compact_count)
end

p GC.stat(:compact_count)
```

We can check when the read barrier is triggered with `GC.stat(:read_barrier_faults)`.
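
As a sketch of how these counters might be used, the helper below reports how much each one changed while a block ran. `gc_stat_delta` is a hypothetical name, and the code assumes only that `GC.stat` returns a Hash of integer counters; keys that a given build does not provide are treated as 0.

```ruby
# Report how much the given GC.stat counters changed while the block ran.
# Counters missing from this Ruby build are treated as 0.
def gc_stat_delta(*keys)
  before = GC.stat
  yield
  after = GC.stat
  keys.to_h { |key| [key, after.fetch(key, 0) - before.fetch(key, 0)] }
end

delta = gc_stat_delta(:count, :compact_count, :read_barrier_faults) do
  100_000.times { Object.new }
end
p delta
```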

I've also added `GC.latest_compact_info` so you can see what types of objects moved and how many.  For example:

```
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$ cat test.rb
list = []
500.times {
  list << Object.new
  Object.new
  Object.new
}

GC.enable_autocompact
count = GC.stat :compact_count
loop do
  list << Object.new
  break if GC.stat(:compact_count) > count
end

p GC.latest_compact_info
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$ make runruby
./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems ./test.rb
{:considered=>{:T_OBJECT=>408}, :moved=>{:T_OBJECT=>408}}
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$
```
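
A small helper can turn a hash of that shape into a per-type ratio of moved to considered objects. This is a sketch: `moved_ratio` is a hypothetical name, and the input is the hash printed in the session above rather than a live call.

```ruby
# Ratio of moved to considered objects for each object type.
def moved_ratio(info)
  info[:considered].to_h do |type, considered|
    moved = info[:moved].fetch(type, 0)
    [type, considered.zero? ? 0.0 : moved.fdiv(considered)]
  end
end

# The hash printed by the test.rb run above.
info = { considered: { T_OBJECT: 408 }, moved: { T_OBJECT: 408 } }
ratios = moved_ratio(info)
```

For the run shown above every considered `T_OBJECT` was moved, so the ratio for that type is 1.0.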

## Recap

New methods:

* GC.enable_autocompact
* GC.disable_autocompact
* GC.latest_compact_info

New statistics in `GC.stat`:

* GC.stat(:read_barrier_faults)

Diff is here: https://github.com/ruby/ruby/pull/3547



-- 
https://bugs.ruby-lang.org/


* [ruby-core:100026] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
@ 2020-09-16 21:08 ` eregontp
  2020-09-16 21:15 ` [ruby-core:100028] " tenderlove
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: eregontp @ 2020-09-16 21:08 UTC
  To: ruby-core

Issue #17176 has been updated by Eregon (Benoit Daloze).


API-wise, `GC.autocompact = true/false` as suggested by @ioquatix on the PR, or `GC.auto_compact = true/false`, sounds nicer to me than 2 separate methods.

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87577

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------



-- 
https://bugs.ruby-lang.org/


* [ruby-core:100028] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
  2020-09-16 21:08 ` [ruby-core:100026] " eregontp
@ 2020-09-16 21:15 ` tenderlove
  2020-09-18  7:54 ` [ruby-core:100031] " ko1
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: tenderlove @ 2020-09-16 21:15 UTC
  To: ruby-core

Issue #17176 has been updated by tenderlovemaking (Aaron Patterson).


Eregon (Benoit Daloze) wrote in #note-1:
> API-wise, `GC.autocompact = true/false` as suggested by @ioquatix on the PR, or `GC.auto_compact = true/false`, sounds nicer to me than 2 separate methods.

I mainly went with `GC.(enable|disable)_autocompact` because we have `GC.enable` and `GC.disable`, but I like @ioquatix's suggestion too. (Also a `GC.auto_compact`.)

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87579

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------



-- 
https://bugs.ruby-lang.org/


* [ruby-core:100031] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
  2020-09-16 21:08 ` [ruby-core:100026] " eregontp
  2020-09-16 21:15 ` [ruby-core:100028] " tenderlove
@ 2020-09-18  7:54 ` ko1
  2020-09-18  7:55 ` [ruby-core:100032] " ko1
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ko1 @ 2020-09-18  7:54 UTC
  To: ruby-core

Issue #17176 has been updated by ko1 (Koichi Sasada).


Another idea is `GC.enable(compact: true/false)`.

BTW, as I mentioned in Slack, `GC.compact` may have an issue (T_MOVED objects can be accessed from the Ruby level), so please fix it before merging this large commit.

(and I need to learn about this code...)

I can repeat tests with this option. Let me know if you want.


----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87582

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------



-- 
https://bugs.ruby-lang.org/


* [ruby-core:100032] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (2 preceding siblings ...)
  2020-09-18  7:54 ` [ruby-core:100031] " ko1
@ 2020-09-18  7:55 ` ko1
  2020-10-01 20:50 ` [ruby-core:100261] " tenderlove
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ko1 @ 2020-09-18  7:55 UTC
  To: ruby-core

Issue #17176 has been updated by ko1 (Koichi Sasada).


Also do you have any performance results, for example memory consumption or GC time and so on?

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87583

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------



-- 
https://bugs.ruby-lang.org/


* [ruby-core:100261] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (3 preceding siblings ...)
  2020-09-18  7:55 ` [ruby-core:100032] " ko1
@ 2020-10-01 20:50 ` tenderlove
  2020-10-01 21:16 ` [ruby-core:100263] " eregontp
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: tenderlove @ 2020-10-01 20:50 UTC
  To: ruby-core

Issue #17176 has been updated by tenderlovemaking (Aaron Patterson).


ko1 (Koichi Sasada) wrote in #note-4:
> BTW, as I mentioned in Slack, `GC.compact` may have an issue (T_MOVED objects can be accessed from the Ruby level), so please fix it before merging this large commit.

I think it's a use-after-free bug.  I believe it was fixed here: https://github.com/ruby/ruby/pull/3571

I was able to reproduce the crash locally, but after applying that pull request the crash goes away.

> Another idea is GC.enable(compact: true/false).

I like this better than `GC.auto_compact = true/false` because we can add more options.  I'll change to use that.
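
For comparison, the three API shapes discussed in this thread can be put side by side on a stand-in module. This is purely illustrative: `ToyGC` is hypothetical, it only flips a flag, and nothing here touches the real `GC` module or reflects its return values.

```ruby
# Hypothetical stand-in for the GC module; it only records a flag.
module ToyGC
  @auto_compact = false

  class << self
    # Shape 1: a pair of methods, mirroring GC.enable / GC.disable.
    def enable_autocompact
      @auto_compact = true
    end

    def disable_autocompact
      @auto_compact = false
    end

    # Shape 2: attribute style, ToyGC.auto_compact = true / false.
    attr_accessor :auto_compact

    # Shape 3: a keyword option on enable; more keywords can be added later
    # without introducing new methods.
    def enable(compact: @auto_compact)
      @auto_compact = compact
      nil
    end
  end
end

ToyGC.enable_autocompact      # shape 1
ToyGC.auto_compact = false    # shape 2
ToyGC.enable(compact: true)   # shape 3
```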

> Also do you have any performance results, for example memory consumption or GC time and so on?

I did a survey using RDoc.  The summary: minor GC time is the same and major GC is much slower, but I still think we should add this as an experimental feature.

Here are more details of the tests I did:

To benchmark compaction, I enabled `GC::Profiler` and then generated RDoc for Ruby.

I created a file `x.rb` that looks like this:

```ruby
BEGIN {
  # Runs before the main program: record the start time and turn on GC profiling.
  $start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # Uncomment for automatic compaction
  #GC.enable_autocompact
  GC::Profiler.enable
}
END {
  # Runs at exit: write one CSV row per profiled GC run to stderr.
  profile = GC::Profiler.raw_data.dup
  require "csv"
  CSV($stderr) do |csv|
    syms = %i[HEAP_USE_SIZE HEAP_TOTAL_SIZE HEAP_TOTAL_OBJECTS MOVED_OBJECTS]
    csv << (["major_by", "GC_TIME"] + syms.map(&:to_s))
    profile.each do |record|
      # major_by is nil for minor GCs; GC_TIME is scaled by 100 before it is logged.
      csv << ([record[:GC_FLAGS][:major_by], record[:GC_TIME] * 100] + syms.map { |s| record[s] })
    end
  end
  # Two blank lines separate the CSV rows from the summary statistics.
  $stderr.puts
  $stderr.puts
  stats = %i[ heap_allocated_pages heap_eden_pages read_barrier_faults minor_gc_count major_gc_count ]
  stats.each { |s|
    $stderr.puts "#{s}: #{GC.stat(s)}"
  }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - $start
  $stderr.puts "Elapsed: #{elapsed}"
}
```

Then I generated RDoc like this:

```
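# Assumed wrapper: the original command used $i, so it presumably ran inside a loop like this one.
mkdir -p logs
for i in $(seq 1 100); do
  rm -rf .ext/rdoc
  ./ruby --disable-gems -I. -r x "./libexec/rdoc" --root "." --encoding=UTF-8 --all --ri --op ".ext/rdoc" --page-dir "./doc" --no-force-update  "." 2>logs/no_autocompact_$i.log
done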
```

I ran this 100 times with automatic compaction enabled and 100 times with it disabled.

This patch only adds automatic compaction to major GCs, so I don't expect minor GC times to change.
Here is a plot of the minor GC time:

![Minor GC Time](https://user-images.githubusercontent.com/3124/94854245-775c6180-03e1-11eb-8dcc-9c56988678c7.png)

X axis is the GC invoke count, Y axis is time in seconds.  As expected, there is really no change.

Here is a plot of the major GC time:

![Major GC Time](https://user-images.githubusercontent.com/3124/94854341-9e1a9800-03e1-11eb-9666-796ab8e0708c.png)

X axis is the GC invoke count, Y axis is time in seconds.  Adding compaction makes the major invocations significantly slower.  The algorithm for doing a major GC doesn't change, so the difference in these lines is completely due to the time it takes to compact the heap.

If we divide heap size by time to get "Objects Per Second", then we can get an idea of GC throughput:

![Major GC Throughput](https://user-images.githubusercontent.com/3124/94855783-e20e9c80-03e3-11eb-9d2e-cd79b452b044.png)

It's pretty clear that the throughput is lower with compaction enabled.
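For anyone who wants to redo the throughput calculation from the logs, here is a minimal sketch.  It assumes the column layout that `x.rb` writes and that the CSV section ends at the first blank line.  The helper name `major_gc_throughput` and the sample rows are invented for illustration; they are not measurements from this thread.  Since `x.rb` scales `GC_TIME` by 100 before logging, the ratio comes out in objects per logged time unit rather than strictly per second.

```ruby
# Placeholder rows in the layout x.rb writes; the numbers are made up.
SAMPLE_LOG = <<~LOG
  major_by,GC_TIME,HEAP_USE_SIZE,HEAP_TOTAL_SIZE,HEAP_TOTAL_OBJECTS,MOVED_OBJECTS
  ,0.5,1000,2000,50,0
  nofree,2.0,1500,4000,100,40
  oldgen,4.0,3000,8000,200,90
LOG

# Returns HEAP_TOTAL_OBJECTS / GC_TIME for every major GC row.
def major_gc_throughput(log_text)
  # Only the lines before the first blank line are CSV; the stats follow it.
  header, *rows = log_text.lines(chomp: true).take_while { |line| !line.empty? }
  columns = header.split(",")
  rows.filter_map do |line|
    record = columns.zip(line.split(",", -1)).to_h
    next if record["major_by"].empty?   # empty major_by means a minor GC
    time = Float(record["GC_TIME"])
    next if time.zero?                  # avoid dividing by zero
    Float(record["HEAP_TOTAL_OBJECTS"]) / time
  end
end

p major_gc_throughput(SAMPLE_LOG)  # => [50.0, 50.0]
```

Running the same helper over the "enabled" and "disabled" logs and comparing the two lists is one way to get at the comparison plotted above.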

One interesting thing I've found: if we plot "heap total objects" against GC time, we get a graph like this:

![Major GC vs Compact Speed](https://user-images.githubusercontent.com/3124/94860534-0fab1400-03eb-11eb-8b46-e29cb1cef9f9.png)

Whenever the heap size remains stable, GC compaction gets significantly faster.  In other words, when the blue line stays level, the red line goes down.

My hypothesis is that this is due to the way the GC adds new pages when the heap expands.  When the heap expands, the GC adds pages on the left side of the heap.  So the top of "heap_eden" always points at the newest page.  The compaction algorithm packs to the left, so as we add new pages, more objects move (even though they didn't need to).
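To make the hypothesis concrete, here is a toy model of a "pack to the left" compactor.  It is only a sketch: the heap is a flat Ruby array of slots (`nil` means free), and `compact!` is a hypothetical two-cursor routine, not the code in `gc.c`.  It shows that adding empty space on the left of an already packed heap forces every object to move, while adding it on the right moves nothing.

```ruby
# Toy two-cursor compaction: fill free slots on the left with live objects
# taken from the right.  Returns the number of objects that moved.
def compact!(heap)
  free = 0                # scans right, looking for a free slot
  scan = heap.length - 1  # scans left, looking for a live object
  moved = 0
  loop do
    free += 1 while free < scan && heap[free]
    scan -= 1 while scan > free && heap[scan].nil?
    break if free >= scan  # cursors met: nothing left to move
    heap[free] = heap[scan]
    heap[scan] = nil
    moved += 1
  end
  moved
end

packed     = [1, 2, 3]         # already compact
empty_page = [nil, nil, nil]   # a freshly added page

p compact!(packed.dup)           # => 0
p compact!(empty_page + packed)  # => 3 (new page on the left: everything moves)
p compact!(packed + empty_page)  # => 0 (new page on the right: nothing moves)
```

If the real heap behaves like this model, growing the heap at the opposite end from where the compactor packs would avoid most of the unnecessary movement, but that is an inference from the toy and has not been measured here.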

Anyway, I think that introducing this feature has value because it will help people test with more aggressive object movement.  I would like auto compaction to be enabled by default some day, but I think for now we can add this as an experimental feature.

Once the speed is acceptable and it seems like bugs are worked out, then we can enable it by default.


----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87849

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:100263] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (4 preceding siblings ...)
  2020-10-01 20:50 ` [ruby-core:100261] " tenderlove
@ 2020-10-01 21:16 ` eregontp
  2020-10-01 21:21 ` [ruby-core:100264] " jean.boussier
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: eregontp @ 2020-10-01 21:16 UTC (permalink / raw)
  To: ruby-core

Issue #17176 has been updated by Eregon (Benoit Daloze).


tenderlovemaking (Aaron Patterson) wrote in #note-5:
> ko1 (Koichi Sasada) wrote in #note-4:
> > Another idea is GC.enable(compact: true/false).
> 
> I like this better than `GC.auto_compact = true/false` because we can add more options.  I'll change to use that.

One potential issue is that this pattern `GC.disable; begin; ...; ensure; GC.enable; end` will unintentionally lose the `compact` flag.
I'm not sure it matters too much because (I hope) this pattern is rare.

Another downside is that it makes it more complicated to know whether auto-compaction can be set: you need to check arity, and that won't work once a second flag is added, whereas a setter can be detected with `GC.respond_to?(:auto_compact=)`.
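A small sketch of the first concern, using a stand-in module so it runs anywhere.  `ToyGC` is hypothetical and only mimics the proposed `GC.enable(compact: true/false)` shape; it assumes that calling `enable` without the keyword falls back to `compact: false`, which is one plausible design but was never specified in this thread.

```ruby
# Hypothetical stand-in for the proposed GC.enable(compact:) API.
module ToyGC
  @enabled = true
  @compact = false

  class << self
    attr_reader :enabled, :compact

    # Assumption: omitting the keyword resets the compact flag.
    def enable(compact: false)
      @enabled = true
      @compact = compact
    end

    def disable
      @enabled = false
    end
  end
end

ToyGC.enable(compact: true)  # application start-up turns compaction on

# Existing code written before the flag existed:
ToyGC.disable
begin
  # ... allocation-heavy work ...
ensure
  ToyGC.enable               # re-enables GC, but drops compact: true
end

p ToyGC.compact  # => false
```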

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87850

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:100264] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (5 preceding siblings ...)
  2020-10-01 21:16 ` [ruby-core:100263] " eregontp
@ 2020-10-01 21:21 ` jean.boussier
  2020-10-01 21:37 ` [ruby-core:100265] " tenderlove
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jean.boussier @ 2020-10-01 21:21 UTC (permalink / raw)
  To: ruby-core

Issue #17176 has been updated by byroot (Jean Boussier).


A third downside is that it makes it complicated to ensure compaction is disabled in a block. e.g. the common pattern of:

```
def with_state
  previous = Some.state
  Some.state = false
  yield
ensure
  Some.state = previous
end
```

For instance, I can see myself needing to disable compaction in https://github.com/Shopify/heap-profiler, so it would be nice to have a simple way to restore it to its previous state.

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87851

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:100265] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (6 preceding siblings ...)
  2020-10-01 21:21 ` [ruby-core:100264] " jean.boussier
@ 2020-10-01 21:37 ` tenderlove
  2020-10-20 19:39 ` [ruby-core:100449] [Ruby master Feature#17176] GC.auto_compact / GC.auto_compact=(flag) tenderlove
  2020-10-26  6:40 ` [ruby-core:100557] " matz
  9 siblings, 0 replies; 11+ messages in thread
From: tenderlove @ 2020-10-01 21:37 UTC (permalink / raw)
  To: ruby-core

Issue #17176 has been updated by tenderlovemaking (Aaron Patterson).


byroot (Jean Boussier) wrote in #note-7:
> A third downside is that it makes it complicated to ensure compaction is disabled in a block. e.g. the common pattern of:
> 
> ```
> def with_state
>   previous = Some.state
>   Some.state = false
>   yield
> ensure
>   Some.state = previous
> end
> ```
> 
> For instance, I can see myself needing to disable compaction in https://github.com/Shopify/heap-profiler, so it would be nice to have a simple way to restore it to its previous state.

😆

Ok, I'll change it to `GC.auto_compact = true/false` and `GC.auto_compact`
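With a getter and a setter, the save-and-restore pattern from the quoted message becomes straightforward.  A minimal sketch, assuming the accessors end up named `GC.auto_compact` and `GC.auto_compact=` as proposed above; `without_auto_compact` is a hypothetical helper name, and the `respond_to?` guard is there so the block still runs on Rubies or platforms that lack the setter.

```ruby
# Run the block with automatic compaction turned off, then restore
# whatever setting was active before.
def without_auto_compact
  # No setter available: just run the block unchanged.
  return yield unless GC.respond_to?(:auto_compact=)

  previous = GC.auto_compact
  GC.auto_compact = false
  begin
    yield
  ensure
    GC.auto_compact = previous  # restore even if the block raises
  end
end

result = without_auto_compact { Array.new(3) { Object.new }.size }
p result  # => 3
```

Because the previous value is read back rather than assumed, nested calls and callers that never enabled compaction both behave as expected.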

----------------------------------------
Feature #17176: GC.enable_autocompact / GC.disable_autocompact
https://bugs.ruby-lang.org/issues/17176#change-87852

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:100449] [Ruby master Feature#17176] GC.auto_compact / GC.auto_compact=(flag)
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (7 preceding siblings ...)
  2020-10-01 21:37 ` [ruby-core:100265] " tenderlove
@ 2020-10-20 19:39 ` tenderlove
  2020-10-26  6:40 ` [ruby-core:100557] " matz
  9 siblings, 0 replies; 11+ messages in thread
From: tenderlove @ 2020-10-20 19:39 UTC (permalink / raw)
  To: ruby-core

Issue #17176 has been updated by tenderlovemaking (Aaron Patterson).


I made another pull request that sets the default to "ON":

  https://github.com/ruby/ruby/pull/3316

It looks like there is one optional ruby spec test that is failing, but all of the other tests pass.  Since the test suites pass with automatic compaction enabled, it makes me pretty confident about the implementation (though I'm sure there are bugs somewhere).

----------------------------------------
Feature #17176: GC.auto_compact / GC.auto_compact=(flag)
https://bugs.ruby-lang.org/issues/17176#change-88068

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------
Hi,

I'd like to make compaction automatic eventually.  As a first step, I would like to introduce two functions:

* GC.enable_autocompact
* GC.disable_autocompact

One function enables auto compaction, the other one disables it.  Automatic compaction is *disabled* by default.  When it is enabled it will happen only on every major GC.

I've made a pull request here: https://github.com/ruby/ruby/pull/3547

This patch makes _object movement_ happen at the same time as page sweep.  When one page finishes sweeping, that page is filled.

## Sweep + Move Phase

During sweep, we keep a pointer to the current sweeping page.  This pointer is kept in [`heap->sweeping_page`](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4817).  At the beginning of sweep, this is the *first* element of the heap's linked list.

At the same time, the compaction process points at the *last* page in the heap, and that is stored in `heap->compact_cursor` [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L5023).

Incremental sweeping sweeps one page at a time in the [`gc_page_sweep` function](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4624).  At the end of that function, we call [`gc_fill_swept_page`](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4738-L4742).  `gc_fill_swept_page` fills the page that was just swept and moves the movement cursor towards the sweeping cursor.

When the sweeping cursor and the movement cursor meet, sweeping is paused, and references are updated.  This can happen in 2 ways, the sweeping cursor "runs in to the moving cursor" which is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4634-L4644).  Or the moving cursor runs in to the sweep cursor which happens [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4425-L4430).

Either way, the sweep step is paused and references are updated.

## Reference Updating

Reference updating hasn't changed, but since reference updating happens before the GC finishes a cycle, it must take in to account garbage objects [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L8971-L8977).

## Read Barrier

During the sweep phase, some objects may touch other objects.  For example, `T_CLASS` [must remove itself from a parent class](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L2769-L2770).

```ruby
class A; end
class B < A; end

Object.send(:remove_const, :B)  # drop the constant so the class B can become garbage
```

When `B` is freed, it must remove itself from `A`'s subclasses.  But what if `A` moved?  To fix this, I've introduced a read barrier.  The read barrier protects `heap_page_body` using `mprotect`.  If something tries to read from the page, an exception will occur and we can move all objects back to the page (invalidate the movement).

The lock function is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4321-L4335).
The unlock function is [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4337-L4351).

It uses `sigaction` to catch the exception [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4514-L4530).
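
The lock / fault / invalidate cycle can be modelled in plain Ruby. This is only a toy: an exception stands in for the fault that `mprotect` raises, and every class and method name (`Page`, `ToyHeap`, `ProtectedPageRead`, `invalidate`) is an assumption for illustration.

```ruby
class ProtectedPageRead < StandardError; end

# A page whose reads fail while it is locked, standing in for mprotect.
class Page
  attr_reader :slots

  def initialize(slots)
    @slots = slots
    @locked = false
  end

  def lock!
    @locked = true
  end

  def unlock!
    @locked = false
  end

  def read(index)
    raise ProtectedPageRead if @locked
    @slots[index]
  end
end

class ToyHeap
  attr_reader :faults

  def initialize(pages)
    @pages = pages
    @moves = []   # [[from_page, from_slot], [to_page, to_slot]]
    @faults = 0
  end

  # Move an object and protect the page it used to live on.
  def move(from, to)
    (fp, fi), (tp, ti) = from, to
    @pages[tp].slots[ti] = @pages[fp].slots[fi]
    @pages[fp].slots[fi] = nil
    @pages[fp].lock!
    @moves << [from, to]
  end

  # A read that hits a protected page undoes the moves out of that page, then retries.
  def read(page_idx, slot_idx)
    @pages[page_idx].read(slot_idx)
  rescue ProtectedPageRead
    @faults += 1
    invalidate(page_idx)
    retry
  end

  private

  # Unlock the page and move every object that left it back to its old slot.
  def invalidate(page_idx)
    @pages[page_idx].unlock!
    undone, @moves = @moves.partition { |from, _to| from[0] == page_idx }
    undone.each do |(fp, fi), (tp, ti)|
      @pages[fp].slots[fi] = @pages[tp].slots[ti]
      @pages[tp].slots[ti] = nil
    end
  end
end

pages = [Page.new([nil, "b"]), Page.new(["A", nil])]
heap = ToyHeap.new(pages)
heap.move([1, 0], [0, 0])  # "A" moves to page 0; page 1 is now protected
p heap.read(1, 0)          # the read faults, the move is undone, and the retry succeeds
p heap.faults
```

The fault counter in this model plays the same role as the `read_barrier_faults` statistic: it counts how often a move had to be invalidated.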

## Cross Platform

`mprotect` and `sigaction` are not cross-platform; they don't work on Windows.  On Windows, the read barrier uses the exception handlers that are built into Windows.  I implemented them [here](https://github.com/ruby/ruby/blob/8a4d8fa0ea463d44486bf2447ea9830593768fd7/gc.c#L4471-L4503).

The read barrier seems to work on all platforms we're testing.

## Statistics

`GC.stat(:compact_count)` contains the number of times compaction has happened, so we can write things like this:

```ruby
GC.enable_autocompact

cc = GC.stat(:compact_count)
list = []
loop do
  500.times { list << Object.new }
  break if cc < GC.stat(:compact_count)
end

p GC.stat(:compact_count)
```

We can check when the read barrier is triggered with `GC.stat(:read_barrier_faults)`.
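
Both counters can be read together in a way that also runs on a build without this patch. This is a sketch: the helper name `compaction_stats` is hypothetical, and it goes through the full `GC.stat` hash because asking `GC.stat` for a key the build doesn't know raises an error, while a hash lookup just returns `nil`.

```ruby
# Statistics named in this thread; a build without the patch won't report them.
COMPACTION_KEYS = %i[compact_count read_barrier_faults].freeze

def compaction_stats
  stats = GC.stat  # full hash of whatever this build reports
  COMPACTION_KEYS.to_h { |key| [key, stats[key]] }
end

p compaction_stats
```

A `nil` value means the running build does not expose that statistic.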

I've also added `GC.latest_compact_info` so you can see what types of objects moved and how many.  For example:

```
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$ cat test.rb
list = []
500.times {
  list << Object.new
  Object.new
  Object.new
}

GC.enable_autocompact
count = GC.stat :compact_count
loop do
  list << Object.new
  break if GC.stat(:compact_count) > count
end

p GC.latest_compact_info
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$ make runruby
./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems ./test.rb
{:considered=>{:T_OBJECT=>408}, :moved=>{:T_OBJECT=>408}}
[aaron@tc-lan-adapter ~/g/ruby (autocompact)]$
```
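
One way to read that hash is as a per-type move ratio. The helper `moved_ratio` below is hypothetical, and the only input shape it assumes is the one printed in the transcript above (`:considered` and `:moved`, each mapping a type to a count).

```ruby
# Fraction of considered objects that were actually moved, per object type.
def moved_ratio(info)
  considered = info.fetch(:considered, {})
  moved = info.fetch(:moved, {})
  considered.to_h do |type, count|
    [type, count.zero? ? 0.0 : moved.fetch(type, 0).fdiv(count)]
  end
end

# The hash printed by the run above.
sample = { considered: { T_OBJECT: 408 }, moved: { T_OBJECT: 408 } }
p moved_ratio(sample)
```

For the sample run every considered `T_OBJECT` was moved, so the ratio for that type is 1.0.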

## Recap

New methods:

* GC.enable_autocompact
* GC.disable_autocompact
* GC.latest_compact_info

New statistics in `GC.stat`:

* GC.stat(:read_barrier_faults)

Diff is here: https://github.com/ruby/ruby/pull/3547



-- 
https://bugs.ruby-lang.org/

* [ruby-core:100557] [Ruby master Feature#17176] GC.auto_compact / GC.auto_compact=(flag)
  2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
                   ` (8 preceding siblings ...)
  2020-10-20 19:39 ` [ruby-core:100449] [Ruby master Feature#17176] GC.auto_compact / GC.auto_compact=(flag) tenderlove
@ 2020-10-26  6:40 ` matz
  9 siblings, 0 replies; 11+ messages in thread
From: matz @ 2020-10-26  6:40 UTC (permalink / raw)
  To: ruby-core

Issue #17176 has been updated by matz (Yukihiro Matsumoto).


Accepted as long as it's turned off by default.

Matz.


----------------------------------------
Feature #17176: GC.auto_compact / GC.auto_compact=(flag)
https://bugs.ruby-lang.org/issues/17176#change-88186

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

end of thread, other threads:[~2020-10-26  6:41 UTC | newest]

Thread overview: 11+ messages
2020-09-16 18:39 [ruby-core:100025] [Ruby master Feature#17176] GC.enable_autocompact / GC.disable_autocompact tenderlove
2020-09-16 21:08 ` [ruby-core:100026] " eregontp
2020-09-16 21:15 ` [ruby-core:100028] " tenderlove
2020-09-18  7:54 ` [ruby-core:100031] " ko1
2020-09-18  7:55 ` [ruby-core:100032] " ko1
2020-10-01 20:50 ` [ruby-core:100261] " tenderlove
2020-10-01 21:16 ` [ruby-core:100263] " eregontp
2020-10-01 21:21 ` [ruby-core:100264] " jean.boussier
2020-10-01 21:37 ` [ruby-core:100265] " tenderlove
2020-10-20 19:39 ` [ruby-core:100449] [Ruby master Feature#17176] GC.auto_compact / GC.auto_compact=(flag) tenderlove
2020-10-26  6:40 ` [ruby-core:100557] " matz
