ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:90097] [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction
       [not found] <redmine.issue-15349.20181127221045@ruby-lang.org>
@ 2018-11-27 22:10 ` tenderlove
  2018-11-27 23:25   ` [ruby-core:90100] " Eric Wong
  2018-11-29  1:56 ` [ruby-core:90147] " tenderlove
  2018-11-30 23:03 ` [ruby-core:90197] [Ruby trunk Feature#15349][Closed] " tenderlove
  2 siblings, 1 reply; 4+ messages in thread
From: tenderlove @ 2018-11-27 22:10 UTC (permalink / raw)
  To: ruby-core

Issue #15349 has been reported by tenderlovemaking (Aaron Patterson).

----------------------------------------
Feature #15349: Use a shared array for the `duparray` instruction
https://bugs.ruby-lang.org/issues/15349

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
In this example code:

~~~ ruby
def foo
  [1, 2, 3, 4]
end
~~~

The array literal uses a duparray instruction. Before this patch,
rb_ary_resurrect would malloc and memcpy a new array buffer. This
patch changes rb_ary_resurrect to use ary_make_partial so that the
new array object shares the underlying buffer with the array stored in
the instruction sequences.
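
A minimal sketch for checking this locally, assuming CRuby (`RubyVM::InstructionSequence` is CRuby-specific, and the exact disassembly text varies between versions). It only confirms that the literal compiles to `duparray` and that each call still behaves like an independent copy; whether the buffer is shared is an internal detail the script cannot see from here. The variable names are illustrative and not part of the patch.

```ruby
# Sketch only: inspect the bytecode for an array literal and verify that
# results of `foo` behave as independent copies (copy-on-write semantics).
def foo
  [1, 2, 3, 4]
end

# Compile the same literal and look for the `duparray` instruction.
disasm = RubyVM::InstructionSequence.compile("[1, 2, 3, 4]").disasm
uses_duparray = disasm.include?("duparray")

a = foo
b = foo
a << 5 # mutating one result must not affect `b` or later calls to `foo`

distinct = !a.equal?(b)
literal_unchanged = (b == [1, 2, 3, 4]) && (foo == [1, 2, 3, 4])

puts disasm
puts "duparray: #{uses_duparray}, distinct: #{distinct}, " \
     "literal unchanged: #{literal_unchanged}"
```

Sharing the buffer is only safe because a write to a shared array first detaches it from the shared buffer, which is what the `a << 5` line exercises.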

Before this patch, the new array object is not shared:

~~~
$ ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7fa2718372d0\", \"type\":\"ARRAY\", \"class\":\"0x7fa26f8b0010\", \"length\":4, \"memsize\":72, \"flags\":{\"wb_protected\":true}}\n"
~~~

After this patch:

~~~
$ ./ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7f9a76883638\", \"type\":\"ARRAY\", \"class\":\"0x7f9a758af900\", \"length\":4, \"shared\":true, \"references\":[\"0x7f9a768837c8\"], \"memsize\":40, \"flags\":{\"wb_protected\":true}}\n"
~~~
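
The dump is JSON, so the interesting fields can be pulled out directly instead of being read from the escaped string. This is a rough sketch, assuming the `objspace` and `json` standard libraries; the `shared` and `embedded` keys are only present when they apply, and the values depend on the Ruby version and build, so no particular output is claimed here.

```ruby
require "objspace"
require "json"

# Hypothetical helper: returns a fresh array from a literal, as in the report.
def literal
  [1, 2, 3, 4]
end

# ObjectSpace.dump returns a JSON string when no output target is given.
info = JSON.parse(ObjectSpace.dump(literal))

summary = {
  "type"     => info["type"],
  "length"   => info["length"],
  "shared"   => info.fetch("shared", false),   # key is absent when not shared
  "embedded" => info.fetch("embedded", false), # key is absent when not embedded
  "memsize"  => info["memsize"],
}

p summary
```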

I wrote a test program:

~~~ ruby
def foo
  [1, 2, 3, 4]
end

list = []
10000.times { list << foo }

GC.start

puts "ready #{$$}"
system "malloc_history #{$$} -allEvents > #{RUBY_VERSION}-#{ARGV[0]}.log" # macOS-specific
~~~

This test program uses a macOS-specific tool, `malloc_history`, to list every malloc / free call (valgrind should be able to do the same; I just don't know the command). I compared trunk, trunk + this patch, and Ruby 2.5.3. All tests were run with RubyGems disabled.
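
For platforms without `malloc_history`, a rough, portable approximation is to sum what Ruby itself reports for each array. This is a sketch under assumptions, not the measurement used for the graphs: `ObjectSpace.memsize_of` reflects the interpreter's own accounting rather than actual malloc / free events, and the numbers differ between Ruby versions.

```ruby
require "objspace"

def foo
  [1, 2, 3, 4]
end

list = []
10_000.times { list << foo }

GC.start

# Sum the per-object sizes reported by the interpreter (an approximation,
# not a trace of malloc calls).
total = list.sum { |ary| ObjectSpace.memsize_of(ary) }

puts "arrays: #{list.size}, reported bytes: #{total}, " \
     "average per array: #{total / list.size}"
```

Running the same script on two builds and comparing the reported totals gives a coarse view of the per-array cost, without the per-call detail of the malloc trace.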

Here is a graph of the results:

![smallish array results](https://user-images.githubusercontent.com/3124/49109227-09700d80-f23f-11e8-9597-d87f15b69dbf.png)

The X axis is the sample number, and the Y axis is the total number of bytes the process is using at that sample. Each sample is a call to malloc, so the closer the line stays to the origin (0, 0), the better: it means fewer malloc calls and less live memory.

After this patch there are fewer calls to malloc than in either trunk or 2.5 for this test program.

If I change the test program to use a much larger array literal:

~~~ ruby
def foo
  [
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
  ]
end

list = []
10000.times { list << foo }

GC.start

puts "ready #{$$}"
system "malloc_history #{$$} -allEvents > #{RUBY_VERSION}-#{ARGV[0]}.log"
~~~

The difference in memory usage becomes more pronounced:

![large array alloc](https://user-images.githubusercontent.com/3124/49114573-62df3900-f24d-11e8-99c4-957daadd7628.png)

I noticed that strings already use this technique:

~~~
$ ruby -v --disable-gems -robjspace -e'p ObjectSpace.dump("abcdefghijklmnopqrstuvwxyz")'
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
"{\"address\":\"0x00007fad18051528\", \"type\":\"STRING\", \"class\":\"0x00007fad180d3c08\", \"shared\":true, \"encoding\":\"UTF-8\", \"references\":[\"0x00007fad18051640\"], \"memsize\":40, \"flags\":{\"wb_protected\":true}}\n"
~~~

I was going to apply this patch, but it modifies `rb_ary_resurrect` and I wanted to double check.   `rb_str_resurrect` will create shared strings, so it seems OK for `rb_ary_resurrect` to create shared arrays, IMO.

Does anyone have opinions on this?  If not, I'll just apply the patch.

(Also I noticed trunk seems to allocate a lot more memory at boot)

Thanks!

---Files--------------------------------
0001-Use-a-shared-array-for-the-duparray-instruction.patch (1.56 KB)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:90100] Re: [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction
  2018-11-27 22:10 ` [ruby-core:90097] [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction tenderlove
@ 2018-11-27 23:25   ` Eric Wong
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2018-11-27 23:25 UTC (permalink / raw)
  To: ruby-core

tenderlove@ruby-lang.org wrote:
> I was going to apply this patch, but it modifies `rb_ary_resurrect` and I wanted to double check.   `rb_str_resurrect` will create shared strings, so it seems OK for `rb_ary_resurrect` to create shared arrays, IMO.

It's probably fine; I would not expect us to have bugs in our
shared array handling code at this point.

> Does anyone have opinions on this?  If not, I'll just apply the patch.
> 
> (Also I noticed trunk seems to allocate a lot more memory at boot)

Looks like transient heap (only checked miniruby, so no gems loaded).
Disabling transient heap (below) seems to help, but real-world results
would likely be different:

diff --git a/include/ruby/ruby.h b/include/ruby/ruby.h
index 58e28d2c9f..52a263ffe2 100644
--- a/include/ruby/ruby.h
+++ b/include/ruby/ruby.h
@@ -1014,7 +1014,7 @@ struct RString {
      ((ptrvar) = RSTRING(str)->as.heap.ptr, (lenvar) = RSTRING(str)->as.heap.len))
 
 #ifndef USE_TRANSIENT_HEAP
-#define USE_TRANSIENT_HEAP 1
+#define USE_TRANSIENT_HEAP 0
 #endif
 
 enum ruby_rarray_flags {


* [ruby-core:90147] [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction
       [not found] <redmine.issue-15349.20181127221045@ruby-lang.org>
  2018-11-27 22:10 ` [ruby-core:90097] [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction tenderlove
@ 2018-11-29  1:56 ` tenderlove
  2018-11-30 23:03 ` [ruby-core:90197] [Ruby trunk Feature#15349][Closed] " tenderlove
  2 siblings, 0 replies; 4+ messages in thread
From: tenderlove @ 2018-11-29  1:56 UTC (permalink / raw)
  To: ruby-core

Issue #15349 has been updated by tenderlovemaking (Aaron Patterson).


normalperson (Eric Wong) wrote:
> tenderlove@ruby-lang.org wrote:
>  > I was going to apply this patch, but it modifies `rb_ary_resurrect` and I wanted to double check.   `rb_str_resurrect` will create shared strings, so it seems OK for `rb_ary_resurrect` to create shared arrays, IMO.
>  
>  It's probably fine; I would not expect us to have bugs in our
>  shared array handling code at this point.

Ok, I'll apply the patch.

>  > Does anyone have opinions on this?  If not, I'll just apply the patch.
>  > 
>  > (Also I noticed trunk seems to allocate a lot more memory at boot)
>  
>  Looks like transient heap (only checked miniruby, so no gems loaded).
>  Disabling transient heap (below) seems to help, but real-world results
>  would likely be different:

I tried disabling the transient heap (theap) and re-running the test. Disabling it appears to bring memory usage back to Ruby 2.5 levels:

![remove theap](https://user-images.githubusercontent.com/3124/49194017-49fe8280-f336-11e8-8131-dd97ba4922bf.png)

I tried a similar test, this time booting a Rails application. It looks like theap just adds a constant overhead to the application:

![theap, no theap, Ruby 2.5 on Rails](https://user-images.githubusercontent.com/3124/49194094-9944b300-f336-11e8-8b54-6bfec63b78db.png)

Disabling theap makes memory usage on 2.6 *lower* than on 2.5, whereas enabling theap keeps it consistently above 2.5 levels.

I'll open another ticket for this issue.

----------------------------------------
Feature #15349: Use a shared array for the `duparray` instruction
https://bugs.ruby-lang.org/issues/15349#change-75263

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

-- 
https://bugs.ruby-lang.org/


* [ruby-core:90197] [Ruby trunk Feature#15349][Closed] Use a shared array for the `duparray` instruction
       [not found] <redmine.issue-15349.20181127221045@ruby-lang.org>
  2018-11-27 22:10 ` [ruby-core:90097] [Ruby trunk Feature#15349] Use a shared array for the `duparray` instruction tenderlove
  2018-11-29  1:56 ` [ruby-core:90147] " tenderlove
@ 2018-11-30 23:03 ` tenderlove
  2 siblings, 0 replies; 4+ messages in thread
From: tenderlove @ 2018-11-30 23:03 UTC (permalink / raw)
  To: ruby-core

Issue #15349 has been updated by tenderlovemaking (Aaron Patterson).

Status changed from Open to Closed

I committed this in r66095; I'm not sure why the commit didn't close this issue.

----------------------------------------
Feature #15349: Use a shared array for the `duparray` instruction
https://bugs.ruby-lang.org/issues/15349#change-75319

* Author: tenderlovemaking (Aaron Patterson)
* Status: Closed
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

-- 
https://bugs.ruby-lang.org/
