From: shevegen@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:94497] [Ruby master Feature#16119] Optimize Array#flatten and flatten! for already flattened arrays
Date: Fri, 23 Aug 2019 06:31:00 +0000 (UTC) [thread overview]
Message-ID: <redmine.journal-80932.20190823063100.e25a3322878b8667@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-16119.20190822235825@ruby-lang.org
Issue #16119 has been updated by shevegen (Robert A. Heiler).
I use .flatten and .flatten! quite a lot in my ruby code, often setting the
initial input from ARGV but also from other sources, such as used from other
ruby classes, w hen I need to have an array - often something like:
def foobar(input)
new_value = [input].flatten.compact
Or something like that. If this patch indeed improves this then that would
be quite nice to have.
----------------------------------------
Feature #16119: Optimize Array#flatten and flatten! for already flattened arrays
https://bugs.ruby-lang.org/issues/16119#change-80932
* Author: dylants (Dylan Thacker-Smith)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
## Problem
When doing an object profile from stackprof, I noticed object allocations being made from `Array#flatten!` which was unlike other in-place methods like `Array#uniq!` and `Array#compact!`. In this case, I wanted to optimize for the array already being flattened and found that `ary.flatten! if ary.any?(Array)` was significantly faster. The code confirmed my suspicion that `ary.flatten!` was almost equivalent to `ary.replace(ary.flatten)` in implementation. The object allocations I noticed were from a temporary result array, a stack array to handle nesting and a hash table to prevent infinite recursion, all of which can be avoided in the simple case of an already flattened array.
## Solution
Iterate over the array to find the first nested array. If no nested array is found, then `flatten!` just returns `nil` without creating additional objects and `flatten` returns a shared copy of the original array. If a nested array is found, then it creates and initializes the temporary objects to resume with the existing algorithm for flattening the array.
## Benchmark
```ruby
require 'benchmark'
N = 100000
def report(x, name)
x.report(name) do
N.times do
yield
end
end
end
arrays = {
small_flat_ary: 5.times.to_a,
large_flat_ary: 100.times.to_a,
small_pairs_ary: [[1, 2]] * 5,
large_pairs_ary: [[1, 2]] * 100,
}
Benchmark.bmbm do |x|
arrays.each do |name, ary|
report(x, "#{name}.flatten!") do
ary.flatten!
end
report(x, "#{name}.flatten") do
ary.flatten
end
end
end
```
results on the latest master (`ruby 2.7.0dev (2019-08-22T14:10:55Z master fd20b32130) [x86_64-darwin18]`)
```
user system total real
small_flat_ary.flatten! 0.082001 0.000294 0.082295 ( 0.082685)
small_flat_ary.flatten 0.078655 0.000211 0.078866 ( 0.079176)
large_flat_ary.flatten! 0.552551 0.001643 0.554194 ( 0.556166)
large_flat_ary.flatten 0.551520 0.001327 0.552847 ( 0.554091)
small_pairs_ary.flatten! 0.100073 0.000302 0.100375 ( 0.100687)
small_pairs_ary.flatten 0.109440 0.000232 0.109672 ( 0.109850)
large_pairs_ary.flatten! 1.021200 0.001650 1.022850 ( 1.024227)
large_pairs_ary.flatten 1.049046 0.002938 1.051984 ( 1.055228)
```
results with the attached patch
```
user system total real
small_flat_ary.flatten! 0.034868 0.000150 0.035018 ( 0.035180)
small_flat_ary.flatten 0.040504 0.000148 0.040652 ( 0.040914)
large_flat_ary.flatten! 0.458273 0.000786 0.459059 ( 0.460005)
large_flat_ary.flatten 0.453846 0.000833 0.454679 ( 0.455324)
small_pairs_ary.flatten! 0.055211 0.000205 0.055416 ( 0.055673)
small_pairs_ary.flatten 0.060157 0.000226 0.060383 ( 0.060540)
large_pairs_ary.flatten! 0.901698 0.001650 0.903348 ( 0.905246)
large_pairs_ary.flatten 0.917180 0.001370 0.918550 ( 0.920008)
```
---Files--------------------------------
0001-Optimize-Array-flatten-and-flatten-for-already-fl.diff.txt (2.79 KB)
--
https://bugs.ruby-lang.org/
next prev parent reply other threads:[~2019-08-23 6:31 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <redmine.issue-16119.20190822235825@ruby-lang.org>
2019-08-22 23:58 ` [ruby-core:94487] [Ruby master Feature#16119] Optimize Array#flatten and flatten! for already flattened arrays dylan.smith
2019-08-23 2:02 ` [ruby-core:94492] " duerst
2019-08-23 6:31 ` shevegen [this message]
2019-08-23 7:41 ` [ruby-core:94498] " hanmac
2019-08-23 14:22 ` [ruby-core:94501] " Austin Ziegler
2019-08-23 15:32 ` [ruby-core:94504] " dylan.smith
2019-08-23 15:38 ` [ruby-core:94506] " dylan.smith
2019-09-19 8:31 ` [ruby-core:94984] " mame
2019-09-20 0:19 ` [ruby-core:94994] " nobu
2019-09-22 18:29 ` [ruby-core:95037] " lourens
2019-09-27 3:42 ` [ruby-core:95124] " dylan.smith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.ruby-lang.org/en/community/mailing-lists/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=redmine.journal-80932.20190823063100.e25a3322878b8667@ruby-lang.org \
--to=ruby-core@ruby-lang.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).