From: fatkodima123@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:99690] [Ruby master Bug#17030] Enumerable#grep{_v} should be optimized for Regexp
Date: Tue, 25 Aug 2020 22:09:46 +0000 (UTC) [thread overview]
Message-ID: <redmine.journal-87181.20200825220945.182@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-17030.20200713202650.182@ruby-lang.org
Issue #17030 has been updated by fatkodima (Dima Fatko).
I have implemented a simple PoC - https://github.com/ruby/ruby/pull/3455.
I got the following results.
## Enumerable#grep
```ruby
ARR = %w[foobar foobaz bazquux hello world just making this array longer]
REGEXP = /o/
FAST_REGEXP = /o/f
Benchmark.ips do |x|
x.report("select.match?") { ARR.select { |e| e.match?(REGEXP) } }
x.report("grep") { ARR.grep(REGEXP) }
x.report("fast_grep") { ARR.grep(FAST_REGEXP) }
x.compare!
end
puts "********* MEMORY *********\n"
Benchmark.memory do |x|
x.report("select.match?") { ARR.select { |e| e.match?(REGEXP) } }
x.report("grep") { ARR.grep(REGEXP) }
x.report("fast_grep") { ARR.grep(FAST_REGEXP) }
x.compare!
end
```
```
Warming up --------------------------------------
select.match? 57.956k i/100ms
grep 22.715k i/100ms
fast_grep 59.434k i/100ms
Calculating -------------------------------------
select.match? 580.339k (± 0.5%) i/s - 2.956M in 5.093260s
grep 225.854k (± 0.6%) i/s - 1.136M in 5.028890s
fast_grep 532.658k (± 9.0%) i/s - 2.675M in 5.067008s
Comparison:
select.match?: 580338.8 i/s
fast_grep: 532658.1 i/s - same-ish: difference falls within error
grep: 225853.7 i/s - 2.57x (± 0.00) slower
********* MEMORY *********
Calculating -------------------------------------
select.match? 120.000 memsize ( 0.000 retained)
1.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
grep 536.000 memsize ( 168.000 retained)
3.000 objects ( 1.000 retained)
0.000 strings ( 0.000 retained)
fast_grep 200.000 memsize ( 0.000 retained)
1.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
Comparison:
select.match?: 120 allocated
fast_grep: 200 allocated - 1.67x more
grep: 536 allocated - 4.47x more
```
## case-when
```ruby
REGEXP = /z/
FAST_REGEXP = /z/f
def case_when(str)
case str
when REGEXP
true
end
end
def fast_case_when(str)
case str
when FAST_REGEXP
true
end
end
STR = 'foobarbaz'
Benchmark.ips do |x|
x.report("case_when") { case_when(STR) }
x.report("fast_case_when") { fast_case_when(STR) }
x.compare!
end
puts "********* MEMORY *********\n"
Benchmark.memory do |x|
x.report("case_when") { case_when(STR) }
x.report("fast_case_when") { fast_case_when(STR) }
x.compare!
end
```
```
Warming up --------------------------------------
case_when 95.463k i/100ms
fast_case_when 456.981k i/100ms
Calculating -------------------------------------
case_when 964.438k (± 0.8%) i/s - 4.869M in 5.048469s
fast_case_when 4.571M (± 0.6%) i/s - 23.306M in 5.098414s
Comparison:
fast_case_when: 4571379.8 i/s
case_when: 964438.3 i/s - 4.74x (± 0.00) slower
********* MEMORY *********
Calculating -------------------------------------
case_when 168.000 memsize ( 0.000 retained)
1.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
fast_case_when 0.000 memsize ( 0.000 retained)
0.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
Comparison:
fast_case_when: 0 allocated
case_when: 168 allocated - Infx more
```
## Enumerable#any?
```ruby
REGEXP = /longer/
FAST_REGEXP = /longer/f
ARR = %w[foobar foobaz bazquux hello world just making this array longer]
Benchmark.ips do |x|
x.report("any?") { ARR.any?(REGEXP) }
x.report("fast_any?") { ARR.any?(FAST_REGEXP) }
x.compare!
end
puts "********* MEMORY *********\n"
Benchmark.memory do |x|
x.report("any?") { ARR.any?(REGEXP) }
x.report("fast_any?") { ARR.any?(FAST_REGEXP) }
x.compare!
end
```
```
Warming up --------------------------------------
any? 25.840k i/100ms
fast_any? 95.381k i/100ms
Calculating -------------------------------------
any? 261.095k (± 1.0%) i/s - 1.318M in 5.047859s
fast_any? 893.676k (±13.2%) i/s - 4.388M in 5.070820s
Comparison:
fast_any?: 893675.9 i/s
any?: 261095.0 i/s - 3.42x (± 0.00) slower
********* MEMORY *********
Calculating -------------------------------------
any? 168.000 memsize ( 168.000 retained)
1.000 objects ( 1.000 retained)
0.000 strings ( 0.000 retained)
fast_any? 0.000 memsize ( 0.000 retained)
0.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
Comparison:
fast_any?: 0 allocated
any?: 168 allocated - Infx more
```
If that seems OK, I will update and finish my PR with tests/docs/etc.
----------------------------------------
Bug #17030: Enumerable#grep{_v} should be optimized for Regexp
https://bugs.ruby-lang.org/issues/17030#change-87181
* Author: marcandre (Marc-Andre Lafortune)
* Status: Open
* Priority: Normal
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
Currently:
```ruby
array.select { |e| e.match?(REGEXP) }
# about 3x faster and 6x more memory efficient than
array.grep(REGEXP)
```
This is because `grep` calls `Regexp#===` which creates useless `MatchData`
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
next prev parent reply other threads:[~2020-08-25 22:10 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-13 20:26 [ruby-core:99156] [Ruby master Bug#17030] Enumerable#grep{_v} should be optimized for Regexp marcandre-ruby-core
2020-07-13 20:27 ` [ruby-core:99157] " marcandre-ruby-core
2020-07-14 1:24 ` [ruby-core:99161] " shyouhei
2020-07-14 3:37 ` [ruby-core:99163] " marcandre-ruby-core
2020-07-14 10:25 ` [ruby-core:99164] " nobu
2020-07-14 13:28 ` [ruby-core:99165] " eregontp
2020-07-14 13:46 ` [ruby-core:99166] " marcandre-ruby-core
2020-07-21 11:27 ` [ruby-core:99248] " scivola20
2020-08-25 17:23 ` [ruby-core:99687] " fatkodima123
2020-08-25 22:09 ` fatkodima123 [this message]
2020-08-25 23:37 ` [ruby-core:99692] " sawadatsuyoshi
2020-08-26 8:31 ` [ruby-core:99702] " fatkodima123
2020-08-26 15:01 ` [ruby-core:99705] " marcandre-ruby-core
2020-08-26 17:01 ` [ruby-core:99706] " fatkodima123
2020-08-26 17:30 ` [ruby-core:99708] " marcandre-ruby-core
2020-08-26 19:17 ` [ruby-core:99713] " daniel
2020-08-26 19:31 ` [ruby-core:99714] " marcandre-ruby-core
2020-08-26 20:43 ` [ruby-core:99716] " daniel
2020-08-27 2:10 ` [ruby-core:99722] " marcandre-ruby-core
2020-08-27 8:03 ` [ruby-core:99729] " fatkodima123
2020-08-27 16:58 ` [ruby-core:99737] " daniel
2020-08-27 17:16 ` [ruby-core:99738] " merch-redmine
2020-08-28 15:01 ` [ruby-core:99751] " eregontp
2020-08-28 15:30 ` [ruby-core:99752] " eregontp
2020-09-25 6:53 ` [ruby-core:100127] " matz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.ruby-lang.org/en/community/mailing-lists/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=redmine.journal-87181.20200825220945.182@ruby-lang.org \
--to=ruby-core@ruby-lang.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).