From: ko1@atdot.net
To: ruby-core@ruby-lang.org
Subject: [ruby-core:95452] [Ruby master Bug#16121] Stop making a redundant hash copy in Hash#dup
Date: Mon, 21 Oct 2019 08:29:42 +0000 (UTC)
Message-ID: <redmine.journal-82202.20191021082941.dc26ba7d274d872d@ruby-lang.org>
In-Reply-To: redmine.issue-16121.20190823171108@ruby-lang.org
Issue #16121 has been updated by ko1 (Koichi Sasada).
Thank you, I merged it!
----------------------------------------
Bug #16121: Stop making a redundant hash copy in Hash#dup
https://bugs.ruby-lang.org/issues/16121#change-82202
* Author: dylants (Dylan Thacker-Smith)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version:
* ruby -v: ruby 2.7.0dev (2019-08-23T16:41:09Z master b38ab0a3a9) [x86_64-darwin18]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
## Problem
While profiling object allocations, I noticed that `Hash#dup` allocates two objects rather than the one I expected. Looking for alternatives to compare against, I found that `Hash[hash]` creates a copy with a single object allocation and seemed to be more than twice as fast. Reading the source code revealed the difference: `Hash#dup` creates a copy of the hash and then rehashes that copy. However, rehashing itself works by making a copy of the hash, so the first copy made before rehashing is unnecessary.
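One rough way to reproduce the allocation comparison described above is to read `GC.stat(:total_allocated_objects)` before and after each call. This is only a sketch: the `allocations` helper is a hypothetical name, not part of the patch, and the counts it prints depend on the Ruby version in use.

```ruby
# Count the objects allocated while the block runs.
# Rough measure only: assumes no other thread allocates concurrently.
def allocations
  was_disabled = GC.disable # returns true if GC was already disabled
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable unless was_disabled
end

hash = { a: 1, b: 2 }

dup_count     = allocations { hash.dup }
bracket_count = allocations { Hash[hash] }

puts "Hash#dup:   #{dup_count} object(s)"
puts "Hash[hash]: #{bracket_count} object(s)"
```

Comparing the two printed counts on a build with and without the patch should show whether the extra allocation is gone.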
## Solution
I changed the code to make the copy through rehashing alone, which improves performance while preserving the existing behaviour.
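As an illustration of what "preserving the existing behaviour" could mean in practice, the sketch below checks a few documented properties of `Hash#dup` that should hold both before and after the change. The list of checks is my own selection, not the test suite that accompanies the patch.

```ruby
# A frozen hash with a default proc, used as the source of the copy.
original = Hash.new { |_hash, key| "missing #{key}" }
original[:a] = 1
original.freeze

copy = original.dup

checks = {
  "same contents"         => copy == original,
  "distinct object"       => !copy.equal?(original),
  "default proc kept"     => copy[:nope] == "missing nope",
  "frozen state not kept" => !copy.frozen?,
}

# compare_by_identity mode should carry over to the copy as well.
identity = {}.compare_by_identity
checks["compare_by_identity kept"] = identity.dup.compare_by_identity?

checks.each { |name, ok| puts "#{ok ? 'ok  ' : 'FAIL'} #{name}" }
```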
## Benchmark
```ruby
require 'benchmark'

N = 100000

def report(x, name)
  x.report(name) do
    N.times do
      yield
    end
  end
end

hashes = {
  small_hash: { a: 1 },
  larger_hash: 20.times.map { |i| [('a'.ord + i).chr.to_sym, i] }.to_h
}

Benchmark.bmbm do |x|
  hashes.each do |name, hash|
    report(x, "#{name}.dup") do
      hash.dup
    end
  end
end
```
results on master
```
                      user     system      total        real
small_hash.dup    0.401350   0.001638   0.402988 (  0.404608)
larger_hash.dup   7.218548   0.433616   7.652164 (  7.695990)
```
results with the attached patch
```
                      user     system      total        real
small_hash.dup    0.336733   0.002425   0.339158 (  0.341760)
larger_hash.dup   6.617343   0.398407   7.015750 (  7.070282)
```
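To put the two result tables side by side, the relative reduction in wall-clock ("real") time can be computed directly from the figures reported above. The snippet is only arithmetic on those numbers; the `reduction_percent` helper and the `timings` hash are names introduced here for illustration.

```ruby
# "real" times in seconds, copied from the two result tables: [master, patched].
timings = {
  "small_hash.dup"  => [0.404608, 0.341760],
  "larger_hash.dup" => [7.695990, 7.070282],
}

# Percentage by which the patched time is lower than the master time.
def reduction_percent(before, after)
  (before - after) / before * 100.0
end

reductions = timings.transform_values do |(before, after)|
  reduction_percent(before, after).round(1)
end

reductions.each do |name, percent|
  puts "#{name}: #{percent}% less real time with the patch"
end
```

On these figures the reduction works out to roughly 15.5% for the small hash and 8.1% for the larger one.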
---Files--------------------------------
0001-Remove-redundant-Check_Type-after-to_hash.diff.txt (624 Bytes)
0002-Fix-freeing-and-clearing-destination-hash-in-Hash.diff.txt (1.57 KB)
0003-Remove-dead-code-paths-in-rb_hash_initialize_copy.diff.txt (1.12 KB)
0004-Stop-making-a-redundant-hash-copy-in-Hash-dup.diff.txt (1.35 KB)
--
https://bugs.ruby-lang.org/