ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
@ 2014-01-18  4:43 ` normalperson
  2014-01-20  3:55   ` [ruby-core:59886] " Eric Wong
  2014-02-01  7:51   ` [ruby-core:60394] " Eric Wong
  2014-01-20  4:02 ` [ruby-core:59888] [ruby-trunk - Feature #9425] " normalperson
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 22+ messages in thread
From: normalperson @ 2014-01-18  4:43 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been reported by Eric Wong.

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59886] Re: [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-01-18  4:43 ` [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops normalperson
@ 2014-01-20  3:55   ` Eric Wong
  2014-02-01  7:51   ` [ruby-core:60394] " Eric Wong
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-01-20  3:55 UTC (permalink / raw
  To: ruby-core

Updated patch and pull due to r44634 (hash_pos)
http://bogomips.org/ruby.git/patch?id=7c7e41496d6b28a4b

The following changes since commit 5ecbe189af77c309845d662f26c0b2797bde2915:

  socket/option.c: helper functions (2014-01-19 14:56:05 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime-r44658

for you to fetch changes up to 7c7e41496d6b28a4b1a516c0df9754c03447a95d:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-19 23:45:52 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 61 +++++--------------------------------------------------------
 1 file changed, 5 insertions(+), 56 deletions(-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59888] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
  2014-01-18  4:43 ` [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops normalperson
@ 2014-01-20  4:02 ` normalperson
  2014-01-20 17:48 ` [ruby-core:59909] " shyouhei
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-01-20  4:02 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 Updated patch and pull due to r44634 (hash_pos)
 http://bogomips.org/ruby.git/patch?id=7c7e41496d6b28a4b
 
 The following changes since commit 5ecbe189af77c309845d662f26c0b2797bde2915:
 
   socket/option.c: helper functions (2014-01-19 14:56:05 +0000)
 
 are available in the git repository at:
 
   git://80x24.org/ruby.git st-noprime-r44658
 
 for you to fetch changes up to 7c7e41496d6b28a4b1a516c0df9754c03447a95d:
 
   st: use power-of-two sizes to avoid slow modulo ops (2014-01-19 23:45:52 +0000)
 
 ----------------------------------------------------------------
 Eric Wong (1):
       st: use power-of-two sizes to avoid slow modulo ops
 
  st.c | 61 +++++--------------------------------------------------------
  1 file changed, 5 insertions(+), 56 deletions(-)

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44441

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59909] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
  2014-01-18  4:43 ` [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops normalperson
  2014-01-20  4:02 ` [ruby-core:59888] [ruby-trunk - Feature #9425] " normalperson
@ 2014-01-20 17:48 ` shyouhei
  2014-01-21  2:38   ` [ruby-core:59917] " Eric Wong
  2014-01-21  2:41 ` [ruby-core:59918] " normalperson
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: shyouhei @ 2014-01-20 17:48 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Shyouhei Urabe.


Impressive.

Plus I think you have more rooms for optimizations by sticking to
power-of-two sized bins.  When you rehash a hash, right now all
elements are cleaned up from the bin, resize it, then insert them
again one-by-one.  If number of bins are always 2^n, rehashing is
to resize a 2^n array to 2^(n+1).  This case the elements stored in
new_bins[2^n+k] can only come from new_bins[k].  This fact does not
change the algorithmic complexity, but can reduce insertions.

https://github.com/shyouhei/ruby/commit/f7ca891

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44454

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59917] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-01-20 17:48 ` [ruby-core:59909] " shyouhei
@ 2014-01-21  2:38   ` Eric Wong
  2014-01-22  3:16     ` [ruby-core:59953] " Urabe Shyouhei
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Wong @ 2014-01-21  2:38 UTC (permalink / raw
  To: Ruby developers

shyouhei@ruby-lang.org wrote:
> Plus I think you have more rooms for optimizations by sticking to
> power-of-two sized bins.  When you rehash a hash, right now all
> elements are cleaned up from the bin, resize it, then insert them
> again one-by-one.  If number of bins are always 2^n, rehashing is
> to resize a 2^n array to 2^(n+1).  This case the elements stored in
> new_bins[2^n+k] can only come from new_bins[k].  This fact does not
> change the algorithmic complexity, but can reduce insertions.
> 
> https://github.com/shyouhei/ruby/commit/f7ca891

Thanks!  However, I wasn't able to show a difference with
"make benchmark"[1].   Were you?

Perhaps rehash is not called often enough, and I think a branch
inside the loop is difficult for the CPU to optimize.  I think
the current dumb loop is very good for CPU pipelining and prefetch.


[1] I have applied my patches for improved benchmark consistency:
	https://bugs.ruby-lang.org/issues/5985#change-44442
	https://bugs.ruby-lang.org/issues/9430

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59918] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2014-01-20 17:48 ` [ruby-core:59909] " shyouhei
@ 2014-01-21  2:41 ` normalperson
  2014-01-22  3:22 ` [ruby-core:59955] " shyouhei
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-01-21  2:41 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 shyouhei@ruby-lang.org wrote:
 > Plus I think you have more rooms for optimizations by sticking to
 > power-of-two sized bins.  When you rehash a hash, right now all
 > elements are cleaned up from the bin, resize it, then insert them
 > again one-by-one.  If number of bins are always 2^n, rehashing is
 > to resize a 2^n array to 2^(n+1).  This case the elements stored in
 > new_bins[2^n+k] can only come from new_bins[k].  This fact does not
 > change the algorithmic complexity, but can reduce insertions.
 > 
 > https://github.com/shyouhei/ruby/commit/f7ca891
 
 Thanks!  However, I wasn't able to show a difference with
 "make benchmark"[1].   Were you?
 
 Perhaps rehash is not called often enough, and I think a branch
 inside the loop is difficult for the CPU to optimize.  I think
 the current dumb loop is very good for CPU pipelining and prefetch.
 
 
 [1] I have applied my patches for improved benchmark consistency:
 	https://bugs.ruby-lang.org/issues/5985#change-44442
 	https://bugs.ruby-lang.org/issues/9430

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44459

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59953] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-01-21  2:38   ` [ruby-core:59917] " Eric Wong
@ 2014-01-22  3:16     ` Urabe Shyouhei
  2014-01-22  8:10       ` [ruby-core:59967] " Eric Wong
  0 siblings, 1 reply; 22+ messages in thread
From: Urabe Shyouhei @ 2014-01-22  3:16 UTC (permalink / raw
  To: ruby-core

On 01/21/2014 11:38 AM, Eric Wong wrote:
>> https://github.com/shyouhei/ruby/commit/f7ca891
> 
> Thanks!  However, I wasn't able to show a difference with
> "make benchmark"[1].   Were you?
> 
> Perhaps rehash is not called often enough, and I think a branch
> inside the loop is difficult for the CPU to optimize.  I think
> the current dumb loop is very good for CPU pipelining and prefetch.

No, sorry I see no evident speedup.  When I wrote the patch I thought the
function was used for Hash#rehash, but it turned out Hash#rehash uses
something different (don't know why).  The optimization is valid I
believe but in fact used very rarely.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59955] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2014-01-21  2:41 ` [ruby-core:59918] " normalperson
@ 2014-01-22  3:22 ` shyouhei
  2014-01-22  8:12 ` [ruby-core:59968] " normalperson
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: shyouhei @ 2014-01-22  3:22 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Shyouhei Urabe.


 On 01/21/2014 11:38 AM, Eric Wong wrote:
 >> https://github.com/shyouhei/ruby/commit/f7ca891
 > 
 > Thanks!  However, I wasn't able to show a difference with
 > "make benchmark"[1].   Were you?
 > 
 > Perhaps rehash is not called often enough, and I think a branch
 > inside the loop is difficult for the CPU to optimize.  I think
 > the current dumb loop is very good for CPU pipelining and prefetch.
 
 No, sorry I see no evident speedup.  When I wrote the patch I thought the
 function was used for Hash#rehash, but it turned out Hash#rehash uses
 something different (don't know why).  The optimization is valid I
 believe but in fact used very rarely.

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44488

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59967] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-01-22  3:16     ` [ruby-core:59953] " Urabe Shyouhei
@ 2014-01-22  8:10       ` Eric Wong
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-01-22  8:10 UTC (permalink / raw
  To: Ruby developers

Urabe Shyouhei <shyouhei@ruby-lang.org> wrote:
> No, sorry I see no evident speedup.  When I wrote the patch I thought the
> function was used for Hash#rehash, but it turned out Hash#rehash uses
> something different (don't know why).  The optimization is valid I
> believe but in fact used very rarely.

Alright.  My understanding is branch mispredict costs are higher than
the memory stores which would be avoided.  The expensive part is loading
memory on cache miss, and that is not avoided.

We'll probably need to poke around with perf or similar tools to
analyze/confirm this.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:59968] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2014-01-22  3:22 ` [ruby-core:59955] " shyouhei
@ 2014-01-22  8:12 ` normalperson
  2014-02-01  8:02 ` [ruby-core:60395] " normalperson
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-01-22  8:12 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 Urabe Shyouhei <shyouhei@ruby-lang.org> wrote:
 > No, sorry I see no evident speedup.  When I wrote the patch I thought the
 > function was used for Hash#rehash, but it turned out Hash#rehash uses
 > something different (don't know why).  The optimization is valid I
 > believe but in fact used very rarely.
 
 Alright.  My understanding is branch mispredict costs are higher than
 the memory stores which would be avoided.  The expensive part is loading
 memory on cache miss, and that is not avoided.
 
 We'll probably need to poke around with perf or similar tools to
 analyze/confirm this.

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44499

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:60394] Re: [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-01-18  4:43 ` [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops normalperson
  2014-01-20  3:55   ` [ruby-core:59886] " Eric Wong
@ 2014-02-01  7:51   ` Eric Wong
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-02-01  7:51 UTC (permalink / raw
  To: ruby-core

Notes to self (or anybody else with free time right now):
  test and verify compare_by_identity performance

I'm comfortable that ID, string and most objects will hash well with
power-of-two; but compare_by_identity, weakmap, vm->living_threads may
hash less well without a prime modulo (or maybe they hash badly
regardless of modulo!)

I'm also interested in using a doubly-linked list (ccan/list[1])
for vm->living_threads (and possibly other places).

[1] BSDL version of the Linux kernel list.h at http://ccodearchive.net/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:60395] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2014-01-22  8:12 ` [ruby-core:59968] " normalperson
@ 2014-02-01  8:02 ` normalperson
  2014-03-03  6:29   ` [ruby-core:61242] " Eric Wong
  2014-03-03  6:33 ` [ruby-core:61243] " normalperson
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: normalperson @ 2014-02-01  8:02 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 Notes to self (or anybody else with free time right now):
   test and verify compare_by_identity performance
 
 I'm comfortable that ID, string and most objects will hash well with
 power-of-two; but compare_by_identity, weakmap, vm->living_threads may
 hash less well without a prime modulo (or maybe they hash badly
 regardless of modulo!)
 
 I'm also interested in using a doubly-linked list (ccan/list[1])
 for vm->living_threads (and possibly other places).
 
 [1] BSDL version of the Linux kernel list.h at http://ccodearchive.net/

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-44867

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61242] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-02-01  8:02 ` [ruby-core:60395] " normalperson
@ 2014-03-03  6:29   ` Eric Wong
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-03-03  6:29 UTC (permalink / raw
  To: ruby-core

normalperson@yhbt.net wrote:
>    test and verify compare_by_identity performance
>  
>  I'm comfortable that ID, string and most objects will hash well with
>  power-of-two; but compare_by_identity, weakmap, vm->living_threads may
>  hash less well without a prime modulo (or maybe they hash badly
>  regardless of modulo!)

OK, I was right about compare_by_identity being worse with power-of-two,
but I fixed it by tweaking numhash:

http://bogomips.org/ruby.git/patch?id=1579e9d0d82789

I was wrong about IDs hashing well before, they hash OK now :)

results: http://80x24.org/bmlog-20140303-034047.26775.txt
the hash parts:
	hash_aref_miss	1.048
	hash_aref_str	1.162
	hash_aref_sym	1.000
	hash_flatten	1.092
	hash_ident_num	1.007
	hash_ident_obj	1.098
	hash_ident_str	1.106
	hash_ident_sym	1.018
	hash_keys	1.000
	hash_shift	1.011
	hash_values	1.011
	vm2_bighash*	1.183

These numbers are from my weaker AMD FX-8320 which gave me worse numbers
than my Haswell machine in my original test.  I'll try to test on my
Haswell machine soon (network outage there :<).

I'd like to commit the following three patches soon:
http://bogomips.org/ruby.git/patch?id=a3fde671ffeec8 new hash benchmarks
http://bogomips.org/ruby.git/patch?id=8f155afef61342 original patch
http://bogomips.org/ruby.git/patch?id=1579e9d0d82789 numhash tweak

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61243] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2014-02-01  8:02 ` [ruby-core:60395] " normalperson
@ 2014-03-03  6:33 ` normalperson
  2014-03-18  9:02   ` [ruby-core:61574] " Eric Wong
  2014-03-22 23:41   ` [ruby-core:61638] " Eric Wong
  2014-03-18  9:12 ` [ruby-core:61575] " normalperson
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 22+ messages in thread
From: normalperson @ 2014-03-03  6:33 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 normalperson@yhbt.net wrote:
 >    test and verify compare_by_identity performance
 >  
 >  I'm comfortable that ID, string and most objects will hash well with
 >  power-of-two; but compare_by_identity, weakmap, vm->living_threads may
 >  hash less well without a prime modulo (or maybe they hash badly
 >  regardless of modulo!)
 
 OK, I was right about compare_by_identity being worse with power-of-two,
 but I fixed it by tweaking numhash:
 
 http://bogomips.org/ruby.git/patch?id=1579e9d0d82789
 
 I was wrong about IDs hashing well before, they hash OK now :)
 
 results: http://80x24.org/bmlog-20140303-034047.26775.txt
 the hash parts:
 	hash_aref_miss	1.048
 	hash_aref_str	1.162
 	hash_aref_sym	1.000
 	hash_flatten	1.092
 	hash_ident_num	1.007
 	hash_ident_obj	1.098
 	hash_ident_str	1.106
 	hash_ident_sym	1.018
 	hash_keys	1.000
 	hash_shift	1.011
 	hash_values	1.011
 	vm2_bighash*	1.183
 
 These numbers are from my weaker AMD FX-8320 which gave me worse numbers
 than my Haswell machine in my original test.  I'll try to test on my
 Haswell machine soon (network outage there :<).
 
 I'd like to commit the following three patches soon:
 http://bogomips.org/ruby.git/patch?id=a3fde671ffeec8 new hash benchmarks
 http://bogomips.org/ruby.git/patch?id=8f155afef61342 original patch
 http://bogomips.org/ruby.git/patch?id=1579e9d0d82789 numhash tweak

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45586

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61574] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-03-03  6:33 ` [ruby-core:61243] " normalperson
@ 2014-03-18  9:02   ` Eric Wong
  2014-03-18  9:43     ` [ruby-core:61576] " Юрий Соколов
  2014-03-22 23:41   ` [ruby-core:61638] " Eric Wong
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Wong @ 2014-03-18  9:02 UTC (permalink / raw
  To: ruby-core

normalperson@yhbt.net wrote:
>  	hash_aref_sym	1.000

Lack of improvement here was disappointing since symbol keys are common,
and this showed a regression on my x86 (32-bit) VMs.

I tweaked rb_any_hash to be symbol-aware:

	http://bogomips.org/ruby.git/patch?id=497ed6355

12-30% improvement on this test from trunk depending on CPU so far \o/
(Phenom II X4 got 30%, newer/faster x86-64 CPUs show less speedup).

I'm comfortable with improvements of this series on x86 VMs running on
x86-64 (and of course native x86-64).

Can anybody with real 32-bit hardware verify this series?  Not sure I
can trust VM results; my remaining x86 hardware is on its last legs
and showing occasional HW errors.

git://80x24.org/ruby.git st-noprime-v4

      st: use power-of-two sizes to avoid slow modulo ops
      add hash benchmarks
      st.c: tweak numhash to match common Ruby use cases
      hash.c: improve symbol hash distribution

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61575] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2014-03-03  6:33 ` [ruby-core:61243] " normalperson
@ 2014-03-18  9:12 ` normalperson
  2014-03-18  9:48 ` [ruby-core:61577] " funny.falcon
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-03-18  9:12 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 normalperson@yhbt.net wrote:
 >  	hash_aref_sym	1.000
 
 Lack of improvement here was disappointing since symbol keys are common,
 and this showed a regression on my x86 (32-bit) VMs.
 
 I tweaked rb_any_hash to be symbol-aware:
 
 	http://bogomips.org/ruby.git/patch?id=497ed6355
 
 12-30% improvement on this test from trunk depending on CPU so far \o/
 (Phenom II X4 got 30%, newer/faster x86-64 CPUs show less speedup).
 
 I'm comfortable with improvements of this series on x86 VMs running on
 x86-64 (and of course native x86-64).
 
 Can anybody with real 32-bit hardware verify this series?  Not sure I
 can trust VM results; my remaining x86 hardware is on its last legs
 and showing occasional HW errors.
 
 git://80x24.org/ruby.git st-noprime-v4
 
       st: use power-of-two sizes to avoid slow modulo ops
       add hash benchmarks
       st.c: tweak numhash to match common Ruby use cases
       hash.c: improve symbol hash distribution

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45859

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61576] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-03-18  9:02   ` [ruby-core:61574] " Eric Wong
@ 2014-03-18  9:43     ` Юрий Соколов
  0 siblings, 0 replies; 22+ messages in thread
From: Юрий Соколов @ 2014-03-18  9:43 UTC (permalink / raw
  To: Ruby developers

[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]

Can you test this
 + if (SYMBOL_P(a)) a = (a >> RUBY_SPECIAL_SHIFT) ^ (a >>
(RUBY_SPECIAL_SHIFT + 3)); /* 3=ID_SCOPE_SHIFT */


2014-03-18 13:02 GMT+04:00 Eric Wong <normalperson@yhbt.net>:

> normalperson@yhbt.net wrote:
> >       hash_aref_sym   1.000
>
> Lack of improvement here was disappointing since symbol keys are common,
> and this showed a regression on my x86 (32-bit) VMs.
>
> I tweaked rb_any_hash to be symbol-aware:
>
>         http://bogomips.org/ruby.git/patch?id=497ed6355
>
> 12-30% improvement on this test from trunk depending on CPU so far \o/
> (Phenom II X4 got 30%, newer/faster x86-64 CPUs show less speedup).
>
> I'm comfortable with improvements of this series on x86 VMs running on
> x86-64 (and of course native x86-64).
>
> Can anybody with real 32-bit hardware verify this series?  Not sure I
> can trust VM results; my remaining x86 hardware is on its last legs
> and showing occasional HW errors.
>
> git://80x24.org/ruby.git st-noprime-v4
>
>       st: use power-of-two sizes to avoid slow modulo ops
>       add hash benchmarks
>       st.c: tweak numhash to match common Ruby use cases
>       hash.c: improve symbol hash distribution
>

[-- Attachment #2: Type: text/html, Size: 1834 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61577] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (8 preceding siblings ...)
  2014-03-18  9:12 ` [ruby-core:61575] " normalperson
@ 2014-03-18  9:48 ` funny.falcon
  2014-03-18 19:42   ` [ruby-core:61579] " Eric Wong
  2014-03-18 19:48 ` [ruby-core:61580] " normalperson
  2014-03-22 23:48 ` [ruby-core:61639] " normalperson
  11 siblings, 1 reply; 22+ messages in thread
From: funny.falcon @ 2014-03-18  9:48 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Yura Sokolov.


 Can you test this
  + if (SYMBOL_P(a)) a = (a >> RUBY_SPECIAL_SHIFT) ^ (a >>
 (RUBY_SPECIAL_SHIFT + 3)); /* 3=ID_SCOPE_SHIFT */
 
 
 2014-03-18 13:02 GMT+04:00 Eric Wong <normalperson@yhbt.net>:
 
 > normalperson@yhbt.net wrote:
 > >       hash_aref_sym   1.000
 >
 > Lack of improvement here was disappointing since symbol keys are common,
 > and this showed a regression on my x86 (32-bit) VMs.
 >
 > I tweaked rb_any_hash to be symbol-aware:
 >
 >         http://bogomips.org/ruby.git/patch?id=497ed6355
 >
 > 12-30% improvement on this test from trunk depending on CPU so far \o/
 > (Phenom II X4 got 30%, newer/faster x86-64 CPUs show less speedup).
 >
 > I'm comfortable with improvements of this series on x86 VMs running on
 > x86-64 (and of course native x86-64).
 >
 > Can anybody with real 32-bit hardware verify this series?  Not sure I
 > can trust VM results; my remaining x86 hardware is on its last legs
 > and showing occasional HW errors.
 >
 > git://80x24.org/ruby.git st-noprime-v4
 >
 >       st: use power-of-two sizes to avoid slow modulo ops
 >       add hash benchmarks
 >       st.c: tweak numhash to match common Ruby use cases
 >       hash.c: improve symbol hash distribution
 >

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45860

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61579] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-03-18  9:48 ` [ruby-core:61577] " funny.falcon
@ 2014-03-18 19:42   ` Eric Wong
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-03-18 19:42 UTC (permalink / raw
  To: Ruby developers

funny.falcon@gmail.com wrote:
>  Can you test this
>   + if (SYMBOL_P(a)) a = (a >> RUBY_SPECIAL_SHIFT) ^ (a >>
>  (RUBY_SPECIAL_SHIFT + 3)); /* 3=ID_SCOPE_SHIFT */

Seems to hurt performance on x86 compared to my original.  I don't think
ID scope bits are useful for this case.  From what I've seen, Ruby users
tend to limit symbol names to \w when using the Hash class and keywords.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61580] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (9 preceding siblings ...)
  2014-03-18  9:48 ` [ruby-core:61577] " funny.falcon
@ 2014-03-18 19:48 ` normalperson
  2014-03-22 23:48 ` [ruby-core:61639] " normalperson
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-03-18 19:48 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 funny.falcon@gmail.com wrote:
 >  Can you test this
 >   + if (SYMBOL_P(a)) a = (a >> RUBY_SPECIAL_SHIFT) ^ (a >>
 >  (RUBY_SPECIAL_SHIFT + 3)); /* 3=ID_SCOPE_SHIFT */
 
 Seems to hurt performance on x86 compared to my original.  I don't think
 ID scope bits are useful for this case.  From what I've seen, Ruby users
 tend to limit symbol names to \w when using the Hash class and keywords.

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45862

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61638] Re: [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
  2014-03-03  6:33 ` [ruby-core:61243] " normalperson
  2014-03-18  9:02   ` [ruby-core:61574] " Eric Wong
@ 2014-03-22 23:41   ` Eric Wong
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Wong @ 2014-03-22 23:41 UTC (permalink / raw
  To: ruby-core

normalperson@yhbt.net wrote:
>  OK, I was right about compare_by_identity being worse with power-of-two,
>  but I fixed it by tweaking numhash:
>  
>  http://bogomips.org/ruby.git/patch?id=1579e9d0d82789

I tweaked that further, adding +3 instead of +1 to RUBY_SPECIAL_SHIFT
in r45384.  Also updated NEWS in case some extensions need tweaking for
performance.

Haswell Xeon E3-1230 v3 numbers:

hash_aref_miss  1.166
hash_aref_str   1.167
hash_aref_sym   1.224
hash_aref_sym_long      1.270
hash_flatten    1.656
hash_ident_num  1.142
hash_ident_obj  1.193
hash_ident_str  1.194
hash_ident_sym  1.171
hash_keys       1.002
hash_shift      1.122
hash_values     1.006
loop_whileloop2 1.001
vm2_bighash*    1.233

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [ruby-core:61639] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
       [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
                   ` (10 preceding siblings ...)
  2014-03-18 19:48 ` [ruby-core:61580] " normalperson
@ 2014-03-22 23:48 ` normalperson
  11 siblings, 0 replies; 22+ messages in thread
From: normalperson @ 2014-03-22 23:48 UTC (permalink / raw
  To: ruby-core

Issue #9425 has been updated by Eric Wong.


 normalperson@yhbt.net wrote:
 >  OK, I was right about compare_by_identity being worse with power-of-two,
 >  but I fixed it by tweaking numhash:
 >  
 >  http://bogomips.org/ruby.git/patch?id=1579e9d0d82789
 
 I tweaked that further, adding +3 instead of +1 to RUBY_SPECIAL_SHIFT
 in r45384.  Also updated NEWS in case some extensions need tweaking for
 performance.
 
 Haswell Xeon E3-1230 v3 numbers:
 
 hash_aref_miss  1.166
 hash_aref_str   1.167
 hash_aref_sym   1.224
 hash_aref_sym_long      1.270
 hash_flatten    1.656
 hash_ident_num  1.142
 hash_ident_obj  1.193
 hash_ident_str  1.194
 hash_ident_sym  1.171
 hash_keys       1.002
 hash_shift      1.122
 hash_values     1.006
 loop_whileloop2 1.001
 vm2_bighash*    1.233

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45904

* Author: Eric Wong
* Status: Closed
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-03-22 23:50 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <redmine.issue-9425.20140118044317@ruby-lang.org>
2014-01-18  4:43 ` [ruby-core:59836] [ruby-trunk - Feature #9425] [Open] [PATCH] st: use power-of-two sizes to avoid slow modulo ops normalperson
2014-01-20  3:55   ` [ruby-core:59886] " Eric Wong
2014-02-01  7:51   ` [ruby-core:60394] " Eric Wong
2014-01-20  4:02 ` [ruby-core:59888] [ruby-trunk - Feature #9425] " normalperson
2014-01-20 17:48 ` [ruby-core:59909] " shyouhei
2014-01-21  2:38   ` [ruby-core:59917] " Eric Wong
2014-01-22  3:16     ` [ruby-core:59953] " Urabe Shyouhei
2014-01-22  8:10       ` [ruby-core:59967] " Eric Wong
2014-01-21  2:41 ` [ruby-core:59918] " normalperson
2014-01-22  3:22 ` [ruby-core:59955] " shyouhei
2014-01-22  8:12 ` [ruby-core:59968] " normalperson
2014-02-01  8:02 ` [ruby-core:60395] " normalperson
2014-03-03  6:29   ` [ruby-core:61242] " Eric Wong
2014-03-03  6:33 ` [ruby-core:61243] " normalperson
2014-03-18  9:02   ` [ruby-core:61574] " Eric Wong
2014-03-18  9:43     ` [ruby-core:61576] " Юрий Соколов
2014-03-22 23:41   ` [ruby-core:61638] " Eric Wong
2014-03-18  9:12 ` [ruby-core:61575] " normalperson
2014-03-18  9:48 ` [ruby-core:61577] " funny.falcon
2014-03-18 19:42   ` [ruby-core:61579] " Eric Wong
2014-03-18 19:48 ` [ruby-core:61580] " normalperson
2014-03-22 23:48 ` [ruby-core:61639] " normalperson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).