ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings
@ 2013-10-08  9:54 normalperson (Eric Wong)
  2013-10-08 21:17 ` [ruby-core:57742] [ruby-trunk - Feature #8998] " normalperson (Eric Wong)
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: normalperson (Eric Wong) @ 2013-10-08  9:54 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been reported by normalperson (Eric Wong).

----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998

Author: normalperson (Eric Wong)
Status: Open
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:57742] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
@ 2013-10-08 21:17 ` normalperson (Eric Wong)
  2013-10-08 21:29 ` [ruby-core:57743] " normalperson (Eric Wong)
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: normalperson (Eric Wong) @ 2013-10-08 21:17 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by normalperson (Eric Wong).

File hash_aset_fstring.diff added

Proposed patch to partially implement this, but I get segfaults (backtrace/dump coming) with "make check".
There probably needs to be some RGenGC-related calls/fixes for this.

Note: this patch does not avoid the short-lived, unfrozen string literal.
I don't currently know the parser/compiler code well enough to make
that optimization.

----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-42345

Author: normalperson (Eric Wong)
Status: Open
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:57743] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
  2013-10-08 21:17 ` [ruby-core:57742] [ruby-trunk - Feature #8998] " normalperson (Eric Wong)
@ 2013-10-08 21:29 ` normalperson (Eric Wong)
  2013-10-09  2:15   ` [ruby-core:57756] " Eric Wong
  2013-11-27  4:00 ` [ruby-core:58624] " tmm1 (Aman Gupta)
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: normalperson (Eric Wong) @ 2013-10-08 21:29 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by normalperson (Eric Wong).

File hash_aset_fstring_check_fail.txt added
File hash_aset_fstring_check_bt.txt added

attaching output of "make check" and gdb backtrace
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-42346

Author: normalperson (Eric Wong)
Status: Open
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:57756] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08 21:29 ` [ruby-core:57743] " normalperson (Eric Wong)
@ 2013-10-09  2:15   ` Eric Wong
  2013-10-10  2:48     ` [ruby-core:57801] " Eric Wong
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wong @ 2013-10-09  2:15 UTC (permalink / raw
  To: ruby-core

I think my failed patch exposes a bug with lazy sweep + rb_fstring.
Lazy sweep GC means the element remains in the frozen_string hash,

	fstr1 = rb_fstring(str)
	fstr1 goes out of scope
	GC mark runs ...
	fstr1 is eligible for lazy sweep
	fstr2 = rb_fstring(str)
	fstr2 is identical to fstr1
	fstr1 is swept
	fstr2 use attempted -> crash

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:57801] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-09  2:15   ` [ruby-core:57756] " Eric Wong
@ 2013-10-10  2:48     ` Eric Wong
  2013-10-23 17:15       ` [ruby-core:58000] " Eric Wong
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Wong @ 2013-10-10  2:48 UTC (permalink / raw
  To: ruby-core

The issue I had with my original patch is fixed by nobu with r43210
Thanks!

So my proposed patch should be safe to apply, but it's only a partial
implementation of the idea for this feature.

I don't think I'll have time+energy to implement this for the
parser/compiler.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58000] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-10  2:48     ` [ruby-core:57801] " Eric Wong
@ 2013-10-23 17:15       ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2013-10-23 17:15 UTC (permalink / raw
  To: ruby-core

Eric Wong <normalperson@yhbt.net> wrote:
> So my proposed patch should be safe to apply, but it's only a partial
> implementation of the idea for this feature.

While that's true, I haven't been able to measure performance
improvements from this (no real apps which are very hash-dependent).

Can anybody else?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58624] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
  2013-10-08 21:17 ` [ruby-core:57742] [ruby-trunk - Feature #8998] " normalperson (Eric Wong)
  2013-10-08 21:29 ` [ruby-core:57743] " normalperson (Eric Wong)
@ 2013-11-27  4:00 ` tmm1 (Aman Gupta)
  2013-11-27  4:01 ` [ruby-core:58625] " tmm1 (Aman Gupta)
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-27  4:00 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by tmm1 (Aman Gupta).


I didn't realize MRI already froze string keys in hashes.

Your patch reduces long-lived strings in our rails app by ~11%:

  $ ruby -rconfig/environment -e' GC.start; p ObjectSpace.count_objects[:T_STRING] '
  173956
  
  $ ruby -rconfig/environment -e' GC.start; p ObjectSpace.count_objects[:T_STRING] '
  154750

I am inclined to merge it.
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43193

Author: normalperson (Eric Wong)
Status: Open
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58625] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (2 preceding siblings ...)
  2013-11-27  4:00 ` [ruby-core:58624] " tmm1 (Aman Gupta)
@ 2013-11-27  4:01 ` tmm1 (Aman Gupta)
  2013-11-27  7:15   ` [ruby-core:58631] " Eric Wong
  2013-11-27  8:12 ` [ruby-core:58632] " tmm1 (Aman Gupta)
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-27  4:01 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by tmm1 (Aman Gupta).


>+	if (!OBJ_FROZEN(str))
>+	    *key = rb_fstring(str);

Do you know why the OBJ_FROZEN check is required here?

I tried to investigate and it appears this function is sometimes invoked with symbols.
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43194

Author: normalperson (Eric Wong)
Status: Open
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58631] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-11-27  4:01 ` [ruby-core:58625] " tmm1 (Aman Gupta)
@ 2013-11-27  7:15   ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2013-11-27  7:15 UTC (permalink / raw
  To: Ruby developers


"tmm1 (Aman Gupta)" <ruby@tmm1.net> wrote:
> Issue #8998 has been updated by tmm1 (Aman Gupta).

Cool, I didn't check object lifetimes.
Any measurable impact on speed?

> >+	if (!OBJ_FROZEN(str))
> >+	    *key = rb_fstring(str);
> 
> Do you know why the OBJ_FROZEN check is required here?

Some apps use frozen constant strings to speed up hash assignment for
common string keys.  No point in having the extra hash lookup if it was
already a pre-frozen string (and it could break existing behavior).

Ideally it would be nice to transparently freeze strings used for hash
assignment in the parser so apps wouldn't have to add the ugly frozen
constants.  Mongrel, and most probably every server descended from
Mongrel uses pre-frozen constants to speed up hash assignment.

> I tried to investigate and it appears this function is sometimes
> invoked with symbols.

Really?  Yikes!  That's strange.  I didn't expect that, and I'm
not sure how it could be from reading rb_hash_aset()...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58632] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (3 preceding siblings ...)
  2013-11-27  4:01 ` [ruby-core:58625] " tmm1 (Aman Gupta)
@ 2013-11-27  8:12 ` tmm1 (Aman Gupta)
  2013-11-27 10:29   ` [ruby-core:58638] " Eric Wong
  2013-11-28  7:42 ` [ruby-core:58655] " tmm1 (Aman Gupta)
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-27  8:12 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by tmm1 (Aman Gupta).


Seeing some occasional segfaults in CI after this change. Not quite sure what's up yet.

http://c5632.rubyci.org/~chkbuild/ruby-trunk/log/20131127T070302Z.log.html.gz

#3  0x0063fb97 in rb_bug (fmt=0x67363d "Segmentation fault") at error.c:341
#4  0x00557a16 in sigsegv (sig=11, info=0x95f9c7c, ctx=0x95f9cfc)
    at signal.c:699
#5  <signal handler called>
#6  search_nonascii (p=0x1c <Address 0x1c out of bounds>, 
    len=<value optimized out>, enc=0x95df180) at string.c:202
#7  coderange_scan (p=0x1c <Address 0x1c out of bounds>, 
    len=<value optimized out>, enc=0x95df180) at string.c:232
#8  0x0057ab6e in rb_enc_str_coderange (str1=184779220, str2=173130540)
    at string.c:376
#9  rb_str_comparable (str1=184779220, str2=173130540) at string.c:2403
#10 0x0057bb32 in rb_str_hash_cmp (a=184779220, b=173130540) at string.c:2367
#11 fstring_cmp (a=184779220, b=173130540) at string.c:161
#12 0x00561257 in find_entry (table=0x9603aa0, key=184779220, 
    value=0xbf9a818c) at st.c:387
#13 st_lookup (table=0x9603aa0, key=184779220, value=0xbf9a818c) at st.c:425
#14 0x00568062 in rb_fstring (str=184779220) at string.c:144
#15 0x004b00da in hash_aset_str (key=0xbf9a8234, val=0xbf9a8218, 
    arg=3214574164, existing=0) at hash.c:1267
#16 hash_aset_str_insert (key=0xbf9a8234, val=0xbf9a8218, arg=3214574164, 
    existing=0) at hash.c:1273
#17 0x00561dba in st_update (table=0xa069f50, key=184779220, 
    func=0x4b0040 <hash_aset_str_insert>, arg=3214574164) at st.c:908
#18 0x004b4c0f in tbl_update (hash=184779600, key=184779220, val=171144140)
    at hash.c:380
#19 rb_hash_aset (hash=184779600, key=184779220, val=171144140) at hash.c:1316
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43201

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58638] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-11-27  8:12 ` [ruby-core:58632] " tmm1 (Aman Gupta)
@ 2013-11-27 10:29   ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2013-11-27 10:29 UTC (permalink / raw
  To: Ruby developers

"tmm1 (Aman Gupta)" <ruby@tmm1.net> wrote:
> Seeing some occasional segfaults in CI after this change. Not quite
> sure what's up yet.
>
> http://c5632.rubyci.org/~chkbuild/ruby-trunk/log/20131127T070302Z.log.html.gz
> 
> #3  0x0063fb97 in rb_bug (fmt=0x67363d "Segmentation fault") at error.c:341
> #4  0x00557a16 in sigsegv (sig=11, info=0x95f9c7c, ctx=0x95f9cfc)
>     at signal.c:699
> #5  <signal handler called>
> #6  search_nonascii (p=0x1c <Address 0x1c out of bounds>, 
>     len=<value optimized out>, enc=0x95df180) at string.c:202
> #7  coderange_scan (p=0x1c <Address 0x1c out of bounds>, 
>     len=<value optimized out>, enc=0x95df180) at string.c:232

So it seems GC is not preserving the hash used by rb_fstring, still.  I
wonder if something else changed since r43210 which caused this, perhaps
r43718?  I've never been a GC expert, especially not with the current GC.

> #8  0x0057ab6e in rb_enc_str_coderange (str1=184779220, str2=173130540)
>     at string.c:376

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58655] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (4 preceding siblings ...)
  2013-11-27  8:12 ` [ruby-core:58632] " tmm1 (Aman Gupta)
@ 2013-11-28  7:42 ` tmm1 (Aman Gupta)
  2013-11-28 10:44 ` [ruby-core:58657] " ko1 (Koichi Sasada)
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-28  7:42 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by tmm1 (Aman Gupta).


The regression was indeed caused by r43718, and has been resolved with r43887.
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43224

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58657] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (5 preceding siblings ...)
  2013-11-28  7:42 ` [ruby-core:58655] " tmm1 (Aman Gupta)
@ 2013-11-28 10:44 ` ko1 (Koichi Sasada)
  2013-11-28 22:05 ` [ruby-core:58669] " tmm1 (Aman Gupta)
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: ko1 (Koichi Sasada) @ 2013-11-28 10:44 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by ko1 (Koichi Sasada).


> The regression was indeed caused by r43718, and has been resolved with r43887.

No. Before r43718, it works accidentally.

----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43225

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:58669] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (6 preceding siblings ...)
  2013-11-28 10:44 ` [ruby-core:58657] " ko1 (Koichi Sasada)
@ 2013-11-28 22:05 ` tmm1 (Aman Gupta)
  2014-01-02 23:44 ` [ruby-core:59486] " headius (Charles Nutter)
  2014-01-03  7:44 ` [ruby-core:59504] " headius (Charles Nutter)
  9 siblings, 0 replies; 17+ messages in thread
From: tmm1 (Aman Gupta) @ 2013-11-28 22:05 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by tmm1 (Aman Gupta).


Oops, @ko1 is right.

The issue at hand is that fstrings were allowed to be shared strings. With rb_gc_mark(), the shared string was more likely to get marked (due to the mark_stack addition). But there were still cases where the shared string had already been lazy swept by the time the fstring was marked or resurrected.

With r43718, we resolved the issue once and for all by disallowing shared fstrings altogether.
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-43236

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: current: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:59486] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (7 preceding siblings ...)
  2013-11-28 22:05 ` [ruby-core:58669] " tmm1 (Aman Gupta)
@ 2014-01-02 23:44 ` headius (Charles Nutter)
  2014-01-03  7:44 ` [ruby-core:59504] " headius (Charles Nutter)
  9 siblings, 0 replies; 17+ messages in thread
From: headius (Charles Nutter) @ 2014-01-02 23:44 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by headius (Charles Nutter).


I've implemented something like this for JRuby 9k. Both the interpreter and the compiler pre-freeze and dedup the keys, allowing them to go straight into literal hashes.

Performance is significantly improved by doing this:

Before:

$ jruby -rbenchmark -e "10.times { puts Benchmark.measure { hash = nil; 10_000_000.times { hash = { 'a' => 1, 'b' => 1, 'c' => 1, 'd' => 1, 'e' => 1 } } } }"
  5.810000   0.100000   5.910000 (  5.282000)
  5.230000   0.030000   5.260000 (  5.101000)
  4.710000   0.020000   4.730000 (  4.509000)
  4.500000   0.030000   4.530000 (  4.375000)
  4.630000   0.030000   4.660000 (  4.516000)
  4.510000   0.020000   4.530000 (  4.390000)
  4.560000   0.030000   4.590000 (  4.433000)
  4.580000   0.030000   4.610000 (  4.455000)
  4.540000   0.020000   4.560000 (  4.416000)
  4.560000   0.030000   4.590000 (  4.435000)

After:

$ jruby -rbenchmark -e "10.times { puts Benchmark.measure { hash = nil; 10_000_000.times { hash = { 'a' => 1, 'b' => 1, 'c' => 1, 'd' => 1, 'e' => 1 } } } }"
  3.270000   0.100000   3.370000 (  2.295000)
  2.140000   0.020000   2.160000 (  2.069000)
  1.870000   0.010000   1.880000 (  1.779000)
  1.930000   0.020000   1.950000 (  1.744000)
  1.790000   0.010000   1.800000 (  1.734000)
  1.810000   0.010000   1.820000 (  1.750000)
  1.790000   0.010000   1.800000 (  1.743000)
  1.820000   0.020000   1.840000 (  1.769000)
  1.820000   0.010000   1.830000 (  1.766000)
  1.770000   0.010000   1.780000 (  1.712000)

For reference, 2.1p0 on my system:

$ rvm ruby-2.1 do ruby -rbenchmark -e "10.times { puts Benchmark.measure { hash = nil; 10_000_000.times { hash = { 'a' => 1, 'b' => 1, 'c' => 1, 'd' => 1, 'e' => 1 } } } }"
  9.060000   0.040000   9.100000 (  9.098937)
  9.030000   0.030000   9.060000 (  9.064616)
  8.980000   0.020000   9.000000 (  9.007513)
  9.050000   0.030000   9.080000 (  9.078869)
  9.250000   0.040000   9.290000 (  9.292398)
  9.020000   0.030000   9.050000 (  9.053593)
  9.080000   0.030000   9.110000 (  9.113024)
  8.970000   0.020000   8.990000 (  8.986615)
  8.870000   0.010000   8.880000 (  8.884534)
  8.940000   0.010000   8.950000 (  8.954753)

When I have the commit done (needs a little cleanup...only a couple hours work at the moment) I will link here in case it is useful.
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-44033

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:59504] [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
                   ` (8 preceding siblings ...)
  2014-01-02 23:44 ` [ruby-core:59486] " headius (Charles Nutter)
@ 2014-01-03  7:44 ` headius (Charles Nutter)
  2014-01-03  8:51   ` [ruby-core:59506] " Eric Wong
  9 siblings, 1 reply; 17+ messages in thread
From: headius (Charles Nutter) @ 2014-01-03  7:44 UTC (permalink / raw
  To: ruby-core


Issue #8998 has been updated by headius (Charles Nutter).


I ran into a small snag so I think it will be useful to have my JRuby commits here.

First, a background commit... adding a weak dedup cache: https://github.com/jruby/jruby/commit/926ca89075a4a4c84592add729531189263c143f

It's not particularly complicated; a double-weak hash synchronized for concurrency purposes.

Here's the commit that adds dedup to literal hashes with literal string keys: https://github.com/jruby/jruby/commit/aedcbc7b4024cf651a1df3bb65a8490a74a66562

This commit does all of the following:

* Interpreter and compiler logic for literal hash with literal keys will only freeze-dup them once, using the dedup cache.
* Hash will use dedup cache for all incoming strings.

The first problem I ran into was that there are tests in CSV that appear to depend on already-frozen strings not being deduplicated during hash insertion. This brought about the following failures:

  1) Failure:
TestCSV::Interface#test_write_hash_with_string_keys [/Users/headius/projects/jruby/test/mri/csv/test_interface.rb:214]:
Expected "a" (oid=9726) to be the same as "a" (oid=9728).

  2) Failure:
TestCSV::Interface::DifferentOFS#test_write_hash_with_string_keys [/Users/headius/projects/jruby/test/mri/csv/test_interface.rb:214]:
Expected "a" (oid=9726) to be the same as "a" (oid=9746).

  3) Failure:
TestCSV::Row#test_to_hash [/Users/headius/projects/jruby/test/mri/csv/test_row.rb:304]:
Expected "A" (oid=9748) to be the same as "A" (oid=9750).

  4) Failure:
TestCSV::Row::DifferentOFS#test_to_hash [/Users/headius/projects/jruby/test/mri/csv/test_row.rb:304]:
Expected "A" (oid=9752) to be the same as "A" (oid=9750).

I fixed these by adding a check to only dedup incoming strings if they weren't already frozen. I'm not sure if that's the right course of action or if the tests should be changed.

Then I ran into a more unusual snag. In RubySpec, there are tests for Hash#compare_by_identity/= that fail if I dedup incoming strings, since they test that the same string content in two different objects can hold two different values in the Hash. Specifically:

1)
Hash#compare_by_identity perists over #dups FAILED
Expected {"foo"=>:glark}
 to equal {"foo"=>:bar, "foo"=>:glark}

/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:77:in `(root)'
org/jruby/RubyBasicObject.java:1573:in `instance_eval'
org/jruby/RubyEnumerable.java:1437:in `all?'
org/jruby/RubyFixnum.java:269:in `times'
org/jruby/RubyArray.java:1556:in `each'
/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:2:in `(root)'
/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:1:in `(root)'
org/jruby/RubyKernel.java:881:in `load'
org/jruby/RubyBasicObject.java:1573:in `instance_eval'
org/jruby/RubyArray.java:1556:in `each'

2)
Hash#compare_by_identity persists over #clones FAILED
Expected {"foo"=>:glark}
 to equal {"foo"=>:bar, "foo"=>:glark}

/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:84:in `(root)'
org/jruby/RubyBasicObject.java:1573:in `instance_eval'
org/jruby/RubyEnumerable.java:1437:in `all?'
org/jruby/RubyFixnum.java:269:in `times'
org/jruby/RubyArray.java:1556:in `each'
/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:2:in `(root)'
/Users/headius/projects/jruby/spec/ruby/core/hash/compare_by_identity_spec.rb:1:in `(root)'
org/jruby/RubyKernel.java:881:in `load'
org/jruby/RubyBasicObject.java:1573:in `instance_eval'
org/jruby/RubyArray.java:1556:in `each'

I fixed this issue by only deduping when compare_by_identity is false, but I'm not sure if it's correct.

So the basic question that came out of my experiment:

* Should string freeze-duping in Hash#[]= always dedup, regardless of whether the string has already been frozen or the Hash has compare_by_identity == true? Or else what combination of those conditions should lead to deduping?
----------------------------------------
Feature #8998: string keys for hash literals should use fstrings
https://bugs.ruby-lang.org/issues/8998#change-44049

Author: normalperson (Eric Wong)
Status: Closed
Priority: Low
Assignee: 
Category: core
Target version: 2.1.0


While we're introducing optimizations from frozen strings,
string keys inside hashes should be frozen at the compiler level
to prevent duplication.

	a = { "ABC" => :t }
	b = { "ABC" => :t }

	# the following ought to print true
	p(a.keys[0].object_id == b.keys[0].object_id)

This should introduce no incompatibilities and be transparent to users of
older rubies.



-- 
http://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:59506] Re: [ruby-trunk - Feature #8998] string keys for hash literals should use fstrings
  2014-01-03  7:44 ` [ruby-core:59504] " headius (Charles Nutter)
@ 2014-01-03  8:51   ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2014-01-03  8:51 UTC (permalink / raw
  To: Ruby developers

> * Should string freeze-duping in Hash#[]= always dedup, regardless of
> whether the string has already been frozen or the Hash has
> compare_by_identity == true? Or else what combination of those
> conditions should lead to deduping?

MRI doesn't replace/alter the key at all if compare_by_identity is true.
Thus keys remain mutable strings in the case of compare_by_identity.

Also, in my original proposed patch, it only deduped when a new frozen
string was required, so there's almost zero chance of incompatibility.
Pre-frozen strings are used as-is when used as keys in all cases.

Btw, in addition to the current optimization and in lieu of my original
patch, I also have a proposed patch in
https://bugs.ruby-lang.org/issues/9188 which also optimizes aref along
with aset only when literal strings are used.  This means we won't bust
the fstring cache with dynamically-generated strings, but string
literals in the source are not duplicated.

I'm not sure what the real-world performance impact is, but I suspect
Rails apps will benefit from that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-01-03  9:10 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-08  9:54 [ruby-core:57727] [ruby-trunk - Feature #8998][Open] string keys for hash literals should use fstrings normalperson (Eric Wong)
2013-10-08 21:17 ` [ruby-core:57742] [ruby-trunk - Feature #8998] " normalperson (Eric Wong)
2013-10-08 21:29 ` [ruby-core:57743] " normalperson (Eric Wong)
2013-10-09  2:15   ` [ruby-core:57756] " Eric Wong
2013-10-10  2:48     ` [ruby-core:57801] " Eric Wong
2013-10-23 17:15       ` [ruby-core:58000] " Eric Wong
2013-11-27  4:00 ` [ruby-core:58624] " tmm1 (Aman Gupta)
2013-11-27  4:01 ` [ruby-core:58625] " tmm1 (Aman Gupta)
2013-11-27  7:15   ` [ruby-core:58631] " Eric Wong
2013-11-27  8:12 ` [ruby-core:58632] " tmm1 (Aman Gupta)
2013-11-27 10:29   ` [ruby-core:58638] " Eric Wong
2013-11-28  7:42 ` [ruby-core:58655] " tmm1 (Aman Gupta)
2013-11-28 10:44 ` [ruby-core:58657] " ko1 (Koichi Sasada)
2013-11-28 22:05 ` [ruby-core:58669] " tmm1 (Aman Gupta)
2014-01-02 23:44 ` [ruby-core:59486] " headius (Charles Nutter)
2014-01-03  7:44 ` [ruby-core:59504] " headius (Charles Nutter)
2014-01-03  8:51   ` [ruby-core:59506] " Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).