ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:74903] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
@ 2016-04-12 19:03 ` asnow.dev
  2016-04-12 19:04 ` [ruby-core:74904] " asnow.dev
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: asnow.dev @ 2016-04-12 19:03 UTC
  To: ruby-core

Issue #12275 has been reported by Andrew Bolshov.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275

* Author: Andrew Bolshov
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
I think it would be useful to have a function that converts an input string as if it were written in a prime-quoted or double-quoted string literal. It's part of metaprogramming.
Example:
~~~ ruby
class String
  # Create a new string as if the receiver were written in quotes. The optional argument selects the quoting style: true - prime quote, false - double quote. The default is double quote.
  def unescape prime = true
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Existing implementation:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [ruby-core:74904] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
  2016-04-12 19:03 ` [ruby-core:74903] [Ruby trunk Feature#12275] String unescape asnow.dev
@ 2016-04-12 19:04 ` asnow.dev
  2016-04-12 19:05 ` [ruby-core:74905] " asnow.dev
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: asnow.dev @ 2016-04-12 19:04 UTC
  To: ruby-core

Issue #12275 has been updated by Andrew Bolshov.

Description updated

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-58028

* Author: Andrew Bolshov
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------

* [ruby-core:74905] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
  2016-04-12 19:03 ` [ruby-core:74903] [Ruby trunk Feature#12275] String unescape asnow.dev
  2016-04-12 19:04 ` [ruby-core:74904] " asnow.dev
@ 2016-04-12 19:05 ` asnow.dev
  2016-04-20  4:01 ` [ruby-core:75040] [Ruby trunk Feature#12275][Feedback] " shyouhei
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: asnow.dev @ 2016-04-12 19:05 UTC
  To: ruby-core

Issue #12275 has been updated by Andrew Bolshov.

Description updated

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-58029

* Author: Andrew Bolshov
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
I think it would be useful to have a function that converts an input string as if it were written in a prime-quoted or double-quoted string literal. It's part of metaprogramming.
Example:

~~~ ruby
class String
  # Create a new string as if the receiver were written in quotes. The optional argument selects the quoting style: true - prime quote, false - double quote. The default is double quote.
  def unescape(prime = false)
    eval(prime ? "'#{self}'" : "\"#{self}\"")
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Existing implementation:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape




* [ruby-core:75040] [Ruby trunk Feature#12275][Feedback] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2016-04-12 19:05 ` [ruby-core:74905] " asnow.dev
@ 2016-04-20  4:01 ` shyouhei
  2016-05-12 15:43 ` [ruby-core:75474] [Ruby trunk Feature#12275] " asnow.dev
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: shyouhei @ 2016-04-20  4:01 UTC
  To: ruby-core

Issue #12275 has been updated by Shyouhei Urabe.

Status changed from Open to Feedback

We looked at this ticket at this month's developer meeting.  I then started to think that the "escape" you refer to is not well defined.

Unescaping cannot be defined without a corresponding escaping.  Ruby already has a method called String#dump.  Is this the operation you want to invert?

~~~ruby
irb(main):001:0> puts "\u5b57".encode('CP932').dump
"\x8E\x9A"
=> nil
~~~

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-58165

* Author: Andrew Bolshov
* Status: Feedback
* Priority: Normal
* Assignee: 
----------------------------------------

* [ruby-core:75474] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2016-04-20  4:01 ` [ruby-core:75040] [Ruby trunk Feature#12275][Feedback] " shyouhei
@ 2016-05-12 15:43 ` asnow.dev
  2016-05-13  6:44 ` [ruby-core:75483] [Ruby trunk Feature#12275][Open] " shyouhei
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: asnow.dev @ 2016-05-12 15:43 UTC
  To: ruby-core

Issue #12275 has been updated by Andrew Bolshov.


I think so, yes: the inverse of String#dump. My strings are user input without quotes, but that doesn't matter much.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-58595

* Author: Andrew Bolshov
* Status: Feedback
* Priority: Normal
* Assignee: 
----------------------------------------

* [ruby-core:75483] [Ruby trunk Feature#12275][Open] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2016-05-12 15:43 ` [ruby-core:75474] [Ruby trunk Feature#12275] " asnow.dev
@ 2016-05-13  6:44 ` shyouhei
  2016-07-19  6:00 ` [ruby-core:76411] [Ruby trunk Feature#12275] " matz
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: shyouhei @ 2016-05-13  6:44 UTC
  To: ruby-core

Issue #12275 has been updated by Shyouhei Urabe.

Status changed from Feedback to Open

Thank you.  That makes sense to me because String#dump has no corresponding undump method now.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-58603

* Author: Andrew Bolshov
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------

* [ruby-core:76411] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2016-05-13  6:44 ` [ruby-core:75483] [Ruby trunk Feature#12275][Open] " shyouhei
@ 2016-07-19  6:00 ` matz
  2017-11-18 13:27 ` [ruby-core:83816] " tad.a.digger
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: matz @ 2016-07-19  6:00 UTC
  To: ruby-core

Issue #12275 has been updated by Yukihiro Matsumoto.


String#undump sounds reasonable. If someone implements it, it's OK to add.

Matz.


----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-59651

* Author: Andrew Bolshov
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------

* [ruby-core:83816] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2016-07-19  6:00 ` [ruby-core:76411] [Ruby trunk Feature#12275] " matz
@ 2017-11-18 13:27 ` tad.a.digger
  2017-11-19 10:11 ` [ruby-core:83822] " duerst
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-11-18 13:27 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).


Hi, I've been working on this feature for several months.

First of all, I began by implementing this as a gem.
https://github.com/tadd/string_undump
https://github.com/tadd/string_undump/blob/master/ext/string_undump/string_undump.c

Comments are welcome. I'll write a patch for trunk soon, as the next step.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67850

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

* [ruby-core:83822] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2017-11-18 13:27 ` [ruby-core:83816] " tad.a.digger
@ 2017-11-19 10:11 ` duerst
  2017-11-24  7:05 ` [ruby-core:83874] " tad.a.digger
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: duerst @ 2017-11-19 10:11 UTC
  To: ruby-core

Issue #12275 has been updated by duerst (Martin Dürst).


I think rather than using true/false to distinguish single and double quotes, it would be better to have a keyword parameter, such as `quotes: :single` (and quotes: :double, but that would be default).

Also, "prime quote" isn't used widely. Please check e.g. "prime quote" and "single quote" on your favorite search engine. In addition, U+2032 (′, PRIME) is a different character. (The official name of U+0027 is APOSTROPHE.)

Also, please think about encodings. Some people may want all non-ASCII characters escaped, but others may not want that at all.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67854

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

* [ruby-core:83874] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (8 preceding siblings ...)
  2017-11-19 10:11 ` [ruby-core:83822] " duerst
@ 2017-11-24  7:05 ` tad.a.digger
  2017-11-24  8:04 ` [ruby-core:83875] " duerst
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-11-24  7:05 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).


Thank you for your comments.

> I think rather than using true/false to distinguish single and double quotes, it would be better to have a keyword parameter, such as quotes: :single (and quotes: :double, but that would be default).

I think we can forget about arguments (i.e. additional quotes), because the current implementation never uses `eval()` internally.

My `String#undump` takes no arguments, like this:

~~~ruby
'"\u00FC"'.undump #=> "ü"
~~~

I'll write detailed specs when I submit a patch.  Basically, I focused on making it the inverse of `String#dump`.

> Also, please think about encodings. Some people may want all non-ASCII characters escaped, but others may not want that at all.

Unfortunately, I couldn't understand your concern.  I think we're discussing unescaping/undumping, not escaping.
Note that `String#dump` already escapes all non-ASCII characters, so I'm trying to unescape them all with `undump`.
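
As a small illustration of that point (the sample string is an arbitrary choice, not taken from the patch): for a UTF-8 string, `String#dump` turns each non-ASCII character into a `\u` escape, so the dumped form contains only ASCII characters.

```ruby
s = "caf\u00E9"             # "café", a UTF-8 string
dumped = s.dump             # non-ASCII characters become \uXXXX escapes
puts dumped                 # the dumped text, including the wrapping quotes
puts dumped.ascii_only?     # whether only ASCII characters remain
```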

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67913

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

* [ruby-core:83875] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (9 preceding siblings ...)
  2017-11-24  7:05 ` [ruby-core:83874] " tad.a.digger
@ 2017-11-24  8:04 ` duerst
  2017-11-25  8:20 ` [ruby-core:83880] " tad.a.digger
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: duerst @ 2017-11-24  8:04 UTC
  To: ruby-core

Issue #12275 has been updated by duerst (Martin Dürst).


tad (Tadashi Saito) wrote:

> > Also, please think about encodings. Some people may want all non-ASCII characters escaped, but others may not want that at all.
> 
> Unfortunately, I couldn't understand your concern.  I think we're discussing unescaping/undumping, not escaping.
> Note that `String#dump` already escapes all non-ASCII characters, so I'm trying to unescape them all with `undump`.

Thanks for your explanation. I was confused.

Still, there is the question of what the encoding of the result of `#unescape` should be.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67914

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

* [ruby-core:83880] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (10 preceding siblings ...)
  2017-11-24  8:04 ` [ruby-core:83875] " duerst
@ 2017-11-25  8:20 ` tad.a.digger
  2017-11-27 19:58 ` [ruby-core:83896] " tad.a.digger
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-11-25  8:20 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).


> Still, there is the question of what the encoding of the result of #unescape should be.

Indeed. It is one of the few things that I'm still worried about.

For now, `undump` inherits the receiver's encoding:

~~~ ruby
"abc".encode('euc-jp').undump.encoding #=> #<Encoding:EUC-JP>
~~~

But it may cause some inconvenient errors like:

~~~ ruby
utf8 = "\xE3\x81\x82".force_encoding('utf-8')
dumped = utf8.dump.encode('ascii') # we can treat dumped string as ASCII
dumped.valid_encoding? #=> always true, of course
dumped.undump #=> RangeError: 12354 out of char range
~~~

Basically, a `dump`-ed string may contain any code point, without information about the original encoding,
and this situation reminds me of `Integer#chr(encoding)`.
So I think `undump` may need an argument too, to specify the encoding of the result string.

(Of course `dumped.force_encoding('utf-8')` before `undump` solves this problem, but I feel it's a little redundant.)
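
The `Integer#chr(encoding)` analogy can be seen directly. This is only a minimal sketch, assuming `Encoding.default_internal` is unset (the usual default); 12354 is the code point U+3042 from the failing example above.

```ruby
code_point = 0x3042 # 12354, the code point from the RangeError above

# Without an encoding argument, a code point above 255 cannot be mapped
# to a character (assuming Encoding.default_internal is nil).
begin
  code_point.chr
rescue RangeError => e
  error_message = e.message
end

# With an explicit encoding, the same code point becomes a character.
char = code_point.chr(Encoding::UTF_8)
puts char.encoding
```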

Any thoughts about this?

Although this is another topic, I think that the name of this new method has been confirmed as
`#undump` (not `#unescape`) by @matz.  Please see https://bugs.ruby-lang.org/issues/12275#note-6
and below.  (I believe it's a good name because it clearly suggests its spec.)

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67919

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------

* [ruby-core:83896] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (11 preceding siblings ...)
  2017-11-25  8:20 ` [ruby-core:83880] " tad.a.digger
@ 2017-11-27 19:58 ` tad.a.digger
  2017-11-28  6:37 ` [ruby-core:83914] [Ruby trunk Feature#12275][Assigned] " tad.a.digger
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-11-27 19:58 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).

File benchmark.rb added
File v1.patch added

Sorry for the delay. I implemented `#undump` as `v1.patch`, based on my "string_undump" gem.
Please see https://github.com/ruby/ruby/pull/1765 also.

## Spec

Roughly speaking, my implementation follows the steps below:

1. If `self` is wrapped in double quotes, just ignore them
2. Parse `self` and build a new string by appending characters one at a time
   1. If an escaped character (beginning with a backslash) is found, unescape it and add it to the new string
   2. Otherwise, just add the character to the new string
3. Return the resulting string

Note that this method does not require the wrapping double quotes.  This helps
in cases like the one in the initial proposal, such as `"\\\t".undump` .

Supported escaping formats are:

* Backslash itself
  * \\\\
* Double quote after backslash
  * \" yields double quote itself
* One ASCII character after backslash
  * \n \r \t \f \v \b \a \e
* "u" after backslash (Unicode)
  * \uXXXX form
  * \u{XXXXX} form (number of hex digits is variable)
* "x" and two hex digits after backslash
  * \xXX form
* "#$", "#@" or "#{" after backslash
  * These are embedded-Ruby-variable-like strings

I was careful to cover all escaping cases in `String#dump` so that `s.dump.undump == s`
holds as often as possible.  Unfortunately, there are some limitations, described below.
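
For readers who prefer code to prose, the steps and escape formats listed above can be approximated in plain Ruby. This is only a hypothetical sketch, not the C code in `v1.patch`: the names `undump_sketch`, `SIMPLE_ESCAPES` and `ESCAPE` are invented for illustration, UTF-8 input is assumed, and error handling for malformed escapes is left out.

```ruby
# Hypothetical pure-Ruby reading of the steps above (NOT the C code in v1.patch).
# Assumes UTF-8 input; malformed escapes are simply passed through.
SIMPLE_ESCAPES = {
  'n' => "\n", 'r' => "\r", 't' => "\t", 'f' => "\f", 'v' => "\v",
  'b' => "\b", 'a' => "\a", 'e' => "\e", '\\' => '\\', '"' => '"'
}.freeze

# One escape: \u{...}, \uXXXX, \xXX, \#$ \#@ \#{, or a single escaped character.
ESCAPE = /\\(?:u\{(\h+)\}|u(\h{4})|x(\h{2})|(\#[$@{])|([nrtfvbae\\"]))/

def undump_sketch(str)
  # Step 1: ignore a pair of wrapping double quotes, if present.
  body = str.b
  body = body[1..-2] if body.length >= 2 && body.start_with?('"') && body.end_with?('"')

  # Step 2: replace each escape sequence; all other bytes pass through unchanged.
  result = body.gsub(ESCAPE) do
    braced, four, byte, embed, simple = Regexp.last_match.captures
    if braced || four
      [(braced || four).hex].pack('U').b # code point -> UTF-8 bytes
    elsif byte
      byte.hex.chr                       # one raw byte
    elsif embed
      embed                              # keep "#$", "#@" or "#{" literally
    else
      SIMPLE_ESCAPES.fetch(simple)
    end
  end

  # Step 3: return the new string, tagged with the receiver's encoding.
  result.force_encoding(str.encoding)
end

puts undump_sketch('"\u00FC"')
```

Working on a binary copy keeps consecutive `\xXX` escapes from being treated as separate characters; the bytes are only interpreted in the receiver's encoding at the very end.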

## Testing

I added some test cases in test/ruby/test_string.rb
https://github.com/ruby/ruby/pull/1765/files#diff-25eb856a893dbc53c562f6865b215083
and they pass, of course.

Other test cases, based on the original gem, also pass.
https://gist.github.com/tadd/634b6e4b09b6dfe7c8b97bca138d31ec

Furthermore, at this year's RubyKaigi, I learned about AFL (American Fuzzy Lop).
http://lcamtuf.coredump.cx/afl/
(I was fortunate to learn about it.  Thank you shyouhei!)

It can stress-test my implementation.  I checked my original gem (string_undump 0.1.0) with AFL 2.36b,
and confirmed that:

* It caused no SEGV during one night, over about 9 million executions
* It caused no round-trip error during one night, over about 10 million executions
  * `s == s.dump.undump` was always `true`
  * I ran it in a UTF-8 environment

## Performance

It may be a boring result, but I'll also mention performance.  With a really naive
benchmark, `undump` is about 9 times faster than `eval(string)`.
Try the attached `benchmark.rb` file, then feel free to experience Ruby 3x3x3 now...
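
The attached `benchmark.rb` is not reproduced in this thread, so the following is only a hypothetical stand-in for that kind of comparison. It assumes a Ruby build where `String#undump` is available (for example, with the patch applied); the sample string and iteration count are arbitrary, and no particular timing is implied.

```ruby
require 'benchmark'

# Hypothetical stand-in for the attached benchmark.rb (contents not shown here).
# Assumes String#undump is available in the running Ruby.
dumped = "hello\tworld \u3042 \"quoted\"".dump
n = 50_000

eval_time   = Benchmark.realtime { n.times { eval(dumped) } }
undump_time = Benchmark.realtime { n.times { dumped.undump } }

puts format('eval:   %.3fs', eval_time)
puts format('undump: %.3fs', undump_time)
```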

## Limitations

Sorry, some limitations exist in the current implementation.

* Can't undump a string in a non-ASCII-compatible encoding
  * `'"abc"'.encode('utf-16le').undump` raises `Encoding::CompatibilityError` for now
  * This is simply due to my lack of implementation knowledge.  Advice is welcome
* Can't correctly undump a dumped string that was produced from a string in a non-ASCII-compatible encoding
  * String#dump adds `.force_encoding("encoding name here")` at the end of the dumped string,
    but String#undump doesn't parse this.  Please check the code below:

~~~ ruby
s = 'abc'.encode('utf-16le')
puts s.dump #=> "a\x00b\x00c\x00".force_encoding("UTF-16LE")
s == s.dump.undump #=> false
~~~

  * I believe this is a rare case, and the method is convenient enough even as it stands
  * But of course, I will not commit the patch if this limitation is not acceptable

## Future work

* Improve support for non ASCII-compatible encodings (eliminate limitations above)
* Optimization for single-byte-optimizable string

## Conclusion

I implemented `#undump` to be the "someone" matz mentioned.  The code

* covers most practical cases that `dump` handles
* is reasonably safe from SEGV
* runs far faster than `eval()`

but some limitations still exist.

Any comments?


----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67942

* Author: asnow (Andrew Bolshov)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)



* [ruby-core:83914] [Ruby trunk Feature#12275][Assigned] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (12 preceding siblings ...)
  2017-11-27 19:58 ` [ruby-core:83896] " tad.a.digger
@ 2017-11-28  6:37 ` tad.a.digger
  2017-12-03 11:07 ` [ruby-core:84064] [Ruby trunk Feature#12275] " tad.a.digger
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-11-28  6:37 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).

Status changed from Open to Assigned
Assignee set to tad (Tadashi Saito)

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-67965

* Author: asnow (Andrew Bolshov)
* Status: Assigned
* Priority: Normal
* Assignee: tad (Tadashi Saito)
* Target version: 
----------------------------------------
I think it will be usefull to have function that convert input string as it was written in prime qouted string or in double qouted string. It's part of metaprogramming.
Example:

~~~ ruby
class String
  # Create new string like it will be writed in qoutes. Optional argument define type of qouting used: true - prime qoute, false - double qoute. Default is double qoute.
  def unescape prime = false
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Realized
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape

---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:84064] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (13 preceding siblings ...)
  2017-11-28  6:37 ` [ruby-core:83914] [Ruby trunk Feature#12275][Assigned] " tad.a.digger
@ 2017-12-03 11:07 ` tad.a.digger
  2017-12-09 18:05 ` [ruby-core:84144] " tad.a.digger
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-12-03 11:07 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).


A few days ago, I attended the Ruby developers' meeting.
We concluded that the implementation is immature, so I need to improve it in several respects before committing.

* The encoding of an `undump`ed string that included `\uXXXX` before `undump` should automatically be UTF-8
* The `"...".force_encoding("...")` form should be parsed
* `self` must be wrapped in double quotes
  * We need strict handling to clarify the spec

The improvements should be done within a week or so, and then I'll ask for a code review.
After that, I'll ask the 2.5 release manager, naruse, for approval to check it in.
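
The points above can be illustrated with a round trip through `String#dump` and `String#undump`. This is only a minimal sketch, assuming a Ruby version that ships `String#undump` (2.5 or later); the variable names `samples` and `round_trips` are made up for illustration and do not come from the patch.

```ruby
# Sample strings: plain ASCII, a control character, non-ASCII UTF-8
# (which dump writes as \uXXXX escapes), and a UTF-16LE string (which
# dump writes using the "...".force_encoding("...") form).
samples = ["plain", "tab\there", "\u3042\u3044", "abc".encode("UTF-16LE")]

# For each sample, check that undump restores both content and encoding.
round_trips = samples.map do |s|
  restored = s.dump.undump
  restored == s && restored.encoding == s.encoding
end

puts round_trips.all? ? "all samples round-trip" : "mismatch detected"
```

If every element of `round_trips` is true, the `\uXXXX` and `force_encoding` cases listed above behave as described on the Ruby being used.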

\# Yes, I have to hurry...!

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-68152

* Author: asnow (Andrew Bolshov)
* Status: Assigned
* Priority: Normal
* Assignee: tad (Tadashi Saito)
* Target version: 
----------------------------------------
I think it would be useful to have a method that converts an input string as if it had been written in a single-quoted or double-quoted string literal. This is useful for metaprogramming.
Example:

~~~ ruby
class String
  # Create a new string as if it were written in quotes. The optional argument defines the type of quoting used: true - single quote, false - double quote. Default is double quote.
  def unescape prime = false
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Realized
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape

---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:84144] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (14 preceding siblings ...)
  2017-12-03 11:07 ` [ruby-core:84064] [Ruby trunk Feature#12275] " tad.a.digger
@ 2017-12-09 18:05 ` tad.a.digger
  2017-12-13 20:44 ` [ruby-core:84242] " tad.a.digger
  2017-12-14  8:52 ` [ruby-core:84259] " tad.a.digger
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-12-09 18:05 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).

File benchmark2.rb added
File v2.patch added

I updated the patch as v2.patch to satisfy the 3 points mentioned in [note-15](https://bugs.ruby-lang.org/issues/12275#note-15).
(https://github.com/ruby/ruby/pull/1765 has also been updated.)

I also attached a simple benchmarking script, benchmark2.rb, to check the performance of the newly supported `"...".force_encoding("...")` form.

Can anyone review this patch?  Or @naruse, would you like to nominate somebody?

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-68246

* Author: asnow (Andrew Bolshov)
* Status: Assigned
* Priority: Normal
* Assignee: tad (Tadashi Saito)
* Target version: 
----------------------------------------
I think it would be useful to have a method that converts an input string as if it had been written in a single-quoted or double-quoted string literal. This is useful for metaprogramming.
Example:

~~~ ruby
class String
  # Create a new string as if it were written in quotes. The optional argument defines the type of quoting used: true - single quote, false - double quote. Default is double quote.
  def unescape prime = false
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Realized
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape

---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)
benchmark2.rb (315 Bytes)
v2.patch (12.1 KB)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:84242] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (15 preceding siblings ...)
  2017-12-09 18:05 ` [ruby-core:84144] " tad.a.digger
@ 2017-12-13 20:44 ` tad.a.digger
  2017-12-14  8:52 ` [ruby-core:84259] " tad.a.digger
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-12-13 20:44 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).

File v3.patch added

Thanks to shyouhei, mame, and especially naruse, I was able to brush up the patch.
v3.patch is attached.  It contains a variety of improvements.

Spec change:

* use RuntimeError instead of ArgumentError for an invalidly formed (self) string
  * since no arguments are given to this method... :(
* explicitly reject strings that contain:
  * non-ASCII characters
  * the NUL `\0` character
  * (note that `dump`ed strings contain neither of the above)

Bug fix:

* reject strings that contain a double quote inside the double quotes, like `'""""'`
* prevent compiler warnings/errors
  * cast explicitly from unsigned long to int
  * remove needless "const"
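
The rejection rules listed under "Spec change" and "Bug fix" can be probed with a small script. This is a sketch only, assuming the released `String#undump` (Ruby 2.5 or later) follows the behavior described for v3.patch; `undump_error` and `rejected` are hypothetical names introduced for illustration.

```ruby
# Hypothetical helper: returns the exception class raised by undump,
# or nil when the string is accepted.
def undump_error(str)
  str.undump
  nil
rescue RuntimeError => e
  e.class
end

# Inputs that the notes above say must be rejected.
rejected = {
  "not wrapped in double quotes" => 'abc',
  "double quote inside double quotes" => '""""',
  "non-ASCII character" => "\"\u3042\"",
  "NUL character" => "\"\0\"",
}

results = rejected.transform_values { |s| undump_error(s) }
results.each { |label, error| puts "#{label}: #{error.inspect}" }
```

Each entry is expected to report `RuntimeError` rather than `ArgumentError`, since the invalid input is the receiver and not an argument.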

Misc:

* fix style issues
* add more tests for invalid escaping
* remove needless logic
* adjust the unescaping expressions to match parse.y

I'll take a short nap, praying that AFL will not catch the worm...


----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-68371

* Author: asnow (Andrew Bolshov)
* Status: Assigned
* Priority: Normal
* Assignee: tad (Tadashi Saito)
* Target version: 
----------------------------------------
I think it would be useful to have a method that converts an input string as if it had been written in a single-quoted or double-quoted string literal. This is useful for metaprogramming.
Example:

~~~ ruby
class String
  # Create a new string as if it were written in quotes. The optional argument defines the type of quoting used: true - single quote, false - double quote. Default is double quote.
  def unescape prime = false
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Realized
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape

---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)
benchmark2.rb (315 Bytes)
v2.patch (12.1 KB)
v3.patch (12.9 KB)


-- 
https://bugs.ruby-lang.org/


* [ruby-core:84259] [Ruby trunk Feature#12275] String unescape
       [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
                   ` (16 preceding siblings ...)
  2017-12-13 20:44 ` [ruby-core:84242] " tad.a.digger
@ 2017-12-14  8:52 ` tad.a.digger
  17 siblings, 0 replies; 18+ messages in thread
From: tad.a.digger @ 2017-12-14  8:52 UTC
  To: ruby-core

Issue #12275 has been updated by tad (Tadashi Saito).


I committed this with the approval of @naruse: https://github.com/ruby/ruby/pull/1765#pullrequestreview-83409358
Thanks a lot.

----------------------------------------
Feature #12275: String unescape
https://bugs.ruby-lang.org/issues/12275#change-68409

* Author: asnow (Andrew Bolshov)
* Status: Closed
* Priority: Normal
* Assignee: tad (Tadashi Saito)
* Target version: 
----------------------------------------
I think it would be useful to have a method that converts an input string as if it had been written in a single-quoted or double-quoted string literal. This is useful for metaprogramming.
Example:

~~~ ruby
class String
  # Create a new string as if it were written in quotes. The optional argument defines the type of quoting used: true - single quote, false - double quote. Default is double quote.
  def unescape prime = false
    eval( prime ? "'#{self}'" : "\"#{self}\"" )
  end
end

"\\\t".unescape # => "\t"
~~~

Other requests:
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape
http://stackoverflow.com/questions/4265928/how-do-i-unescape-c-style-escape-sequences-from-ruby
http://stackoverflow.com/questions/8639642/best-way-to-escape-and-unescape-strings-in-ruby

Realized
http://www.rubydoc.info/github/ronin-ruby/ronin-support/String:unescape

---Files--------------------------------
benchmark.rb (193 Bytes)
v1.patch (8.95 KB)
benchmark2.rb (315 Bytes)
v2.patch (12.1 KB)
v3.patch (12.9 KB)


-- 
https://bugs.ruby-lang.org/


end of thread, other threads:[~2017-12-14  8:52 UTC | newest]

Thread overview: 18+ messages
     [not found] <redmine.issue-12275.20160412190303@ruby-lang.org>
2016-04-12 19:03 ` [ruby-core:74903] [Ruby trunk Feature#12275] String unescape asnow.dev
2016-04-12 19:04 ` [ruby-core:74904] " asnow.dev
2016-04-12 19:05 ` [ruby-core:74905] " asnow.dev
2016-04-20  4:01 ` [ruby-core:75040] [Ruby trunk Feature#12275][Feedback] " shyouhei
2016-05-12 15:43 ` [ruby-core:75474] [Ruby trunk Feature#12275] " asnow.dev
2016-05-13  6:44 ` [ruby-core:75483] [Ruby trunk Feature#12275][Open] " shyouhei
2016-07-19  6:00 ` [ruby-core:76411] [Ruby trunk Feature#12275] " matz
2017-11-18 13:27 ` [ruby-core:83816] " tad.a.digger
2017-11-19 10:11 ` [ruby-core:83822] " duerst
2017-11-24  7:05 ` [ruby-core:83874] " tad.a.digger
2017-11-24  8:04 ` [ruby-core:83875] " duerst
2017-11-25  8:20 ` [ruby-core:83880] " tad.a.digger
2017-11-27 19:58 ` [ruby-core:83896] " tad.a.digger
2017-11-28  6:37 ` [ruby-core:83914] [Ruby trunk Feature#12275][Assigned] " tad.a.digger
2017-12-03 11:07 ` [ruby-core:84064] [Ruby trunk Feature#12275] " tad.a.digger
2017-12-09 18:05 ` [ruby-core:84144] " tad.a.digger
2017-12-13 20:44 ` [ruby-core:84242] " tad.a.digger
2017-12-14  8:52 ` [ruby-core:84259] " tad.a.digger
