ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:22715] [Bug #1251] gsub problem
@ 2009-03-07  9:08 Alexander Pettelkau
  2009-03-07  9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
  2009-03-07  9:28 ` [ruby-core:22717] " Wolfgang Nádasi-Donner
  0 siblings, 2 replies; 9+ messages in thread
From: Alexander Pettelkau @ 2009-03-07  9:08 UTC
  To: ruby-core

Bug #1251: gsub problem
http://redmine.ruby-lang.org/issues/show/1251

Author: Alexander Pettelkau
Status: Open, Priority: Normal
Category: core, Target version: 1.9.1
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]

I wanted to replace "\" with "\\" in the string "\TEST":

s="\\TEST"
puts s    # Output --> "\TEST"
s.gsub!("\\","\\\\")
puts s    # Output --> "\TEST"
          # but EXPECTED Output "\\TEST"


----------------------------------------
http://redmine.ruby-lang.org

* [ruby-core:22716] Re: [Bug #1251] gsub problem
  2009-03-07  9:08 [ruby-core:22715] [Bug #1251] gsub problem Alexander Pettelkau
@ 2009-03-07  9:27 ` Yukihiro Matsumoto
  2009-03-07 12:00   ` [ruby-core:22719] " Wolfgang Nádasi-Donner
  2009-03-07  9:28 ` [ruby-core:22717] " Wolfgang Nádasi-Donner
  1 sibling, 1 reply; 9+ messages in thread
From: Yukihiro Matsumoto @ 2009-03-07  9:27 UTC
  To: ruby-core

Hi,

In message "Re: [ruby-core:22715] [Bug #1251] gsub problem"
    on Sat, 7 Mar 2009 18:08:11 +0900, Alexander Pettelkau <redmine@ruby-lang.org> writes:

|I wanted to replace "\" with "\\" in the string "\TEST":
|
|s="\\TEST"
|puts s    # Output --> "\TEST"
|s.gsub!("\\","\\\\")
|puts s    # Output --> "\TEST"
|          # but EXPECTED Output "\\TEST"

You specified four backslashes in double quotes, which makes two
backslashes in the string.  But the replacement string undergoes its own
backslash escape processing (for sequences such as \1), in which \\ (two
backslashes) is transformed into one backslash.  That means you've
substituted one backslash with one backslash.

To substitute one backslash into two, you have to do

  s.gsub!("\\","\\\\\\")

or

  s.gsub!(/\\/){"\\\\"}

							matz.
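
A minimal sketch of the two layers described above: the string literal
is parsed first, and the replacement string then gets a second pass of
backslash processing. The variable names are illustrative only, and the
comments describe the behaviour as I understand it rather than anything
verified against 1.9.1p0.

```ruby
s = "\\TEST"                       # one backslash followed by TEST

# "\\\\" is two backslashes after literal parsing; the replacement pass
# collapses them to one, so the string does not visibly change.
four_in_source = s.gsub("\\", "\\\\")

# "\\\\\\" is three backslashes: the first pair collapses to one and the
# trailing lone backslash is kept, giving two in the result.
six_in_source = s.gsub("\\", "\\\\\\")

# A block's return value is inserted verbatim, with no replacement pass.
block_form = s.gsub(/\\/) { "\\\\" }

puts four_in_source
puts six_in_source
puts block_form
```

The block form is usually the easier one to reason about, because only
the literal-parsing layer applies.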

* [ruby-core:22717] Re: [Bug #1251] gsub problem
  2009-03-07  9:08 [ruby-core:22715] [Bug #1251] gsub problem Alexander Pettelkau
  2009-03-07  9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
@ 2009-03-07  9:28 ` Wolfgang Nádasi-Donner
  1 sibling, 0 replies; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-07  9:28 UTC
  To: ruby-core

Alexander Pettelkau schrieb:
> Bug #1251: gsub problem
> http://redmine.ruby-lang.org/issues/show/1251
> 
> Author: Alexander Pettelkau
> Status: Open, Priority: Normal
> Category: core, Target version: 1.9.1
> ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]
> 
> I wanted to replace "\" with "\\" in the string "\TEST":
> 
> s="\\TEST"
> puts s    # Output --> "\TEST"
> s.gsub!("\\","\\\\")
> puts s    # Output --> "\TEST"
>           # but EXPECTED Output "\\TEST"
> 
After the first step, the string contains two backslashes. This string
is then interpreted a second time, because it can contain references to
matched groups (e.g. '\1'). This second interpretation sees an escaped
backslash (backslash-backslash), which results in one backslash.

I think this should be documented.

Wolfgang Nádasi-Donner

* [ruby-core:22719] Re: [Bug #1251] gsub problem
  2009-03-07  9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
@ 2009-03-07 12:00   ` Wolfgang Nádasi-Donner
  2009-03-07 17:51     ` [ruby-core:22722] " Alexander Pettelkau
  2009-03-13 10:47     ` [ruby-core:22878] " Yukihiro Matsumoto
  0 siblings, 2 replies; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-07 12:00 UTC
  To: ruby-core

Yukihiro Matsumoto schrieb:
> To substitute one backslash into two, you have to do
> 
>   s.gsub!("\\","\\\\\\")
...
myprompt> irb191-p0
irb(main):001:0> puts "a\\b".gsub!("\\","\\\\\\")
a\\b
=> nil
irb(main):002:0> puts "a\\b".gsub!("\\","\\\\\\\\")
a\\b
=> nil

I was surprised by this result long ago, until I started to assume that
the second replacement pass acts only on \<...>, \nr, and \\, and leaves
the backslash as it is in all other combinations (even at the end of the
string).

This is different from the first pass (the string literal), which always
consumes a backslash as an escape character...

myprompt> irb191-p0
irb(main):001:0> puts "\\\w"
\w
=> nil

I think this behaviour should be documented somewhere, because it can
really confuse people who do not use complex regexes in their daily
work.

Wolfgang Nádasi-Donner
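
A small sketch of the observation above, assuming current Ruby behaves
the same way: three and four backslashes in the replacement both end up
as two, an unknown escape such as \w is left alone by the replacement
pass, and a double-quoted literal drops the backslash of an unknown
escape. Building the replacement with "\\" * n is only a device to make
the backslash count explicit.

```ruby
three = "\\" * 3                   # three literal backslashes
four  = "\\" * 4                   # four literal backslashes

r3 = "a\\b".gsub("\\", three)      # pair collapses, trailing one is kept
r4 = "a\\b".gsub("\\", four)       # two pairs collapse to two backslashes

# The replacement pass leaves an unknown escape such as \w untouched.
unknown = "a_b".gsub("_", "\\w")

# The literal pass, by contrast, consumes the backslash before 'w'.
literal = "\\\w"

puts r3, r4, unknown, literal
```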

* [ruby-core:22722] [Bug #1251] gsub problem
  2009-03-07 12:00   ` [ruby-core:22719] " Wolfgang Nádasi-Donner
@ 2009-03-07 17:51     ` Alexander Pettelkau
  2009-03-13 10:47     ` [ruby-core:22878] " Yukihiro Matsumoto
  1 sibling, 0 replies; 9+ messages in thread
From: Alexander Pettelkau @ 2009-03-07 17:51 UTC
  To: ruby-core

Issue #1251 has been updated by Alexander Pettelkau.


Thanks a lot for clearing that up so fast!

Alexander Pettelkau
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1251

* [ruby-core:22878] Re: [Bug #1251] gsub problem
  2009-03-07 12:00   ` [ruby-core:22719] " Wolfgang Nádasi-Donner
  2009-03-07 17:51     ` [ruby-core:22722] " Alexander Pettelkau
@ 2009-03-13 10:47     ` Yukihiro Matsumoto
  2009-03-13 11:50       ` [ruby-core:22881] " Wolfgang Nádasi-Donner
  2009-03-13 16:32       ` [ruby-core:22883] " Eero Saynatkari
  1 sibling, 2 replies; 9+ messages in thread
From: Yukihiro Matsumoto @ 2009-03-13 10:47 UTC
  To: ruby-core

Hi,

In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
    on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:

|I think this behaviour should be documented somewhere, because it can
|really confuse people who do not use complex regexes in their daily
|work.

Agreed.  Any opinion for concrete description?  Anyone?

							matz.

* [ruby-core:22881] Re: [Bug #1251] gsub problem
  2009-03-13 10:47     ` [ruby-core:22878] " Yukihiro Matsumoto
@ 2009-03-13 11:50       ` Wolfgang Nádasi-Donner
  2009-03-13 13:45         ` [ruby-core:22882] " Stephen Bannasch
  2009-03-13 16:32       ` [ruby-core:22883] " Eero Saynatkari
  1 sibling, 1 reply; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-13 11:50 UTC
  To: ruby-core

Yukihiro Matsumoto schrieb:
> In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
>     on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:
> |I think this behaviour should be documented somewhere, because it can
> |really confuse people who do not use complex regexes in their daily
> |work.
> Agreed.  Any opinion for concrete description?  Anyone?
The text should describe the fact that the second parsing of the
replacement string replaces \\ with \; replaces \n (for n between 1 and
9) with the string matched by anonymous group n, or with an empty string
if that group doesn't exist; and replaces \<name> and \'name' with the
string matched by the named group.

But don't use my English. It may lead to more confusion.

Wolfgang Nádasi-Donner
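
The rules proposed above can be sketched as follows. This is only an
illustration with made-up strings; the named-group example uses the
\k<name> spelling, which is the form I know to work in current Ruby and
is an assumption relative to the \<name> notation in the message.

```ruby
# \1 and \2 refer to the numbered capture groups.
swapped = "John Smith".sub(/(\w+) (\w+)/, '\2, \1')

# A reference to a group that does not exist becomes an empty string.
missing = "abc".sub(/(b)/, '[\2]')

# Named groups are referenced with \k<name> in the replacement.
named = "2009-03".sub(/(?<y>\d+)-(?<m>\d+)/, '\k<m>/\k<y>')

# '\\\\1' is backslash, backslash, 1; the pair collapses to a single
# backslash, so the result holds a literal \1 rather than the capture.
escaped = "abc".sub(/(b)/, '\\\\1')

puts swapped, missing, named, escaped
```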

* [ruby-core:22882] [Bug #1251] gsub problem
  2009-03-13 11:50       ` [ruby-core:22881] " Wolfgang Nádasi-Donner
@ 2009-03-13 13:45         ` Stephen Bannasch
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen Bannasch @ 2009-03-13 13:45 UTC
  To: ruby-core

Issue #1251 has been updated by Stephen Bannasch.


This sequence helped me understand the issue better:

>> a = b = "1_2_3"
=> "1_2_3"
>> for i in 0..b.length do print "#{b[i]} " end
49 95 50 95 51  => 0..5
>> b = a.gsub('_', '\\')
=> "1\\2\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 50 92 51  => 0..5
>> b = a.gsub('_', '\\\\')
=> "1\\2\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 50 92 51  => 0..5
>> b = a.gsub('_', '\\\\\\')
=> "1\\\\2\\\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 92 50 92 92 51  => 0..7
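
The same experiment can be sketched more compactly with String#bytes,
assuming a Ruby where it returns an array of integers. The loop over
one to four backslashes is my own arrangement, not part of the session
above; byte 92 is the backslash.

```ruby
a = "1_2_3"

# For n = 1..4, replace each underscore with a replacement string made
# of n literal backslashes and record the bytes of the result.
results = (1..4).map do |n|
  [n, a.gsub("_", "\\" * n).bytes]
end

results.each do |n, bytes|
  puts "#{n} backslash(es) in replacement: #{bytes.join(' ')}"
end
```

One and two backslashes both leave a single backslash per underscore,
while three and four both leave two.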


----------------------------------------
http://redmine.ruby-lang.org/issues/show/1251

* [ruby-core:22883] Re: [Bug #1251] gsub problem
  2009-03-13 10:47     ` [ruby-core:22878] " Yukihiro Matsumoto
  2009-03-13 11:50       ` [ruby-core:22881] " Wolfgang Nádasi-Donner
@ 2009-03-13 16:32       ` Eero Saynatkari
  1 sibling, 0 replies; 9+ messages in thread
From: Eero Saynatkari @ 2009-03-13 16:32 UTC
  To: ruby-core@ruby-lang.org

Excerpts from Yukihiro Matsumoto's message of Fri Mar 13 12:47:48 +0200 2009:
> Hi,
> 
> In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
>     on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:
> 
> |I think this behaviour should be documented somewhere, because it can
> |really confuse people who do not use complex regexes in their daily
> |work.
> 
> Agreed.  Any opinion for concrete description?  Anyone?

RubySpec has this to say (please add any clarifications and
missing behaviour--I am sure there are some 1.9.1 cases at
least):

ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]

String#sub with pattern, replacement
- returns a copy of self with all occurrences of pattern replaced with replacement
- ignores a block if supplied
- supports \G which matches at the beginning of the string
- supports /i for ignoring case
- doesn't interpret regexp metacharacters if pattern is a string
- replaces \1 sequences with the regexp's corresponding capture
- treats \1 sequences without corresponding captures as empty strings
- replaces \& and \0 with the complete match
- replaces \` with everything before the current match
- replaces \' with everything after the current match
- replaces \\\+ with \\+
- replaces \+ with the last paren that actually matched
- treats \+ as an empty string if there was no captures
- maps \\ in replacement to \
- leaves unknown \x escapes in replacement untouched
- leaves \ at the end of replacement untouched
- taints the result if the original string or replacement is tainted
- tries to convert pattern to a string using to_str
- raises a TypeError when pattern can't be converted to a string
- tries to convert replacement to a string using to_str
- raises a TypeError when replacement can't be converted to a string
- returns subclass instances when called on a subclass
- sets $~ to MatchData of match and nil when there's none
- replaces \\1 with \1
- replaces \\1 with \1
- replaces \\\1 with \
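
A few of the special sequences listed above, sketched with made-up
inputs. This reflects my understanding of current Ruby and was not run
against the 1.8.7 build quoted here.

```ruby
# \0 and \& both stand for the complete match.
zero = "hello".gsub(/l/, '<\0>')
amp  = "hello".sub(/l+/, '[\&]')

# \` is everything before the match, \' everything after it.
pre  = "abc".sub(/b/, '\`')
post = "abc".sub(/b/, "\\'")

# \+ is the last group that actually matched; (x)? does not match here,
# so group 1 is used.
last = "abc".sub(/(b)(x)?/, '<\+>')

puts zero, amp, pre, post, last
```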

String#sub with pattern and block
- returns a copy of self with the first occurrences of pattern replaced with the block's return value
- sets $~ for access from the block
- restores $~ after leaving the block
- sets $~ to MatchData of last match and nil when there's none for access from outside
- doesn't raise a RuntimeError if the string is modified while substituting
- doesn't interpolate special sequences like \1 for the block's return value
- converts the block's return value to a string using to_s
- taints the result if the original string or replacement is tainted

String#sub! with pattern, replacement
- modifies self in place and returns self
- taints self if replacement is tainted
- returns nil if no modifications were made
- raises a TypeError when self is frozen

String#sub! with pattern and block
- modifies self in place and returns self
- taints self if block's result is tainted
- returns nil if no modifications were made
- raises a RuntimeError if the string is modified while substituting
- raises a RuntimeError when self is frozen

String#gsub with pattern and replacement
- doesn't freak out when replacing ^
- returns a copy of self with all occurrences of pattern replaced with replacement
- ignores a block if supplied
- supports \G which matches at the beginning of the remaining (non-matched) string
- supports /i for ignoring case
- doesn't interpret regexp metacharacters if pattern is a string
- replaces \1 sequences with the regexp's corresponding capture
- treats \1 sequences without corresponding captures as empty strings
- replaces \& and \0 with the complete match
- replaces \` with everything before the current match
- replaces \' with everything after the current match
- replaces \+ with the last paren that actually matched
- treats \+ as an empty string if there was no captures
- maps \\ in replacement to \
- leaves unknown \x escapes in replacement untouched
- leaves \ at the end of replacement untouched
- taints the result if the original string or replacement is tainted
- tries to convert pattern to a string using to_str
- raises a TypeError when pattern can't be converted to a string
- tries to convert replacement to a string using to_str
- raises a TypeError when replacement can't be converted to a string
- returns subclass instances when called on a subclass
- sets $~ to MatchData of last match and nil when there's none

String#gsub with pattern and block
- returns a copy of self with all occurrences of pattern replaced with the block's return value
- sets $~ for access from the block
- restores $~ after leaving the block
- sets $~ to MatchData of last match and nil when there's none for access from outside
- raises a RuntimeError if the string is modified while substituting
- doesn't interpolate special sequences like \1 for the block's return value
- converts the block's return value to a string using to_s
- taints the result if the original string or replacement is tainted
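
The block behaviour listed above can be sketched like this, assuming
current Ruby; the inputs are illustrative only.

```ruby
# The block's return value is inserted as-is: \1 is not interpolated.
verbatim = "abc".gsub(/(b)/) { '\1' }

# Inside the block, $~ (and therefore $1) refers to the current match.
upcased = "abc".gsub(/(b)/) { $1.upcase }

# A non-string return value is converted with to_s.
bumped = "a1b".gsub(/\d/) { |digit| digit.to_i + 1 }

puts verbatim, upcased, bumped
```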

String#gsub! with pattern and replacement
- modifies self in place and returns self
- taints self if replacement is tainted
- returns nil if no modifications were made
- raises a TypeError when self is frozen

String#gsub! with pattern and block
- modifies self in place and returns self
- taints self if block's result is tainted
- returns nil if no modifications were made
- raises a RuntimeError when self is frozen
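
A short sketch of the in-place variants, assuming current Ruby. The
exact exception class for a frozen receiver has differed between Ruby
versions (the list above shows both TypeError and RuntimeError), so the
sketch only rescues StandardError and reports whatever class was raised.

```ruby
s = "hello".dup                    # dup so the receiver is mutable

changed   = s.gsub!(/l/, "L")      # modifies s and returns s itself
unchanged = s.gsub!(/z/, "Z")      # no match: returns nil, s untouched

# Calling gsub! on a frozen string raises; capture the exception.
frozen_error =
  begin
    "hello".dup.freeze.gsub!(/l/, "L")
    nil
  rescue StandardError => e
    e
  end

puts s
puts unchanged.inspect
puts frozen_error.class
```

Because gsub! returns nil when nothing changed, chaining further calls
onto its result is risky; gsub is the safer choice in a chain.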


Finished in 0.030081 seconds

2 files, 82 examples, 251 expectations, 0 failures, 0 errors


--
Magic is insufficiently advanced technology.

end of thread, other threads:[~2009-03-13 16:41 UTC | newest]

Thread overview: 9+ messages
2009-03-07  9:08 [ruby-core:22715] [Bug #1251] gsub problem Alexander Pettelkau
2009-03-07  9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
2009-03-07 12:00   ` [ruby-core:22719] " Wolfgang Nádasi-Donner
2009-03-07 17:51     ` [ruby-core:22722] " Alexander Pettelkau
2009-03-13 10:47     ` [ruby-core:22878] " Yukihiro Matsumoto
2009-03-13 11:50       ` [ruby-core:22881] " Wolfgang Nádasi-Donner
2009-03-13 13:45         ` [ruby-core:22882] " Stephen Bannasch
2009-03-13 16:32       ` [ruby-core:22883] " Eero Saynatkari
2009-03-07  9:28 ` [ruby-core:22717] " Wolfgang Nádasi-Donner
