* [ruby-core:22715] [Bug #1251] gsub problem
@ 2009-03-07 9:08 Alexander Pettelkau
2009-03-07 9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
2009-03-07 9:28 ` [ruby-core:22717] " Wolfgang Nádasi-Donner
0 siblings, 2 replies; 9+ messages in thread
From: Alexander Pettelkau @ 2009-03-07 9:08 UTC
To: ruby-core
Bug #1251: gsub problem
http://redmine.ruby-lang.org/issues/show/1251
Author: Alexander Pettelkau
Status: Open, Priority: Normal
Category: core, Target version: 1.9.1
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]
I wanted to replace "\" with "\\" in the string "\TEST":
s="\\TEST"
puts s # Output --> "\TEST"
s.gsub!("\\","\\\\")
puts s # Output --> "\TEST"
# but EXPECTED Output "\\TEST"
----------------------------------------
http://redmine.ruby-lang.org
* [ruby-core:22716] Re: [Bug #1251] gsub problem
2009-03-07 9:08 [ruby-core:22715] [Bug #1251] gsub problem Alexander Pettelkau
@ 2009-03-07 9:27 ` Yukihiro Matsumoto
2009-03-07 12:00 ` [ruby-core:22719] " Wolfgang Nádasi-Donner
2009-03-07 9:28 ` [ruby-core:22717] " Wolfgang Nádasi-Donner
1 sibling, 1 reply; 9+ messages in thread
From: Yukihiro Matsumoto @ 2009-03-07 9:27 UTC
To: ruby-core
Hi,
In message "Re: [ruby-core:22715] [Bug #1251] gsub problem"
on Sat, 7 Mar 2009 18:08:11 +0900, Alexander Pettelkau <redmine@ruby-lang.org> writes:
|I wanted to replace "\" with "\\" in the string "\TEST":
|
|s="\\TEST"
|puts s # Output --> "\TEST"
|s.gsub!("\\","\\\\")
|puts s # Output --> "\TEST"
| # but EXPECTED Output "\\TEST"
You specified four backslashes in double quotes, which is two
backslashes in the string. But the replacement string is itself subject
to backslash escape processing (for sequences such as \1), so \\ (two
backslashes) is turned into one backslash. That means you have replaced
one backslash with one backslash.
To substitute one backslash into two, you have to do
s.gsub!("\\","\\\\\\")
or
s.gsub!(/\\/){"\\\\"}
matz.
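The two levels of escaping described above can be reproduced with a
short script. This is only a sketch: the variable names four, six and
block are illustrative and do not come from the thread, and the
non-destructive gsub is used so that each call starts from the same
string.

```ruby
s = "\\TEST"                      # one backslash followed by TEST

# Literal with 4 backslashes -> replacement string holds 2 -> gsub emits 1.
four  = s.gsub("\\", "\\\\")
# Literal with 6 backslashes -> replacement holds 3 -> gsub emits 2
# (the pair collapses to one, the trailing one is kept as is).
six   = s.gsub("\\", "\\\\\\")
# A block's return value is inserted verbatim, so 2 stay 2.
block = s.gsub(/\\/) { "\\\\" }

puts four   # \TEST
puts six    # \\TEST
puts block  # \\TEST
```

The block form is usually the easier one to read, because only the
string literal's own escaping has to be counted.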
* [ruby-core:22717] Re: [Bug #1251] gsub problem
2009-03-07 9:08 [ruby-core:22715] [Bug #1251] gsub problem Alexander Pettelkau
2009-03-07 9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
@ 2009-03-07 9:28 ` Wolfgang Nádasi-Donner
1 sibling, 0 replies; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-07 9:28 UTC
To: ruby-core
Alexander Pettelkau schrieb:
> Bug #1251: gsub problem
> http://redmine.ruby-lang.org/issues/show/1251
>
> Author: Alexander Pettelkau
> Status: Open, Priority: Normal
> Category: core, Target version: 1.9.1
> ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]
>
> I wanted to replace "\" with "\\" in the string "\TEST":
>
> s="\\TEST"
> puts s # Output --> "\TEST"
> s.gsub!("\\","\\\\")
> puts s # Output --> "\TEST"
> # but EXPECTED Output "\\TEST"
>
>
> ----------------------------------------
> http://redmine.ruby-lang.org
>
>
After the first step (parsing the string literal), the replacement
String contains two backslashes. This string is then interpreted a
second time, because it may contain references to matched groups
(e.g. '\1'). This second interpretation sees an escaped backslash
(backslash-backslash), which results in one backslash.
I think it should be documented.
Wolfgang Nádasi-Donner
* [ruby-core:22719] Re: [Bug #1251] gsub problem
2009-03-07 9:27 ` [ruby-core:22716] " Yukihiro Matsumoto
@ 2009-03-07 12:00 ` Wolfgang Nádasi-Donner
2009-03-07 17:51 ` [ruby-core:22722] " Alexander Pettelkau
2009-03-13 10:47 ` [ruby-core:22878] " Yukihiro Matsumoto
0 siblings, 2 replies; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-07 12:00 UTC
To: ruby-core
Yukihiro Matsumoto schrieb:
> To substitute one backslash into two, you have to do
>
> s.gsub!("\\","\\\\\\")
...
myprompt> irb191-p0
irb(main):001:0> puts "a\\b".gsub!("\\","\\\\\\")
a\\b
=> nil
irb(main):002:0> puts "a\\b".gsub!("\\","\\\\\\\\")
a\\b
=> nil
I was surprised by this result long ago, until I started to assume that
the second replacement only acts on \<...>, \nr and \\, and leaves the
backslash as it is in all other combinations (even at the end of the
string).
This is different from the first replacement, which always consumes a
backslash as an escape character...
myprompt> irb191-p0
irb(main):001:0> puts "\\\w"
\w
=> nil
I think this behaviour should be documented somewhere, because it can
really confuse people who do not use complex RegExes in their daily
work.
Wolfgang Nádasi-Donner
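The difference between the two passes can be checked directly. The
sketch below uses hypothetical names (lone, unknown, escaped, literal)
and a "-" pattern chosen only for illustration; it assumes the
behaviour described in this message, namely that the replacement pass
leaves unrecognised sequences and a trailing backslash alone, while the
double-quoted literal parser drops the backslash of an unrecognised
escape.

```ruby
# Replacement pass: only known sequences are rewritten.
lone    = "a-b".gsub("-", "\\")     # 1 backslash at the end: kept as is
unknown = "a-b".gsub("-", "\\w")    # \w is not a known sequence: kept as is
escaped = "a-b".gsub("-", "\\\\")   # \\ is known: collapses to 1 backslash

# Literal pass: \\ becomes \ and the unrecognised escape \w becomes w.
literal = "\\\w"

puts lone     # a\b
puts unknown  # a\wb
puts escaped  # a\b
puts literal  # \w
```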
* [ruby-core:22722] [Bug #1251] gsub problem
2009-03-07 12:00 ` [ruby-core:22719] " Wolfgang Nádasi-Donner
@ 2009-03-07 17:51 ` Alexander Pettelkau
2009-03-13 10:47 ` [ruby-core:22878] " Yukihiro Matsumoto
1 sibling, 0 replies; 9+ messages in thread
From: Alexander Pettelkau @ 2009-03-07 17:51 UTC
To: ruby-core
Issue #1251 has been updated by Alexander Pettelkau.
Thanks a lot for clearing that up so fast!
Alexander Pettelkau
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1251
* [ruby-core:22878] Re: [Bug #1251] gsub problem
2009-03-07 12:00 ` [ruby-core:22719] " Wolfgang Nádasi-Donner
2009-03-07 17:51 ` [ruby-core:22722] " Alexander Pettelkau
@ 2009-03-13 10:47 ` Yukihiro Matsumoto
2009-03-13 11:50 ` [ruby-core:22881] " Wolfgang Nádasi-Donner
2009-03-13 16:32 ` [ruby-core:22883] " Eero Saynatkari
1 sibling, 2 replies; 9+ messages in thread
From: Yukihiro Matsumoto @ 2009-03-13 10:47 UTC
To: ruby-core
Hi,
In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:
|I think this behaviour should be documented somewhere, because it can
|really confuse people who do not use complex RegExes in their daily
|work.
Agreed. Any opinion for concrete description? Anyone?
matz.
* [ruby-core:22881] Re: [Bug #1251] gsub problem
2009-03-13 10:47 ` [ruby-core:22878] " Yukihiro Matsumoto
@ 2009-03-13 11:50 ` Wolfgang Nádasi-Donner
2009-03-13 13:45 ` [ruby-core:22882] " Stephen Bannasch
2009-03-13 16:32 ` [ruby-core:22883] " Eero Saynatkari
1 sibling, 1 reply; 9+ messages in thread
From: Wolfgang Nádasi-Donner @ 2009-03-13 11:50 UTC
To: ruby-core
Yukihiro Matsumoto schrieb:
> In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
> on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:
> |I think this behaviour should be documented somewhere, because it can
> |really confuse people who do not use complex RegExes in their daily
> |work.
> Agreed. Any opinion for concrete description? Anyone?
The text should describe the fact that the second parsing of the
replacement string replaces \\ with \, replaces \n (with n between 1
and 9) with the string matched by anonymous group n, or with the empty
string if that group doesn't exist, and replaces \<name> or \'name'
with the named group.
But don't use my English. It may lead to more confusion.
Wolfgang Nádasi-Donner
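As a sketch of the rules proposed above, the following shows numbered
references, a reference to a group that does not exist, an escaped
backslash in front of a digit, and a named reference. The date string
and the variable names are made up for illustration. One caveat: as far
as I know, current Ruby spells the named form \k<name> in the
replacement string, which is what the example uses.

```ruby
date = "2009-03-07"

# \1..\9 insert the text matched by the corresponding anonymous group.
numbered = date.gsub(/(\d+)-(\d+)-(\d+)/, '\3.\2.\1')

# Named groups are referenced as \k<name>.
named = date.gsub(/(?<y>\d+)-(?<m>\d+)-(?<d>\d+)/, '\k<d>.\k<m>.\k<y>')

# A reference to a group that does not exist becomes the empty string.
missing = "abc".gsub(/(b)/, '[\2]')

# \\ followed by 1 yields a literal backslash and a literal 1.
escaped = "abc".gsub(/(b)/, '\\\\1')

puts numbered  # 07.03.2009
puts named     # 07.03.2009
puts missing   # a[]c
puts escaped   # a\1c
```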
* [ruby-core:22882] [Bug #1251] gsub problem
2009-03-13 11:50 ` [ruby-core:22881] " Wolfgang Nádasi-Donner
@ 2009-03-13 13:45 ` Stephen Bannasch
0 siblings, 0 replies; 9+ messages in thread
From: Stephen Bannasch @ 2009-03-13 13:45 UTC
To: ruby-core
Issue #1251 has been updated by Stephen Bannasch.
This sequence helped me understand the issue better:
>> a = b = "1_2_3"
=> "1_2_3"
>> for i in 0..b.length do print "#{b[i]} " end
49 95 50 95 51 => 0..5
>> b = a.gsub('_', '\\')
=> "1\\2\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 50 92 51 => 0..5
>> b = a.gsub('_', '\\\\')
=> "1\\2\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 50 92 51 => 0..5
>> b = a.gsub('_', '\\\\\\')
=> "1\\\\2\\\\3"
>> for i in 0..b.length do print "#{b[i]} " end
49 92 92 50 92 92 51 => 0..7
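The session above relies on String#[] returning a character code, which
is the Ruby 1.8 behaviour. On Ruby 1.9 and later the same experiment
can be written with String#bytes; the following is a sketch under that
assumption, with the names one, two and three introduced only for
illustration.

```ruby
a = "1_2_3"

one   = a.gsub('_', '\\')       # replacement holds 1 backslash
two   = a.gsub('_', '\\\\')     # replacement holds 2 backslashes
three = a.gsub('_', '\\\\\\')   # replacement holds 3 backslashes

# 92 is the byte value of a backslash.
p one.bytes    # [49, 92, 50, 92, 51]
p two.bytes    # [49, 92, 50, 92, 51]
p three.bytes  # [49, 92, 92, 50, 92, 92, 51]
```

One and two backslashes in the replacement produce the same result;
only the third one adds a second backslash to the output.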
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1251
* [ruby-core:22883] Re: [Bug #1251] gsub problem
2009-03-13 10:47 ` [ruby-core:22878] " Yukihiro Matsumoto
2009-03-13 11:50 ` [ruby-core:22881] " Wolfgang Nádasi-Donner
@ 2009-03-13 16:32 ` Eero Saynatkari
1 sibling, 0 replies; 9+ messages in thread
From: Eero Saynatkari @ 2009-03-13 16:32 UTC
To: ruby-core@ruby-lang.org
Excerpts from Yukihiro Matsumoto's message of Fri Mar 13 12:47:48 +0200 2009:
> Hi,
>
> In message "Re: [ruby-core:22719] Re: [Bug #1251] gsub problem"
> on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> writes:
>
> |I think this behaviour should be documented somewhere, because it can
> |really confuse people who do not use complex RegExes in their daily
> |work.
>
> Agreed. Any opinion for concrete description? Anyone?
RubySpec has this to say (please add any clarifications and
missing behaviour--I am sure there are some 1.9.1 cases at
least):
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]
String#sub with pattern, replacement
- returns a copy of self with all occurrences of pattern replaced with replacement
- ignores a block if supplied
- supports \G which matches at the beginning of the string
- supports /i for ignoring case
- doesn't interpret regexp metacharacters if pattern is a string
- replaces \1 sequences with the regexp's corresponding capture
- treats \1 sequences without corresponding captures as empty strings
- replaces \& and \0 with the complete match
- replaces \` with everything before the current match
- replaces \' with everything after the current match
- replaces \\\+ with \\+
- replaces \+ with the last paren that actually matched
- treats \+ as an empty string if there was no captures
- maps \\ in replacement to \
- leaves unknown \x escapes in replacement untouched
- leaves \ at the end of replacement untouched
- taints the result if the original string or replacement is tainted
- tries to convert pattern to a string using to_str
- raises a TypeError when pattern can't be converted to a string
- tries to convert replacement to a string using to_str
- raises a TypeError when replacement can't be converted to a string
- returns subclass instances when called on a subclass
- sets $~ to MatchData of match and nil when there's none
- replaces \\1 with \1
- replaces \\1 with \1
- replaces \\\1 with \
String#sub with pattern and block
- returns a copy of self with the first occurrences of pattern replaced with the block's return value
- sets $~ for access from the block
- restores $~ after leaving the block
- sets $~ to MatchData of last match and nil when there's none for access from outside
- doesn't raise a RuntimeError if the string is modified while substituting
- doesn't interpolate special sequences like \1 for the block's return value
- converts the block's return value to a string using to_s
- taints the result if the original string or replacement is tainted
String#sub! with pattern, replacement
- modifies self in place and returns self
- taints self if replacement is tainted
- returns nil if no modifications were made
- raises a TypeError when self is frozen
String#sub! with pattern and block
- modifies self in place and returns self
- taints self if block's result is tainted
- returns nil if no modifications were made
- raises a RuntimeError if the string is modified while substituting
- raises a RuntimeError when self is frozen
String#gsub with pattern and replacement
- doesn't freak out when replacing ^
- returns a copy of self with all occurrences of pattern replaced with replacement
- ignores a block if supplied
- supports \G which matches at the beginning of the remaining (non-matched) string
- supports /i for ignoring case
- doesn't interpret regexp metacharacters if pattern is a string
- replaces \1 sequences with the regexp's corresponding capture
- treats \1 sequences without corresponding captures as empty strings
- replaces \& and \0 with the complete match
- replaces \` with everything before the current match
- replaces \' with everything after the current match
- replaces \+ with the last paren that actually matched
- treats \+ as an empty string if there was no captures
- maps \\ in replacement to \
- leaves unknown \x escapes in replacement untouched
- leaves \ at the end of replacement untouched
- taints the result if the original string or replacement is tainted
- tries to convert pattern to a string using to_str
- raises a TypeError when pattern can't be converted to a string
- tries to convert replacement to a string using to_str
- raises a TypeError when replacement can't be converted to a string
- returns subclass instances when called on a subclass
- sets $~ to MatchData of last match and nil when there's none
String#gsub with pattern and block
- returns a copy of self with all occurrences of pattern replaced with the block's return value
- sets $~ for access from the block
- restores $~ after leaving the block
- sets $~ to MatchData of last match and nil when there's none for access from outside
- raises a RuntimeError if the string is modified while substituting
- doesn't interpolate special sequences like \1 for the block's return value
- converts the block's return value to a string using to_s
- taints the result if the original string or replacement is tainted
String#gsub! with pattern and replacement
- modifies self in place and returns self
- taints self if replacement is tainted
- returns nil if no modifications were made
- raises a TypeError when self is frozen
String#gsub! with pattern and block
- modifies self in place and returns self
- taints self if block's result is tainted
- returns nil if no modifications were made
- raises a RuntimeError when self is frozen
Finished in 0.030081 seconds
2 files, 82 examples, 251 expectations, 0 failures, 0 errors
--
Magic is insufficiently advanced technology.
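A few of the behaviours in the RubySpec listing above can be tried out
as follows. This is a sketch rather than an excerpt from RubySpec: the
input string and the variable names are invented, and only sequences I
am confident are handled by current Ruby are shown (the \+ case is left
out on purpose).

```ruby
s = "hello"

whole = s.sub(/l+/, '<\0>')    # \0 is the complete match
amp   = s.sub(/l+/, '<\&>')    # \& is the complete match as well
pre   = s.sub(/l+/, '<\`>')    # \` is everything before the match
post  = s.sub(/l+/, "<\\'>")   # \' is everything after the match
blk   = s.sub(/l+/) { '\0' }   # no interpolation of a block's result

puts whole  # he<ll>o
puts amp    # he<ll>o
puts pre    # he<he>o
puts post   # he<o>o
puts blk    # he\0o
```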