[ruby-core:65030] [ruby-trunk - Bug #10239] [Open] Regexp.quote() and default encoding

ruby-core@ruby-lang.org archive (unofficial mirror)

* [ruby-core:65030] [ruby-trunk - Bug #10239] [Open] Regexp.quote() and default encoding
       [not found] <redmine.issue-10239.20140914095455@ruby-lang.org>
@ 2014-09-14  9:54 ` shevegen
  2014-10-29  8:39 ` [ruby-core:65986] [ruby-trunk - Bug #10239] [Assigned] " nagachika00
  1 sibling, 0 replies; 2+ messages in thread
From: shevegen @ 2014-09-14  9:54 UTC
  To: ruby-core

Issue #10239 has been reported by Robert A. Heiler.

----------------------------------------
Bug #10239: Regexp.quote() and default encoding
https://bugs.ruby-lang.org/issues/10239

* Author: Robert A. Heiler
* Status: Open
* Priority: Low
* Assignee: 
* Category: 
* Target version: 
* ruby -v: ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
Hello,

I am not sure if this is a bug, or unexpected behaviour (for me).

I will simply report it; I am sure you will know whether and how to
handle it.

If it is not a bug, I believe it should at least be mentioned in the
official documentation.

The situation is that I have several strings with mixed encodings.

Some are automatically UTF-8, some US-ASCII, and others
ASCII-8BIT.

Unfortunately, these all occur in the same project, and I have no way
to change that (some of the strings come to me from the outside
world).

I noticed that Regexp.quote() changes the encoding of the string
in question.

Here is proof for Regexp.quote() changing the encoding, where
x is my test variable - a string:

  x = "abc"; x.encoding # => #<Encoding:US-ASCII>

  x.encode!('ASCII-8BIT'); x.encoding # => #<Encoding:ASCII-8BIT>

Ok, all works fine: it defaulted to US-ASCII but is now
ASCII-8BIT.

Next:

  test = Regexp.quote(x); test.encoding # => #<Encoding:US-ASCII>

Suddenly the new string that is returned has another encoding.
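
The steps above can be collected into one script. This is a minimal sketch: it tags the string with force_encoding instead of relying on the default encoding, so the starting point does not depend on the locale or the source encoding. The variable name quoted is only illustrative. On ruby 2.1.2 the report above saw US-ASCII for the returned string; the script prints whatever your own Ruby returns.

```ruby
# Build an ASCII-only string and tag it explicitly as binary, so the
# starting encoding does not depend on the locale or source encoding.
x = "abc".dup.force_encoding(Encoding::ASCII_8BIT)

# Regexp.quote returns a new String; the argument itself is not modified.
quoted = Regexp.quote(x)

puts "argument: #{x.encoding.name}"
puts "returned: #{quoted.encoding.name}"  # reported as US-ASCII in this thread
puts "same object? #{x.equal?(quoted)}"
```

Note that only the returned string carries the different encoding; x keeps the encoding it had before the call.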

I looked at the documentation:

  http://www.ruby-doc.org/core-2.1.2/Regexp.html#method-c-quote

But there is no mention that this method would return a new
String object with a different encoding.

I would have expected it to not change the encoding of the
argument-string object there.

Perhaps the documentation could mention that it will ignore
the original encoding of the string given?
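
Until this is settled, one possible workaround is to re-tag the quoted string with the encoding of the argument. The helper below is hypothetical (quote_keeping_encoding is not part of Ruby) and assumes the argument uses an ASCII-compatible encoding, so the backslashes added by Regexp.quote remain valid bytes.

```ruby
# Hypothetical helper (not part of Ruby): quote a string for use in a
# regexp, then label the result with the argument's encoding.
def quote_keeping_encoding(str)
  quoted = Regexp.quote(str)
  # force_encoding only relabels the bytes; it does not transcode them.
  quoted.force_encoding(str.encoding)
end

binary = "a.b".dup.force_encoding(Encoding::ASCII_8BIT)
result = quote_keeping_encoding(binary)

puts result                              # a\.b
puts result.encoding == binary.encoding  # true
```

Since force_encoding changes only the label, the helper never alters the quoted bytes themselves.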

-- 
https://bugs.ruby-lang.org/


* [ruby-core:65986] [ruby-trunk - Bug #10239] [Assigned] Regexp.quote() and default encoding
       [not found] <redmine.issue-10239.20140914095455@ruby-lang.org>
  2014-09-14  9:54 ` [ruby-core:65030] [ruby-trunk - Bug #10239] [Open] Regexp.quote() and default encoding shevegen
@ 2014-10-29  8:39 ` nagachika00
  1 sibling, 0 replies; 2+ messages in thread
From: nagachika00 @ 2014-10-29  8:39 UTC
  To: ruby-core

Issue #10239 has been updated by Tomoyuki Chikanaga.

Category set to doc
Status changed from Open to Assigned
Assignee set to Zachary Scott
Target version set to current: 2.2.0

I think this is intended behavior.

----------------------------------------
Bug #10239: Regexp.quote() and default encoding
https://bugs.ruby-lang.org/issues/10239#change-49719

* Author: Robert A. Heiler
* Status: Assigned
* Priority: Low
* Assignee: Zachary Scott
* Category: doc
* Target version: current: 2.2.0
* ruby -v: ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
