ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:92842] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
@ 2019-05-25 17:40 ` michael
  2019-05-25 22:25 ` [ruby-core:92843] " shevegen
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: michael @ 2019-05-25 17:40 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been reported by grosser (Michael Grosser).

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876

* Author: grosser (Michael Grosser)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 2.6.3
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
I ran into strange-looking test output when I compared .to_s with an expected text, saying that the encoding was different, which is confusing/annoying, especially to users that don't know how encodings work in Ruby.
1.to_s.encoding should be the same as "".encoding.



-- 
https://bugs.ruby-lang.org/


* [ruby-core:92843] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
  2019-05-25 17:40 ` [ruby-core:92842] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal michael
@ 2019-05-25 22:25 ` shevegen
  2019-05-26  0:34 ` [ruby-core:92845] " mame
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: shevegen @ 2019-05-25 22:25 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by shevegen (Robert A. Heiler).


> which is confusing/annoying especially to users that don't know
> how encodings work in ruby.

I personally finally switched to UTF-8 (oddly enough, primarily because of emoji and
Unicode symbols, which can be used for simple indicators both on the command line and
on the web), but I think one problem (for me) in moving from Ruby 1.8.x to later
versions was that there was not much documentation available.

Judging from your comment, encodings may still pose a problem for some Ruby users (or
potentially new Ruby users).

Some time ago, I think, Jeremy Evans wrote a document about symbols, which was added
(my apologies if I misremember). If anyone feels like writing a document about
encodings in Ruby and how to deal with them ... :) (it could be wiki-style, a GitHub
gist, or in some other place; I am in no way suggesting that only a single
person should do so, it could be a collaborative effort).

To the issue at hand, I just tested in irb:

    1.to_s.encoding # => #<Encoding:US-ASCII>
    "".encoding # => #<Encoding:UTF-8>

This is indeed a little surprising (to me). There may be valid reasons for this,
perhaps the default external encoding or something similar, but I can see why
people may be confused about it. What actually surprises me is that .to_s on
a number yields US-ASCII by default.

Looking back to when I used an ISO encoding, the most surprising result I
encountered involved the regexp engine and the encodings used there. I do not
remember exactly how I found it, but I think I reported it back then. I am still
not entirely sure how it came about, but regexes may be another area where
users get a little confused, so documentation may be of some help.

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78224


* [ruby-core:92845] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
  2019-05-25 17:40 ` [ruby-core:92842] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal michael
  2019-05-25 22:25 ` [ruby-core:92843] " shevegen
@ 2019-05-26  0:34 ` mame
  2019-05-26  2:30 ` [ruby-core:92846] " duerst
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: mame @ 2019-05-26  0:34 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by mame (Yusuke Endoh).


@grosser, could you elaborate on your problem?  I cannot reproduce the warning.  What warning did you see, and how did you trigger it?

```
s1 = 1.to_s
p s1.encoding #=> #<Encoding:US-ASCII>

s2 = "1"
p s2.encoding #=> #<Encoding:UTF-8>

p s1 == s2 #=> true with no warning
```

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78226


* [ruby-core:92846] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2019-05-26  0:34 ` [ruby-core:92845] " mame
@ 2019-05-26  2:30 ` duerst
  2019-05-26  3:08 ` [ruby-core:92847] " mame
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: duerst @ 2019-05-26  2:30 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by duerst (Martin Dürst).


@mame:

What @grosser is saying is that
```
p s1.encoding == s2.encoding #=> false
```
but he expects the result to be true. But you are right that what counts is the equality of the strings, not the encodings.
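
A minimal sketch of that distinction, assuming a UTF-8 source file so that the literal "1" is UTF-8 (`same_text_and_encoding?` is a hypothetical helper made up for illustration, not a Ruby method):

```ruby
s1 = 1.to_s   # reported in this thread as US-ASCII on Ruby 2.6.3
s2 = "1"      # assumed UTF-8 (the default source encoding)

# String#== compares the content: two ASCII-only strings are equal
# even when their encodings differ.
content_equal = (s1 == s2)

# Hypothetical stricter check for callers who also care about the encoding.
def same_text_and_encoding?(a, b)
  a == b && a.encoding == b.encoding
end

strict_equal = same_text_and_encoding?(s1, s2)
# Transcoding one side to the other's encoding makes the strict check pass.
strict_after = same_text_and_encoding?(s1, s2.encode(s1.encoding))

p content_equal
p strict_equal
p strict_after
```

The stricter check is only worth having when the encoding itself is part of what is being tested; for plain text comparison, `==` already does what is needed.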

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78227


* [ruby-core:92847] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2019-05-26  2:30 ` [ruby-core:92846] " duerst
@ 2019-05-26  3:08 ` mame
  2019-05-31 16:10 ` [ruby-core:92913] " naruse
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: mame @ 2019-05-26  3:08 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by mame (Yusuke Endoh).


@grosser said

> I ran into strange looking test output when I compared .to_s with an expected text, saying that the encoding was different

I thought that some string-comparison assertion (perhaps from an external testing framework?) emitted a spurious warning like "the encoding was different" or something similar.

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78228


* [ruby-core:92913] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2019-05-26  3:08 ` [ruby-core:92847] " mame
@ 2019-05-31 16:10 ` naruse
  2019-05-31 17:37 ` [ruby-core:92914] " hanmac
  2019-05-31 20:29 ` [ruby-core:92917] " eregontp
  7 siblings, 0 replies; 8+ messages in thread
From: naruse @ 2019-05-31 16:10 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by naruse (Yui NARUSE).

Status changed from Open to Feedback

What is the problem you are actually running into?

If it is just a testing problem, I feel the test should simply use the correct assertions.
But if this is a frequent pitfall, I may reconsider.

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78288


* [ruby-core:92914] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2019-05-31 16:10 ` [ruby-core:92913] " naruse
@ 2019-05-31 17:37 ` hanmac
  2019-05-31 20:29 ` [ruby-core:92917] " eregontp
  7 siblings, 0 replies; 8+ messages in thread
From: hanmac @ 2019-05-31 17:37 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by Hanmac (Hans Mackowiak).


There is `Encoding.compatible?`, which might help to check whether two strings/symbols have a common encoding.

@naruse I don't know if you are the right contact person for this, but is there a way to see if two encoding objects are compatible, or can that only be checked on the strings?
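
As a rough illustration of the string form (the sample strings and the `joinable?` helper are made up for this sketch; only `Encoding.compatible?` itself is Ruby's):

```ruby
# Hypothetical helper: true when Encoding.compatible? finds a common encoding.
def joinable?(a, b)
  !Encoding.compatible?(a, b).nil?
end

ascii = "abc".encode(Encoding::US_ASCII)          # ASCII-only content
utf8  = "\u00E4bc"                                # non-ASCII text as UTF-8
latin = "\u00E4bc".encode(Encoding::ISO_8859_1)   # same text as ISO-8859-1

p Encoding.compatible?(ascii, utf8)   # the encoding a concatenation would use
p joinable?(ascii, utf8)              # ASCII-only text mixes with UTF-8
p joinable?(utf8, latin)              # both sides carry non-ASCII bytes

# When no common encoding exists, concatenation raises.
begin
  utf8 + latin
rescue Encoding::CompatibilityError => e
  puts "cannot concatenate: #{e.message}"
end
```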

----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78290


* [ruby-core:92917] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal
       [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2019-05-31 17:37 ` [ruby-core:92914] " hanmac
@ 2019-05-31 20:29 ` eregontp
  7 siblings, 0 replies; 8+ messages in thread
From: eregontp @ 2019-05-31 20:29 UTC (permalink / raw)
  To: ruby-core

Issue #15876 has been updated by Eregon (Benoit Daloze).


Hanmac (Hans Mackowiak) wrote:
> is there a way to see if two encoding objects are compatible or can that only be checked on the string?

`Encoding.compatible?` can take Encoding arguments too:

    > Encoding.compatible?(Encoding::UTF_8, Encoding::US_ASCII) 
    => #<Encoding:UTF-8>
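
A short sketch building on that, wrapping the call in a boolean predicate (the `encodings_compatible?` name is hypothetical; only `Encoding.compatible?` comes from Ruby):

```ruby
# Hypothetical predicate: Encoding.compatible? returns an Encoding or nil,
# so turn that into true/false.
def encodings_compatible?(enc_a, enc_b)
  !Encoding.compatible?(enc_a, enc_b).nil?
end

p encodings_compatible?(Encoding::UTF_8, Encoding::US_ASCII)    # the pair from the example above
p encodings_compatible?(Encoding::UTF_8, Encoding::ISO_8859_1)  # two distinct encodings, neither US-ASCII
p encodings_compatible?(Encoding::UTF_8, Encoding::UTF_8)       # identical encodings
```

Note that the check on Encoding objects is necessarily more conservative than the check on strings, since it cannot look at whether the actual content is ASCII-only.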


----------------------------------------
Bug #15876: 1.to_s.encoding != Encoding.default_internal
https://bugs.ruby-lang.org/issues/15876#change-78293


Thread overview: 8+ messages
     [not found] <redmine.issue-15876.20190525174004@ruby-lang.org>
2019-05-25 17:40 ` [ruby-core:92842] [Ruby trunk Bug#15876] 1.to_s.encoding != Encoding.default_internal michael
2019-05-25 22:25 ` [ruby-core:92843] " shevegen
2019-05-26  0:34 ` [ruby-core:92845] " mame
2019-05-26  2:30 ` [ruby-core:92846] " duerst
2019-05-26  3:08 ` [ruby-core:92847] " mame
2019-05-31 16:10 ` [ruby-core:92913] " naruse
2019-05-31 17:37 ` [ruby-core:92914] " hanmac
2019-05-31 20:29 ` [ruby-core:92917] " eregontp
