ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:109842] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
@ 2022-09-04 23:06 javanthropus (Jeremy Bopp)
  2022-09-15 13:28 ` [ruby-core:109902] " javanthropus (Jeremy Bopp)
  2024-05-05 15:20 ` [ruby-core:117778] " javanthropus (Jeremy Bopp) via ruby-core
  0 siblings, 2 replies; 3+ messages in thread
From: javanthropus (Jeremy Bopp) @ 2022-09-04 23:06 UTC (permalink / raw)
  To: ruby-core

Issue #18995 has been reported by javanthropus (Jeremy Bopp).

----------------------------------------
Bug #18995: IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
https://bugs.ruby-lang.org/issues/18995

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
This script demonstrates the behavior:

```ruby
def show(io)
  printf(
    "external encoding: %-25p  internal encoding: %-25p\n",
    io.external_encoding,
    io.internal_encoding
  )
end

Encoding.default_external = 'iso-8859-1'
Encoding.default_internal = 'iso-8859-2'

File.open('/dev/null') do |f|
  f.set_encoding('utf-8', nil)
  show(f)                             # f.internal_encoding is iso-8859-2, as expected

  f.set_encoding('utf-8', 'invalid')
  show(f)                             # f.internal_encoding is now iso-8859-1!

  Encoding.default_external = 'iso-8859-3'
  Encoding.default_internal = 'iso-8859-4'
  show(f)                             # f.internal_encoding is now iso-8859-3!
end
```

In the first case, the IO's internal encoding is set to the current value of Encoding.default_internal. In the second case, where the internal encoding name is invalid, the IO's internal encoding is set to Encoding.default_external instead. The third case is more interesting because it shows that the IO's internal encoding tracks the current value of Encoding.default_external. The value was not simply copied when #set_encoding was called; it changes whenever Encoding.default_external changes.
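
The third case can be isolated from the other two. The following is a minimal sketch, not part of the original report: `tracks_default_external?` is a hypothetical helper name, `File::NULL` stands in for `/dev/null`, and `$VERBOSE` is cleared only to silence any warning about the unrecognized encoding name. It returns true when the IO shows the tracking behavior described above, and it restores the global defaults afterwards.

```ruby
# Hypothetical helper (not part of Ruby): reports whether an IO's internal
# encoding follows Encoding.default_external after set_encoding is given an
# unrecognized internal encoding name.
def tracks_default_external?(path = File::NULL)
  saved_ext = Encoding.default_external
  saved_int = Encoding.default_internal
  saved_verbose = $VERBOSE
  $VERBOSE = nil # silence any warning about the unrecognized name

  Encoding.default_external = 'iso-8859-1'
  Encoding.default_internal = 'iso-8859-2'

  File.open(path) do |f|
    f.set_encoding('utf-8', 'invalid')
    before = f.internal_encoding

    # Change only the default external encoding; set_encoding is not called again.
    Encoding.default_external = 'iso-8859-3'
    after = f.internal_encoding

    before == Encoding::ISO_8859_1 && after == Encoding::ISO_8859_3
  end
ensure
  # Restore the process-wide defaults so the check has no lasting effect.
  Encoding.default_external = saved_ext
  Encoding.default_internal = saved_int
  $VERBOSE = saved_verbose
end

puts tracks_default_external?
```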

What should the correct behavior be?



-- 
https://bugs.ruby-lang.org/

* [ruby-core:109902] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
  2022-09-04 23:06 [ruby-core:109842] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding javanthropus (Jeremy Bopp)
@ 2022-09-15 13:28 ` javanthropus (Jeremy Bopp)
  2024-05-05 15:20 ` [ruby-core:117778] " javanthropus (Jeremy Bopp) via ruby-core
  1 sibling, 0 replies; 3+ messages in thread
From: javanthropus (Jeremy Bopp) @ 2022-09-15 13:28 UTC (permalink / raw)
  To: ruby-core

Issue #18995 has been updated by javanthropus (Jeremy Bopp).


Can anyone confirm that this is a bug and not a misunderstanding on my part?  Fixing it looks like it will require a fair bit of refactoring, and there do not yet appear to be any tests that check `IO#internal_encoding` and `IO#external_encoding` across the various argument combinations accepted by `IO#set_encoding`.  I found tests that open files and pipes with encoding arguments and check the resulting internal and external encodings of the IO object, but none of them exercise these corner cases.
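
To make the missing coverage concrete, a table-driven probe along these lines could serve as a starting point for such tests. This is only a sketch: `probe_set_encoding` and `SET_ENCODING_CASES` are hypothetical names, `File::NULL` is used so that nothing is actually read, and the rows record what the IO reports rather than asserting what is correct, since that is the open question.

```ruby
# Hypothetical argument table: [external, internal] pairs for IO#set_encoding.
SET_ENCODING_CASES = [
  ['utf-8', nil],          # internal left to the defaults
  ['utf-8', 'invalid'],    # unrecognized internal encoding name
  ['utf-8', 'iso-8859-5']  # explicit, valid internal encoding
].freeze

# Hypothetical helper: applies each argument pair to a freshly opened IO and
# records the encodings the IO reports afterwards.
def probe_set_encoding(cases, path = File::NULL)
  saved_verbose = $VERBOSE
  $VERBOSE = nil # silence any warning about unrecognized names
  cases.map do |ext_arg, int_arg|
    File.open(path) do |f|
      f.set_encoding(ext_arg, int_arg)
      [ext_arg, int_arg, f.external_encoding, f.internal_encoding]
    end
  end
ensure
  $VERBOSE = saved_verbose
end

probe_set_encoding(SET_ENCODING_CASES).each { |row| p row }
```

The results depend on the values of Encoding.default_external and Encoding.default_internal at the time of the call, so a real test would need to set and restore them around each case.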

----------------------------------------
Bug #18995: IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
https://bugs.ruby-lang.org/issues/18995#change-99145

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN

* [ruby-core:117778] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
  2022-09-04 23:06 [ruby-core:109842] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding javanthropus (Jeremy Bopp)
  2022-09-15 13:28 ` [ruby-core:109902] " javanthropus (Jeremy Bopp)
@ 2024-05-05 15:20 ` javanthropus (Jeremy Bopp) via ruby-core
  1 sibling, 0 replies; 3+ messages in thread
From: javanthropus (Jeremy Bopp) via ruby-core @ 2024-05-05 15:20 UTC (permalink / raw)
  To: ruby-core; +Cc: javanthropus (Jeremy Bopp)

Issue #18995 has been updated by javanthropus (Jeremy Bopp).


@jeremyevans, did you ever take a look at this issue when I referenced it in #18899?  The behavior is unchanged in Ruby 3.3.

The script above prints the following:
```
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-2>   
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-1>   
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-3>
```

I expected it to print this:
```
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-2>
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-2>
external encoding: #<Encoding:UTF-8>          internal encoding: #<Encoding:ISO-8859-4>
```
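
The comparison between the two listings can be automated. This is a minimal sketch, assuming the expected values above are the desired ones: `EXPECTED_INTERNAL`, `observed_internal`, and `mismatched_steps` are hypothetical names introduced here for illustration, and `File::NULL` stands in for `/dev/null`.

```ruby
# Internal encodings from the "expected" listing, one per step of the script.
EXPECTED_INTERNAL = %w[ISO-8859-2 ISO-8859-2 ISO-8859-4].freeze

# Hypothetical helper: replays the three steps of the script and returns the
# name of the internal encoding reported after each one (nil if none is set).
def observed_internal(path = File::NULL)
  saved_ext = Encoding.default_external
  saved_int = Encoding.default_internal
  saved_verbose = $VERBOSE
  $VERBOSE = nil # silence any warning about the unrecognized name

  Encoding.default_external = 'iso-8859-1'
  Encoding.default_internal = 'iso-8859-2'

  names = []
  File.open(path) do |f|
    f.set_encoding('utf-8', nil)
    names << f.internal_encoding&.name

    f.set_encoding('utf-8', 'invalid')
    names << f.internal_encoding&.name

    Encoding.default_external = 'iso-8859-3'
    Encoding.default_internal = 'iso-8859-4'
    names << f.internal_encoding&.name
  end
  names
ensure
  Encoding.default_external = saved_ext
  Encoding.default_internal = saved_int
  $VERBOSE = saved_verbose
end

# Returns the zero-based indexes of the steps whose observed value differs
# from the expected one.
def mismatched_steps(observed, expected = EXPECTED_INTERNAL)
  observed.each_index.reject { |i| observed[i] == expected[i] }
end

p mismatched_steps(observed_internal)
```

Going by the output quoted above, steps 1 and 2 (zero-based) would be reported as mismatches on Ruby 3.3; an empty array would mean the behavior matches the expectation.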

----------------------------------------
Bug #18995: IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
https://bugs.ruby-lang.org/issues/18995#change-108185

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN