ruby-core@ruby-lang.org archive (unofficial mirror)
From: "javanthropus (Jeremy Bopp)" <noreply@ruby-lang.org>
To: ruby-core@ruby-lang.org
Subject: [ruby-core:109902] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
Date: Thu, 15 Sep 2022 13:28:50 +0000 (UTC)
Message-ID: <redmine.journal-99145.20220915132850.692@ruby-lang.org>
In-Reply-To: redmine.issue-18995.20220904230654.692@ruby-lang.org

Issue #18995 has been updated by javanthropus (Jeremy Bopp).


Can anyone confirm that this is a bug and not a misunderstanding? Fixing it looks like it will require a fair amount of refactoring, and there do not yet appear to be any tests that check `IO#internal_encoding` and `IO#external_encoding` across the various argument combinations accepted by `IO#set_encoding`. I found tests that open files and pipes with encoding arguments and check the resulting internal and external encodings of the IO object, but none of them covers these corner cases.
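
As a starting point for such tests, here is a minimal sketch of a table-driven check. It is not taken from Ruby's test suite; the helper name `encodings_after_set_encoding` is hypothetical, and the table only lists argument forms with an explicit, valid internal encoding. The corner cases from this report (a `nil` or unrecognized internal encoding) would be added as further rows once the intended behavior is settled.

```ruby
# Hypothetical helper (not part of Ruby): apply the given set_encoding
# arguments to a freshly opened read-only IO and return the resulting
# [external_encoding, internal_encoding] pair.
def encodings_after_set_encoding(*args)
  File.open(File::NULL) do |f|
    f.set_encoding(*args)
    [f.external_encoding, f.internal_encoding]
  end
end

# Argument forms that name both encodings explicitly.
cases = [
  ['utf-8', 'iso-8859-2'],
  ['utf-8:iso-8859-2'],
  [Encoding::UTF_8, Encoding::ISO_8859_2],
]

cases.each do |args|
  external, internal = encodings_after_set_encoding(*args)
  printf("%-45p external: %-22p internal: %p\n", args, external, internal)
end
```

Each row could become one assertion pair in a real test, so that a change in argument handling shows up as a failure for a specific argument form.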

----------------------------------------
Bug #18995: IO#set_encoding sometimes set an IO's internal encoding to the default external encoding
https://bugs.ruby-lang.org/issues/18995#change-99145

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
This script demonstrates the behavior:

```ruby
def show(io)
  printf(
    "external encoding: %-25p  internal encoding: %-25p\n",
    io.external_encoding,
    io.internal_encoding
  )
end

Encoding.default_external = 'iso-8859-1'
Encoding.default_internal = 'iso-8859-2'

File.open('/dev/null') do |f|
  f.set_encoding('utf-8', nil)
  show(f)                             # f.internal_encoding is iso-8859-2, as expected

  f.set_encoding('utf-8', 'invalid')
  show(f)                             # f.internal_encoding is now iso-8859-1!

  Encoding.default_external = 'iso-8859-3'
  Encoding.default_internal = 'iso-8859-4'
  show(f)                             # f.internal_encoding is now iso-8859-3!
end
```

In the first case, the IO's internal encoding is set to the current value of `Encoding.default_internal`. In the second case, it is set to `Encoding.default_external` instead. The third case is more interesting: it shows that the IO's internal encoding actually tracks the current value of `Encoding.default_external`. The value was not just copied when `#set_encoding` was called; it changes whenever `Encoding.default_external` changes.
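
To tell "copied once" apart from "tracks the default", one option is a small probe like the sketch below. The helper `tracks_default_external?` is hypothetical (it is not part of Ruby). It temporarily reassigns `Encoding.default_external`, compares `IO#internal_encoding` before and after, and then restores the original default. The usage shown applies it to an IO with an explicit internal encoding, which should stay fixed.

```ruby
# Hypothetical helper (not part of Ruby): returns true if the IO's internal
# encoding changes when Encoding.default_external is reassigned afterwards.
def tracks_default_external?(io)
  saved_external = Encoding.default_external
  saved_verbose = $VERBOSE
  # Pick a probe encoding that differs from the current default.
  probe = saved_external == Encoding::ISO_8859_5 ? Encoding::ISO_8859_6 : Encoding::ISO_8859_5
  begin
    $VERBOSE = nil # silence the warning Ruby may print when the default is reassigned
    before = io.internal_encoding
    Encoding.default_external = probe
    after = io.internal_encoding
    before != after
  ensure
    # Always restore the process-wide settings.
    Encoding.default_external = saved_external
    $VERBOSE = saved_verbose
  end
end

File.open(File::NULL) do |f|
  f.set_encoding('utf-8', 'iso-8859-2')
  puts "explicit internal encoding tracks default_external: #{tracks_default_external?(f)}"
end
```

Calling the same helper right after `f.set_encoding('utf-8', 'invalid')` would probe the second case from the script; the result may differ between Ruby versions.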

What should the correct behavior be?



-- 
https://bugs.ruby-lang.org/

Thread overview: 3+ messages
2022-09-04 23:06 [ruby-core:109842] [Ruby master Bug#18995] IO#set_encoding sometimes set an IO's internal encoding to the default external encoding javanthropus (Jeremy Bopp)
2022-09-15 13:28 ` javanthropus (Jeremy Bopp) [this message]
2024-05-05 15:20 ` [ruby-core:117778] " javanthropus (Jeremy Bopp) via ruby-core
