ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
@ 2022-02-08  9:08 byroot (Jean Boussier)
  2022-02-08  9:21 ` [ruby-core:107515] " duerst
                   ` (45 more replies)
  0 siblings, 46 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-08  9:08 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

I'm now used to it, but something that confused me for years was errors such as:

```ruby
>> "fée" + "\xFF".b
(irb):3:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
```

When you aren't that familiar with Ruby, it's really not evident that `ASCII-8BIT` basically means "no encoding" or "binary".

And even when you know it, if you don't read carefully it's very easily confused with `US-ASCII`.

The `Encoding::BINARY` alias is much more telling IMHO.

### Proposal

Since `Encoding::ASCII_8BIT` has been aliased as `Encoding::BINARY` for years, I think renaming it to `BINARY` and then making `ASCII_8BIT` the alias would significantly improve usability without backward compatibility concerns.

The only concern I could see would be the consistency with a handful of C API functions:

  - `rb_encoding *rb_ascii8bit_encoding(void)`
  - `int rb_ascii8bit_encindex(void)`
  - `VALUE rb_io_ascii8bit_binmode(VALUE io)`

But that's for much more advanced users, so I don't think it's much of a concern.




-- 
https://bugs.ruby-lang.org/


* [ruby-core:107515] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
@ 2022-02-08  9:21 ` duerst
  2022-02-08  9:28 ` [ruby-core:107516] " byroot (Jean Boussier)
                   ` (44 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: duerst @ 2022-02-08  9:21 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by duerst (Martin Dürst).


Well, it's actually not just binary. Binary would mean that none of the bytes have any 'meaning' (i.e. characters) assigned to them. But ASCII-8BIT actually has character 'meaning' assigned to the ASCII range.
You can for example do the following:
```Ruby
u = (b = "abcde".force_encoding('ASCII-8BIT')).encode('UTF-8')
```
This gives you the string "abcde" with the encoding UTF-8. This shows that the lower 7 bits are interpreted the same as US-ASCII. The range with the 8th bit set, on the other hand, is just binary values, so
```Ruby
"\xCD".force_encoding('ASCII-8BIT').encode('UTF-8')
```
produces this error:
```
Encoding::UndefinedConversionError ("\xCD" from ASCII-8BIT to UTF-8)
```

I chose UTF-8 as the target encoding because it covers all of Unicode, so the error cannot be due to the source character not existing in the target encoding.

So there's indeed some complexity here, but it's not exactly what you think.
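The two cases above can be put side by side in a short sketch (it uses `String#b` instead of `force_encoding('ASCII-8BIT')`, and the variable names are made up for illustration):

```ruby
ascii_part = "abcde".b                         # binary-tagged, all bytes < 0x80
converted  = ascii_part.encode(Encoding::UTF_8)

high_byte = "\xCD".b                           # binary-tagged, byte >= 0x80
failure = begin
  high_byte.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

puts converted           # the ASCII range transcodes like US-ASCII
puts failure.class       # the high range has no character assigned
```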


----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96420


* [ruby-core:107516] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
  2022-02-08  9:21 ` [ruby-core:107515] " duerst
@ 2022-02-08  9:28 ` byroot (Jean Boussier)
  2022-02-08  9:41 ` [ruby-core:107517] " naruse (Yui NARUSE)
                   ` (43 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-08  9:28 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


@duerst I'm aware of this, but I don't quite see how it's a concern. It's a fairly subtle behavior, and I doubt the `ASCII-8BIT` name particularly reveals it.

Also nitpick, but a better example would be:

```ruby
"\xC3\xA9".b.encode(Encoding::UTF_8) # => Encoding::UndefinedConversionError
```

Since it's valid UTF-8.
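To make the point concrete, here is a small sketch (variable names are illustrative) contrasting `encode`, which tries to transcode from the binary encoding, with `force_encoding`, which only relabels the same bytes:

```ruby
bytes = "\xC3\xA9".b   # the two bytes of UTF-8 "é", tagged as binary

# Transcoding fails: the binary encoding assigns no characters to bytes >= 0x80.
transcode_error = begin
  bytes.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Relabeling succeeds: the bytes are untouched and happen to be valid UTF-8.
relabeled = bytes.dup.force_encoding(Encoding::UTF_8)

puts transcode_error.class
puts relabeled.valid_encoding?
```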

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96421


* [ruby-core:107517] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
  2022-02-08  9:21 ` [ruby-core:107515] " duerst
  2022-02-08  9:28 ` [ruby-core:107516] " byroot (Jean Boussier)
@ 2022-02-08  9:41 ` naruse (Yui NARUSE)
  2022-02-08 10:15 ` [ruby-core:107518] " Eregon (Benoit Daloze)
                   ` (42 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: naruse (Yui NARUSE) @ 2022-02-08  9:41 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by naruse (Yui NARUSE).


duerst (Martin Dürst) wrote in #note-1:
> Well, it's actually not just binary. Binary would mean that none of the bytes have any 'meaning' (i.e. characters) assigned to them. But ASCII-8BIT actually has character 'meaning' assigned to the ASCII range.

I agree in principle.
But we should read this proposal as: "The ASCII range of binary data in the real world is usually ASCII anyway, so why give it a complex name like ASCII-8BIT?"

I think the name of the encoding is a communication tool. We should compare pros and cons between ASCII-8BIT and BINARY.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96422


* [ruby-core:107518] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (2 preceding siblings ...)
  2022-02-08  9:41 ` [ruby-core:107517] " naruse (Yui NARUSE)
@ 2022-02-08 10:15 ` Eregon (Benoit Daloze)
  2022-02-08 10:21 ` [ruby-core:107519] " Eregon (Benoit Daloze)
                   ` (41 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-02-08 10:15 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


+1000 for this, I think ASCII-8BIT is always extremely confusing, and BINARY is much more revealing (= we don't know what the actual encoding is, or it might be binary data and not text).
I've seen many Ruby users confused by this.
I'm not sure why I never thought to propose it here TBH.

I've literally never used the `Encoding::ASCII_8BIT` form in code (and rarely if ever seen it) but `Encoding::BINARY` many times.

The property that bytes < 128 are interpreted as US-ASCII is nothing special, every `Encoding#ascii_compatible?` behaves like that.
And almost all non-dummy Ruby encodings are `#ascii_compatible?`, the only two exceptions are UTF-16/32 (both LE/BE).
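A quick way to check this claim on a given Ruby build is sketched below (the hash keys and variable names are just for illustration; the exact list of encodings depends on the Ruby version):

```ruby
# ASCII compatibility of a few well-known encodings.
compat = {
  binary:  Encoding::BINARY.ascii_compatible?,
  utf8:    Encoding::UTF_8.ascii_compatible?,
  latin1:  Encoding::ISO_8859_1.ascii_compatible?,
  utf16le: Encoding::UTF_16LE.ascii_compatible?,
}

# Non-dummy encodings that are NOT ASCII compatible.
non_compatible = Encoding.list.reject(&:dummy?).reject(&:ascii_compatible?)

p compat
puts non_compatible.map(&:name).sort
```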

Two things particularly confusing about the name ASCII-8BIT:
* It's completely unclear it might mean binary data or unknown encoding
* ISO-8859-* and many other encodings are 8-bit ascii-compatible encodings. Yet ASCII-8BIT which name seems to imply something close is nothing like that (the 8th bit is undefined, uninterpreted but valid).
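The difference from a real 8-bit encoding can be sketched as follows, using byte `0xE9` as an arbitrary example (variable names are illustrative):

```ruby
latin1 = "\xE9".dup.force_encoding(Encoding::ISO_8859_1)  # 0xE9 is "é" in Latin-1
binary = "\xE9".b                                          # same byte, binary-tagged

# Latin-1 defines a character for the high byte, so it transcodes.
latin1_as_utf8 = latin1.encode(Encoding::UTF_8)

# The binary encoding defines none, so transcoding is refused.
binary_error = begin
  binary.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Yet both strings count as validly encoded.
both_valid = latin1.valid_encoding? && binary.valid_encoding?

puts latin1_as_utf8
puts binary_error.class
puts both_valid
```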

(FWIW JCodings, the Java library for Ruby encodings, has ASCIIEncoding.INSTANCE for BINARY. That's even worse, as it's even more easily confused with US-ASCII. I've been thinking about how to fix that in JCodings in a compatible way.)

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96423


* [ruby-core:107519] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (3 preceding siblings ...)
  2022-02-08 10:15 ` [ruby-core:107518] " Eregon (Benoit Daloze)
@ 2022-02-08 10:21 ` Eregon (Benoit Daloze)
  2022-02-09  9:51 ` [ruby-core:107527] " naruse (Yui NARUSE)
                   ` (40 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-02-08 10:21 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


BTW Python has the "bytes" encoding and it behaves very similarly to Ruby's BINARY encoding (it's a different type in Python, but that's a detail).
That's also a more telling name than ASCII-8BIT.
BINARY is better for Ruby because it's already an established name for it.

There is also already `String#b` for binary, it's not `String#a` or so.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96424


* [ruby-core:107527] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (4 preceding siblings ...)
  2022-02-08 10:21 ` [ruby-core:107519] " Eregon (Benoit Daloze)
@ 2022-02-09  9:51 ` naruse (Yui NARUSE)
  2022-02-09 16:52 ` [ruby-core:107531] " tenderlovemaking (Aaron Patterson)
                   ` (39 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: naruse (Yui NARUSE) @ 2022-02-09  9:51 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by naruse (Yui NARUSE).


The name `ASCII-8BIT` expresses how deeply we considered what "binary" is. Ruby 1.9's encoding system is a series of inventions; among the ideas Ruby invented are ASCII COMPATIBLE and ASCII-8BIT.

> Two things particularly confusing about the name ASCII-8BIT:
>
> * It's completely unclear it might mean binary data or unknown encoding
> * ISO-8859-* and many other encodings are 8-bit ascii-compatible encodings. Yet ASCII-8BIT which name seems to imply something close is nothing like that (the 8th bit is undefined, uninterpreted but valid).

Your two questions raise very good points. The answer to them is tightly coupled with the name `ASCII-8BIT`.

> * It's completely unclear it might mean binary data or unknown encoding

I want to ask you that how often you can actually distinguish them. Ruby's assumption is that developers cannot distinguish them in normal use cases. If so, Ruby may not provide two objects. If Ruby provides only one object for them, developers don't need to clarify which one they mean.

> ISO-8859-* and many other encodings are 8-bit ascii-compatible encodings. Yet ASCII-8BIT which name seems to imply something close is nothing like that (the 8th bit is undefined, uninterpreted but valid).

This is very good question. Ruby's answer is "yes, ASCII-8BIT is similar to ISO-8859-*". As you say, an ASCII-8BIT string's 8-bit range is undefined. But Ruby doesn't matter that. In the real world such phenomenon is sometimes discovered.

For example, the charset of HTTP headers is usually ISO-8859-1. Many languages have struggled with how to handle these octets. Python and .NET handle them as binary, which prevents leveraging powerful String methods on such data. Ruby handles them as ASCII-8BIT. Ruby's insight is that the binaries Ruby handles are usually such octets. The name `ASCII-8BIT` reflects that insight.
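A rough sketch of what this looks like in practice is below. The header name and value are invented for illustration, and decoding the value as ISO-8859-1 is an assumption made for the example, not something Ruby decides for you:

```ruby
# A raw header line as it might arrive from a socket: binary-tagged,
# with one byte (0xE9) outside the ASCII range.
raw = "X-Name: caf\xE9\r\n".b

# Ordinary String methods and ASCII-only regexps work on the binary string.
looks_like_header = raw.match?(/\A[\w-]+: /)
key, value = raw.chomp.split(": ", 2)

# Once the real charset is known (assumed Latin-1 here), relabel and transcode.
decoded = value.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

puts key
puts decoded
```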

Therefore the conclusion for your question is that they are just what the real world is. The name just reflects that.


Anyway Rails programmers don't need such understanding usually. If renaming cares people who just hit the surface of this chaos, it might be worth considered, though changing encoding.name may hit the compatibility issue.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96438


* [ruby-core:107531] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (5 preceding siblings ...)
  2022-02-09  9:51 ` [ruby-core:107527] " naruse (Yui NARUSE)
@ 2022-02-09 16:52 ` tenderlovemaking (Aaron Patterson)
  2022-02-09 17:34 ` [ruby-core:107532] " Eregon (Benoit Daloze)
                   ` (38 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: tenderlovemaking (Aaron Patterson) @ 2022-02-09 16:52 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by tenderlovemaking (Aaron Patterson).


First, I agree with this proposal.  Second, I think this example should raise an exception:

```ruby
u = (b = "abcde".force_encoding('ASCII-8BIT')).encode('UTF-8')
```

But I can open a different ticket for that.  The point I actually want to make is that I've never seen this use case in the wild.  100% of the cases I've seen for `force_encoding('ASCII-8BIT')` are when the developer knows the string is binary (or unknown) data and they want to treat it as binary / unknown data *not* as "might be US-ASCII sometimes".  The name "binary" would more accurately reflect real world usage IMO.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96443


* [ruby-core:107532] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (6 preceding siblings ...)
  2022-02-09 16:52 ` [ruby-core:107531] " tenderlovemaking (Aaron Patterson)
@ 2022-02-09 17:34 ` Eregon (Benoit Daloze)
  2022-02-09 17:49 ` [ruby-core:107533] " jeremyevans0 (Jeremy Evans)
                   ` (37 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-02-09 17:34 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


naruse (Yui NARUSE) wrote in #note-6:
> I want to ask you that how often you can actually distinguish them.

I think in many cases it is possible to distinguish.
For instance, an HTTP header might initially be in the binary encoding and mean "unknown encoding" (one can often find the real encoding through `Content-Type`'s charset, but not always, and it could be invalid).
Another example is `socket.read(N)` which might be actual binary data (e.g. for a binary protocol), or text and the actual encoding depends then on what's communicated on that socket.
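The `read(N)` case can be sketched without a network connection by using `StringIO` as a stand-in for the socket (an assumption made purely so the example is self-contained):

```ruby
require "stringio"

io = StringIO.new("h\u00E9llo".dup)   # stand-in for a socket carrying UTF-8 text

# A length-limited read hands back raw bytes tagged with the binary encoding.
chunk = io.read(3)

# Only the program knows what the bytes mean; here they are UTF-8 text,
# so relabeling (not transcoding) recovers the characters.
as_text = chunk.dup.force_encoding(Encoding::UTF_8)

p chunk.bytes
puts as_text.valid_encoding?
```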

And I would think most Ruby programs need to handle the binary encoding somehow, and can only leave a String as binary if it's only bytes < 128, otherwise things break.

> If so, Ruby may not provide two objects.

I don't think two different "binary" Encodings are useful, one seems enough in practice and can be used for both meanings, which are very close (as a binary byte array, or a marker for unknown encoding).

> This is very good question. Ruby's answer is "yes, ASCII-8BIT is similar to ISO-8859-*". As you say, an ASCII-8BIT string's 8-bit range is undefined. But Ruby doesn't matter that. In the real world such phenomenon is sometimes discovered.

I think such situations need to be handled somehow and given a real encoding.
"ASCII-8BIT" feels confusing because there is no such thing as a "8th" bit of ASCII, without a more specific encoding which defines that.
So it really means unknown, and "ASCII-8BIT" seems far from "unknown encoding".

Also "ASCII-8BIT" sounds clearly wrong if it's actual binary data (which might not use any ASCII concept at all).
The behavior that this pseudo-encoding is ASCII compatible and e.g. shows byte 65 as `A` is fine, after all hexdump utilities typically do the same for bytes < 128 and it's helpful if there is text in the middle of binary data.

> Anyway Rails programmers don't need such understanding usually. If renaming cares people who just hit the surface of this chaos, it might be worth considered, though changing encoding.name may hit the compatibility issue.

Not just Rails programmers, I think most Ruby programmers are confused when they see ASCII-8BIT, and not only the first time.
I believe renaming to BINARY would help them understand the meaning much better.

@tenderlovemaking One issue is that, e.g., error messages in CRuby are encoded in the binary encoding (probably for the legacy reason of using `rb_str_new()`), so I think that would be a wide-reaching change with a high chance of causing real compatibility issues. It seems too incompatible to me.
As an example, the encoding negotiation rules (e.g. for concatenation) in Ruby are all based around whether one side is `#ascii_only?` and if yes then just use the other side's encoding. Preventing to e.g. concat with a ASCII-only binary string would break lots of programs.
Anyway, I think that's a separate issue indeed.
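The negotiation rule described above can be observed with `Encoding.compatible?`; the following is a minimal sketch with made-up variable names:

```ruby
utf8         = "f\u00E9e"   # non-ASCII UTF-8 text
ascii_binary = "abc".b      # binary-tagged, but ASCII-only
high_binary  = "\xFF".b     # binary-tagged, with a byte >= 0x80

# One side is ASCII-only, so the other side's encoding wins.
negotiated = Encoding.compatible?(utf8, ascii_binary)
joined     = utf8 + ascii_binary

# Neither side is ASCII-only and the encodings differ: no common encoding.
rejected = Encoding.compatible?(utf8, high_binary)

puts joined.encoding == Encoding::UTF_8
p rejected
```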

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96444


* [ruby-core:107533] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (7 preceding siblings ...)
  2022-02-09 17:34 ` [ruby-core:107532] " Eregon (Benoit Daloze)
@ 2022-02-09 17:49 ` jeremyevans0 (Jeremy Evans)
  2022-02-09 23:35 ` [ruby-core:107537] " tenderlovemaking (Aaron Patterson)
                   ` (36 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: jeremyevans0 (Jeremy Evans) @ 2022-02-09 17:49 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by jeremyevans0 (Jeremy Evans).


I'm also in favor of renaming `ASCII-8BIT` to `BINARY`, but I don't have strong feelings about it.  I'm strongly against breaking `String#encode` for binary strings.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96445


* [ruby-core:107537] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (8 preceding siblings ...)
  2022-02-09 17:49 ` [ruby-core:107533] " jeremyevans0 (Jeremy Evans)
@ 2022-02-09 23:35 ` tenderlovemaking (Aaron Patterson)
  2022-02-10  7:53 ` [ruby-core:107549] " duerst
                   ` (35 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: tenderlovemaking (Aaron Patterson) @ 2022-02-09 23:35 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by tenderlovemaking (Aaron Patterson).


jeremyevans0 (Jeremy Evans) wrote in #note-9:
> I'm also in favor of renaming `ASCII-8BIT` to `BINARY`, but I don't have strong feelings about it.  I'm strongly against breaking `String#encode` for binary strings.

Ya, sorry, I should be more clear.  I think concatenation shouldn't try to guess at the encoding.  If the user calls "encode" then it seems fine.

Eregon (Benoit Daloze) wrote in #note-8:
> As an example, the encoding negotiation rules (e.g. for concatenation) in Ruby are all based around whether one side is `#ascii_only?` and if yes then just use the other side's encoding. Preventing to e.g. concat with a ASCII-only binary string would break lots of programs.
> Anyway, I think that's a separate issue indeed.

Yes, this is the issue I have.  IME the code is already broken, it just hasn't had the right input to break it yet (where would the binary string come from other than an external location?).  Regardless, I made a ticket here: https://bugs.ruby-lang.org/issues/18579 😄

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96448

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

I'm now used to it, but something that confused me for years was errors such as:

```ruby
>> "fée" + "\xFF".b
(irb):3:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
```

When you aren't that familiar with Ruby, it's really not evident that `ASCII-8BIT` basically means "no encoding" or "binary".

And even when you know it, if you don't read carefully it's very easily confused with `US-ASCII`.

The `Encoding::BINARY` alias is much more telling IMHO.

### Proposal

Since `Encoding::ASCII_8BIT` has been aliased as `Encoding::BINARY` for years, I think renaming it to `BINARY` and then making `ASCII_8BIT` the alias would significantly improve usability without backward compatibility concerns.

The only concern I could see would be the consistency with a handful of C API functions:

  - `rb_encoding *rb_ascii8bit_encoding(void)`
  - `int rb_ascii8bit_encindex(void)`
  - `VALUE rb_io_ascii8bit_binmode(VALUE io)`

But that's for much more advanced users, so I don't think it's much of a concern.




-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107549] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (9 preceding siblings ...)
  2022-02-09 23:35 ` [ruby-core:107537] " tenderlovemaking (Aaron Patterson)
@ 2022-02-10  7:53 ` duerst
  2022-02-10  9:11 ` [ruby-core:107550] " byroot (Jean Boussier)
                   ` (34 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: duerst @ 2022-02-10  7:53 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by duerst (Martin Dürst).


Eregon (Benoit Daloze) wrote in #note-4:

> The property that bytes < 128 are interpreted as US-ASCII is nothing special, every `Encoding#ascii_compatible?` behaves like that.
> And almost all non-dummy Ruby encodings are `#ascii_compatible?`, the only two exceptions are UTF-16/32 (both LE/BE).
> 
> Two things particularly confusing about the name ASCII-8BIT:
> * It's completely unclear it might mean binary data or unknown encoding

Well, binary data can be character data with unknown encoding (or with encoding not yet set), or it can be truly binary data (e.g. as in a .jpg file or .zip file,...).

> * ISO-8859-* and many other encodings are 8-bit ascii-compatible encodings. Yet ASCII-8BIT which name seems to imply something close is nothing like that (the 8th bit is undefined, uninterpreted but valid).

ASCII-8BIT is an 8-bit ascii-compatible encoding, isn't it?

I think the idea of ASCII-8BIT goes back to the fact that in Ruby, many encodings can be used for source code, and as long as you only use ASCII in the code, it doesn't actually matter. That's to a large extent how Ruby 1.8 operated, and that was carried over into Ruby 1.9.

Now that the default source encoding is UTF-8, we have an encoding pragma for source files in other encodings, and so on, the need for "something where we know ASCII is ASCII, but we are not sure about the upper half of the byte values" may be quite a bit smaller.
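
As an illustration of the "ASCII is ASCII, upper half unknown" idea, the following sketch uses only core methods. The sample bytes and the choice of ISO-8859-1 as the "real" encoding are assumptions made for the example.

```ruby
# ASCII compatibility is a property shared by most encodings.
puts Encoding::BINARY.ascii_compatible?
puts Encoding::ISO_8859_1.ascii_compatible?
puts Encoding::UTF_16LE.ascii_compatible?

# Non-dummy encodings that are not ASCII-compatible.
non_ascii = Encoding.list.reject(&:dummy?).reject(&:ascii_compatible?).map(&:name)
puts non_ascii.inspect

# In a binary string every byte sequence is valid, but bytes >= 0x80 carry
# no character meaning until an encoding is assigned.
bytes  = "caf\xE9".b
latin1 = bytes.dup.force_encoding(Encoding::ISO_8859_1)
text   = latin1.encode(Encoding::UTF_8)
puts text
```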



----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96461

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107550] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (10 preceding siblings ...)
  2022-02-10  7:53 ` [ruby-core:107549] " duerst
@ 2022-02-10  9:11 ` byroot (Jean Boussier)
  2022-02-10 14:15 ` [ruby-core:107553] " Eregon (Benoit Daloze)
                   ` (33 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-10  9:11 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


> though changing encoding.name may hit the compatibility issue.

I personally don't think it's much of a concern, but if it is, then a possible alternative would be to only change `Encoding::ASCII_8BIT.inspect` so that it shows up as `BINARY` in `EncodingError` and such, while `Encoding::ASCII_8BIT.name` stays unchanged.

Unless people think this would be even more confusing.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96462

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107553] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (11 preceding siblings ...)
  2022-02-10  9:11 ` [ruby-core:107550] " byroot (Jean Boussier)
@ 2022-02-10 14:15 ` Eregon (Benoit Daloze)
  2022-02-17  9:14 ` [ruby-core:107619] " matz (Yukihiro Matsumoto)
                   ` (32 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-02-10 14:15 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


byroot (Jean Boussier) wrote in #note-12:
> > though changing encoding.name may hit the compatibility issue.
> 
> I personally don't think it's much of a concern

I agree, this sounds very unlikely to cause compatibility issues, and if it does it would be extremely rare.
I believe the vast majority of programs simply don't rely on `Encoding#name` values.
(and of course `Encoding.find(name)` would still work for both `"binary"` & `"ascii-8bit"`)
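
A short sketch of the lookup behaviour mentioned here, using only `Encoding.find` and `Encoding#names` from core Ruby:

```ruby
by_alias = Encoding.find("binary")
by_name  = Encoding.find("ascii-8bit")

# Both lookups are case-insensitive and resolve to the same Encoding object.
puts by_alias.equal?(by_name)
puts by_alias.equal?(Encoding::BINARY)

# The primary name and its aliases are all listed by #names.
puts Encoding::BINARY.names.inspect
```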



----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96465

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107619] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (12 preceding siblings ...)
  2022-02-10 14:15 ` [ruby-core:107553] " Eregon (Benoit Daloze)
@ 2022-02-17  9:14 ` matz (Yukihiro Matsumoto)
  2022-02-17  9:16 ` [ruby-core:107620] " byroot (Jean Boussier)
                   ` (31 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-02-17  9:14 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by matz (Yukihiro Matsumoto).

Status changed from Open to Rejected

I don't object to the proposal itself. But as @ko1 searched, there are so many gems that compare `Encoding#name` and `ASCII-8BIT`.
So I don't accept the proposal for the sake of compatibility.

Matz.
 

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96530

* Author: byroot (Jean Boussier)
* Status: Rejected
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107620] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (13 preceding siblings ...)
  2022-02-17  9:14 ` [ruby-core:107619] " matz (Yukihiro Matsumoto)
@ 2022-02-17  9:16 ` byroot (Jean Boussier)
  2022-02-17  9:24 ` [ruby-core:107621] " matz (Yukihiro Matsumoto)
                   ` (30 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-17  9:16 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


Can I make a counter proposal?

We could keep `Encoding#name` as `"ASCII-8BIT"`, but change `Encoding#inspect` and make sure `EncodingError` uses the `BINARY` name in its error messages.

What do you think?
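
To make the distinction concrete, here is a small sketch of the places where the name surfaces. The `#inspect` output is deliberately not shown as a fixed value, since it is the part under discussion and may differ between Ruby versions.

```ruby
enc = Encoding::ASCII_8BIT

# One object, reachable through two constants.
puts enc.equal?(Encoding::BINARY)

# #name and #to_s return the same string, so interpolation follows #name.
puts enc.name
puts "encoding: #{enc}"

# #inspect is a separate method, which is what the counter-proposal targets.
puts enc.inspect
```

Changing only `#inspect` would leave `#name`, `#to_s` and string interpolation reporting the old name.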

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96531

* Author: byroot (Jean Boussier)
* Status: Rejected
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107621] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (14 preceding siblings ...)
  2022-02-17  9:16 ` [ruby-core:107620] " byroot (Jean Boussier)
@ 2022-02-17  9:24 ` matz (Yukihiro Matsumoto)
  2022-02-17  9:27 ` [ruby-core:107622] " byroot (Jean Boussier)
                   ` (29 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-02-17  9:24 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by matz (Yukihiro Matsumoto).


Does this counter-proposal solve the original problem?
It seems it introduces another inconsistency (and possible confusion).

Matz.


----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96532

* Author: byroot (Jean Boussier)
* Status: Rejected
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107622] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (15 preceding siblings ...)
  2022-02-17  9:24 ` [ruby-core:107621] " matz (Yukihiro Matsumoto)
@ 2022-02-17  9:27 ` byroot (Jean Boussier)
  2022-02-17 13:30 ` [ruby-core:107634] " Eregon (Benoit Daloze)
                   ` (28 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-17  9:27 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


> Does this counter-proposal solve the original problem?

I believe so because the main way users are exposed to `ASCII-8BIT` is through `EncodingError`.

> It seems it introduces another inconsistency (and possible confusion).

Indeed, my personal belief is that `Encoding#name` is both an advanced API and one that you don't really want to use. So I think the few users that would encounter this inconsistency would have the background to not be tricked by it.

But ultimately this is your call.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96533

* Author: byroot (Jean Boussier)
* Status: Rejected
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107634] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (16 preceding siblings ...)
  2022-02-17  9:27 ` [ruby-core:107622] " byroot (Jean Boussier)
@ 2022-02-17 13:30 ` Eregon (Benoit Daloze)
  2022-02-17 13:58 ` [ruby-core:107636] " matz (Yukihiro Matsumoto)
                   ` (27 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-02-17 13:30 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


Link to the gem-codesearch results from @ko1: https://hackmd.io/koJLPz4eRXKzaaDvVqji7w#Feature-18576-Rename-ASCII-8BIT-encoding-to-BINARY-byroot

This seems like very few usages, and IMHO such gems should be fixed (if they are still used, which is probably not the case for most).
It's only 71 gems: https://gist.github.com/eregon/2b5de829d9aeb8b91b551fa05677b4db#file-gem-names

`str.encoding.name == "ASCII-8BIT"` is also needlessly slow and brittle.
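
A minimal sketch of the alternative, assuming a hypothetical helper named `binary?` (not part of any gem mentioned here):

```ruby
# Compares Encoding objects directly instead of comparing name strings,
# so it does not depend on what Encoding#name returns.
def binary?(str)
  str.encoding == Encoding::BINARY
end

puts binary?("\xFF".b)
puts binary?(String.new("abc", encoding: Encoding::UTF_8))
```

Because `Encoding::BINARY` and `Encoding::ASCII_8BIT` refer to the same object, this check behaves the same whichever name is treated as primary.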

It seems many matches are about old versions of rack/lint.rb and that's already fixed since https://github.com/rack/rack/pull/982.
nokogiri still uses it but that could be easily fixed: https://github.com/sparklemotion/nokogiri/blob/e324a91477fe3b199c95b52c3985647dd2aeb847/lib/nokogiri/html5/document.rb#L33

IMHO from a compatibility perspective it would be fair enough to change the Encoding#name too.
But I guess others will disagree, so I believe @byroot's proposal is still a big step forward (i.e. adding `def Encoding::BINARY.name; 'ASCII-8BIT'; end` or so for compatibility).

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96545

* Author: byroot (Jean Boussier)
* Status: Rejected
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107636] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (17 preceding siblings ...)
  2022-02-17 13:30 ` [ruby-core:107634] " Eregon (Benoit Daloze)
@ 2022-02-17 13:58 ` matz (Yukihiro Matsumoto)
  2022-02-17 14:00 ` [ruby-core:107637] " byroot (Jean Boussier)
                   ` (26 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-02-17 13:58 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by matz (Yukihiro Matsumoto).

Status changed from Rejected to Open

Making `Encoding#name` return a name different from the encoding name is unacceptable.
Besides that, in general, compatibility issues are hard to estimate beforehand, so we tend to be very conservative.
If you (or someone) estimate that the compatibility issue is minimal, and want to experiment during pre-release to see if that's true, I'd say go.
Will you?

Matz.


----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96548

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107637] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (18 preceding siblings ...)
  2022-02-17 13:58 ` [ruby-core:107636] " matz (Yukihiro Matsumoto)
@ 2022-02-17 14:00 ` byroot (Jean Boussier)
  2022-02-17 15:34 ` [ruby-core:107640] " byroot (Jean Boussier)
                   ` (25 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-17 14:00 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


> Will you?

I'd like to champion this. I already started opening pull requests on the affected gems.



----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96549

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107640] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (19 preceding siblings ...)
  2022-02-17 14:00 ` [ruby-core:107637] " byroot (Jean Boussier)
@ 2022-02-17 15:34 ` byroot (Jean Boussier)
  2022-02-19 10:59 ` [ruby-core:107666] " byroot (Jean Boussier)
                   ` (24 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-17 15:34 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


Ok, so I went over all 71 matches after filtering vendored code: https://gist.github.com/casperisfine/5a26c7b85f7d15c4acd63d62d67eafbb

I opened 31 pull requests, all of which were trivial changes `str.encoding.name == ""` -> `str.encoding == Encoding::BINARY`, with the notable exception of `vcr` because it stores the encoding names in files.

The vast majority of the matches are abandoned gems with no update since 2013 or older (I still opened PRs when I could). Some are even just old versions of `rack` republished under another name.

The few high-profile gems impacted are:

  - Nokogiri: patch sent
  - VCR: patch sent
  - mongo: patch sent

That being said, it's impossible to measure how much proprietary code may use the same pattern.
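
For code that persists encoding names, one way to stay independent of the primary name is to resolve the stored string with `Encoding.find` on load. The sketch below is hypothetical: `dump_body`, `load_body` and the record layout are invented for illustration and are not VCR's actual format or code.

```ruby
# Serialize a string together with its encoding name.
def dump_body(str)
  { "encoding" => str.encoding.name, "base64" => [str].pack("m0") }
end

# Restore the string; Encoding.find accepts the primary name or any alias.
def load_body(record)
  bytes = record["base64"].unpack1("m0")
  bytes.force_encoding(Encoding.find(record["encoding"]))
end

record   = dump_body("\xFF\x00".b)
restored = load_body(record)
aliased  = load_body(record.merge("encoding" => "BINARY"))
puts restored.encoding == aliased.encoding
```

Files written with either name then load to the same `Encoding` object, although the stored string itself would still differ between versions that report different names.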


----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96552

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107666] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (20 preceding siblings ...)
  2022-02-17 15:34 ` [ruby-core:107640] " byroot (Jean Boussier)
@ 2022-02-19 10:59 ` byroot (Jean Boussier)
  2022-02-21  8:23 ` [ruby-core:107680] " byroot (Jean Boussier)
                   ` (23 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-19 10:59 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


I prepared the patch for this: https://github.com/ruby/ruby/pull/5571

If there are no objections, I'd like to merge it so it's part of the upcoming 3.2.0-preview1.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96584

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107680] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (21 preceding siblings ...)
  2022-02-19 10:59 ` [ruby-core:107666] " byroot (Jean Boussier)
@ 2022-02-21  8:23 ` byroot (Jean Boussier)
  2022-03-17  9:03 ` [ruby-core:107943] " matz (Yukihiro Matsumoto)
                   ` (22 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) @ 2022-02-21  8:23 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by byroot (Jean Boussier).


@matz could you confirm you are OK to merge the `ASCII-8BIT -> BINARY` rename for 3.2.0-preview1?

I think the earlier this happens, the more likely it is to go well. So far, all the PRs I made to gems have been received very positively.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96598

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107943] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (22 preceding siblings ...)
  2022-02-21  8:23 ` [ruby-core:107680] " byroot (Jean Boussier)
@ 2022-03-17  9:03 ` matz (Yukihiro Matsumoto)
  2022-03-17 11:08 ` [ruby-core:107944] " Eregon (Benoit Daloze)
                   ` (21 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-03-17  9:03 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by matz (Yukihiro Matsumoto).


The compatibility risk has been reduced thanks to @byroot's effort, but there are probably still many applications potentially affected by the change. Considering the benefit (being slightly more descriptive) and the risk (incompatibility), I don't think it pays off.

Matz.




----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96893

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107944] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (23 preceding siblings ...)
  2022-03-17  9:03 ` [ruby-core:107943] " matz (Yukihiro Matsumoto)
@ 2022-03-17 11:08 ` Eregon (Benoit Daloze)
  2022-03-17 15:06 ` [ruby-core:107956] " larskanis (Lars Kanis)
                   ` (20 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) @ 2022-03-17 11:08 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by Eregon (Benoit Daloze).


I think it's worth changing: the current name is confusing to most Ruby users, only 71 gems out of 170000+ were affected, and those gems were patched.
It seems equally unlikely that many applications would depend on `enc.name == "ASCII-8BIT"`, and that those applications would update to the latest Ruby.
If we don't change it now, we will probably never change it and will be stuck with that confusing name forever, which seems really bad for Ruby's future.

@matz How about we try it (as experimental or so) before the preview, and based on feedback keep it or revert it?
From your comment in #19 I thought that's what you offered.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96894

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:107956] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (24 preceding siblings ...)
  2022-03-17 11:08 ` [ruby-core:107944] " Eregon (Benoit Daloze)
@ 2022-03-17 15:06 ` larskanis (Lars Kanis)
  2023-12-06 12:36 ` [ruby-core:115604] " Eregon (Benoit Daloze) via ruby-core
                   ` (19 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: larskanis (Lars Kanis) @ 2022-03-17 15:06 UTC (permalink / raw)
  To: ruby-core

Issue #18576 has been updated by larskanis (Lars Kanis).


Having solved a lot of encoding issues for co-workers, especially on Windows, I'm with @Eregon. Since Ruby aims to be the programmer's best friend, I think it's worth trying out this minor incompatibility, at least compared to something like the [removal of rb_cData](https://github.com/ruby/ruby/commit/7c738ce5e649b82bdc1305d5c347e81886ee759a), which breaks lots of older gems just to clean up the C API (after 2 years of deprecation warnings).


----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96906

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:115604] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (25 preceding siblings ...)
  2022-03-17 15:06 ` [ruby-core:107956] " larskanis (Lars Kanis)
@ 2023-12-06 12:36 ` Eregon (Benoit Daloze) via ruby-core
  2023-12-20  8:44 ` [ruby-core:115813] " naruse (Yui NARUSE) via ruby-core
                   ` (18 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2023-12-06 12:36 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).



Target version set to 3.4

@matz Could we try this again for 3.4, soon after the 3.3 release?

Then there is plenty of time to discover any issue related to it (probably very few as gems have been patched, and applications using encoding names instead of encoding constants are likely very old and unlikely to use a recent Ruby).

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-105535

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 

https://bugs.ruby-lang.org/

 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:115813] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (26 preceding siblings ...)
  2023-12-06 12:36 ` [ruby-core:115604] " Eregon (Benoit Daloze) via ruby-core
@ 2023-12-20  8:44 ` naruse (Yui NARUSE) via ruby-core
  2024-01-11 10:26 ` [ruby-core:116170] " Eregon (Benoit Daloze) via ruby-core
                   ` (17 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: naruse (Yui NARUSE) via ruby-core @ 2023-12-20  8:44 UTC (permalink / raw)
  To: ruby-core; +Cc: naruse (Yui NARUSE)

Issue #18576 has been updated by naruse (Yui NARUSE).





I strongly object to changing Encoding#name of the ASCII-8BIT encoding to "BINARY", because of compatibility.
I don't want people to have to fix code that is running correctly now.

However, supporting people who are writing new code is reasonable.
I agree with adding more information to Encoding#inspect and to the error messages.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-105762

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116170] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (27 preceding siblings ...)
  2023-12-20  8:44 ` [ruby-core:115813] " naruse (Yui NARUSE) via ruby-core
@ 2024-01-11 10:26 ` Eregon (Benoit Daloze) via ruby-core
  2024-01-11 10:30 ` [ruby-core:116172] " Eregon (Benoit Daloze) via ruby-core
                   ` (16 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-01-11 10:26 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).





@naruse Do you have evidence of gems or applications (latest releases, not ancient ones) comparing `encoding.name` to `'ASCII-8BIT'`?
Comparing the encoding name as a String is so obviously a bad idea: AFAIK there was never a valid reason to do it (over `enc == Encoding::BINARY`, which works since Ruby 1.9), and it's inefficient, brittle and unnecessary.
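To illustrate the difference, here is a minimal sketch of the constant-based check. The helper name `binary_string?` is hypothetical and only used for this example; it is not an API from Ruby or from any gem mentioned in the thread.

```ruby
# Hypothetical helper: detect binary strings without depending on the
# encoding's display name.
def binary_string?(str)
  str.encoding == Encoding::BINARY
end

data = "\xFF\x00".b  # String#b returns a copy tagged as binary
text = "f\u00E9e"    # "fée"; the \u escape makes the literal UTF-8

puts binary_string?(data)
puts binary_string?(text)

# Brittle alternative, which would break if Encoding#name ever changed:
#   data.encoding.name == "ASCII-8BIT"
```

Because the check compares Encoding objects rather than strings, it should keep working whichever name the encoding ends up reporting.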



FWIW https://github.com/search?q=%22name+%3D%3D+%27ASCII-8BIT%27%22&type=code&p=1 shows very few matches, and they are mostly copies of old VCR code.
The chance of that code running on Ruby 3.4+ seems almost nonexistent; there would likely be many more serious compatibility issues with such old code (e.g. kwargs changes).
And fixing it is really easy.

@matz Can we experiment for 3.4?
If we get pushback based on actual code then let's be more conservative, but otherwise I think we should do the clean fix here.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106183

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116172] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (28 preceding siblings ...)
  2024-01-11 10:26 ` [ruby-core:116170] " Eregon (Benoit Daloze) via ruby-core
@ 2024-01-11 10:30 ` Eregon (Benoit Daloze) via ruby-core
  2024-01-11 10:35 ` [ruby-core:116173] " byroot (Jean Boussier) via ruby-core
                   ` (15 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-01-11 10:30 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).





Also, given the efforts of @byroot in https://bugs.ruby-lang.org/issues/18576#note-21 and the offer from @matz in https://bugs.ruby-lang.org/issues/18576#note-19, I'd like to do exactly what matz said:
> If you (or someone) estimate the compatibility issue is minimal, and want to experiment to see if it's true during pre-release, I'd say go.

I estimate it to be minimal.
The experiment will tell us whether that's true; there are more than 11 months before 3.4, so plenty of time to discover any potential issue with it.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106185

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116173] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (29 preceding siblings ...)
  2024-01-11 10:30 ` [ruby-core:116172] " Eregon (Benoit Daloze) via ruby-core
@ 2024-01-11 10:35 ` byroot (Jean Boussier) via ruby-core
  2024-01-17  8:26 ` [ruby-core:116266] " naruse (Yui NARUSE) via ruby-core
                   ` (14 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-01-11 10:35 UTC (permalink / raw)
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #18576 has been updated by byroot (Jean Boussier).





I would also like to try this again for 3.4. If we do it early, any remaining issues will have a chance to be noticed with the first preview release.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106186

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116266] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (30 preceding siblings ...)
  2024-01-11 10:35 ` [ruby-core:116173] " byroot (Jean Boussier) via ruby-core
@ 2024-01-17  8:26 ` naruse (Yui NARUSE) via ruby-core
  2024-01-17  8:36 ` [ruby-core:116268] " byroot (Jean Boussier) via ruby-core
                   ` (13 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: naruse (Yui NARUSE) via ruby-core @ 2024-01-17  8:26 UTC (permalink / raw)
  To: ruby-core; +Cc: naruse (Yui NARUSE)

Issue #18576 has been updated by naruse (Yui NARUSE).





Even if you "fix" the gems, the number of affected gems implies that as many private Rails applications are affected.
Such incompatibility is not acceptable.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106288

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116268] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (31 preceding siblings ...)
  2024-01-17  8:26 ` [ruby-core:116266] " naruse (Yui NARUSE) via ruby-core
@ 2024-01-17  8:36 ` byroot (Jean Boussier) via ruby-core
  2024-01-17  9:19 ` [ruby-core:116269] " zverok (Victor Shepelev) via ruby-core
                   ` (12 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-01-17  8:36 UTC (permalink / raw)
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #18576 has been updated by byroot (Jean Boussier).





@naruse no one is denying that there is private code out there that will be broken by such a change. The question is how much, how hard it would be to detect and fix, and how much the change improves Ruby for its users.

We regularly make changes with much more breaking potential, so that alone isn't a reason to refuse the change, in my opinion.

But if there is consensus that the cost/benefit isn't positive, then I'd like to propose again:

> We could keep Encoding#name as "ASCII-8BIT", but change Encoding#inspect and make sure EncodingError use the BINARY name in its error messages.

But slightly modified:

I'd like to change `Encoding::BINARY.inspect` from `"#<Encoding:ASCII-8BIT>"` to `"#<Encoding:ASCII-8BIT (BINARY)>"`.
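For context, a small sketch of what code can already rely on regardless of how `inspect` is worded: both constants name the same object, and both spellings resolve through `Encoding.find`. The local variable names are only illustrative, and the exact `inspect` output is deliberately not asserted, since it depends on the Ruby version.

```ruby
binary = Encoding::BINARY

# Both constants refer to the very same Encoding object.
same_object = binary.equal?(Encoding::ASCII_8BIT)

# Encoding#names lists the canonical name together with its aliases.
aliases = binary.names

# Encoding.find resolves either spelling to that same object.
found = Encoding.find("BINARY")

puts same_object
puts aliases.inspect
puts found.equal?(binary)

# The inspect string is what the proposal would change, so it is only
# printed here, never compared against.
puts binary.inspect
```

Code that sticks to the constants or to `Encoding.find` should be unaffected by a change limited to `inspect` and error messages.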



Would that be acceptable?

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106290

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------

-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116269] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (32 preceding siblings ...)
  2024-01-17  8:36 ` [ruby-core:116268] " byroot (Jean Boussier) via ruby-core
@ 2024-01-17  9:19 ` zverok (Victor Shepelev) via ruby-core
  2024-01-18  0:58 ` [ruby-core:116280] " Dan0042 (Daniel DeLorme) via ruby-core
                   ` (11 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: zverok (Victor Shepelev) via ruby-core @ 2024-01-17  9:19 UTC (permalink / raw)
  To: ruby-core; +Cc: zverok (Victor Shepelev)

Issue #18576 has been updated by zverok (Victor Shepelev).


> Such incompatibility is not acceptable.

In all honesty, a selective application of this dogma doesn’t always look justified.
For better or worse, we break compatibility constantly.

One of the recent telling examples was the removal of `File.exists?` (an alias of `.exist?`), which, while "deprecated a long time ago," actually 
* broke a lot of gems/other software (because even with the "typically we have bare words as predicates" rule, it was more natural for people to write `exists?`, so while it was available, a _lot_ of code was using it); 
* improved absolutely nothing in Ruby’s friendliness and learnability save for "removed a reason to ask for `String#starts_with?` and similar methods" (while, say, Rails continues to prefer third-person verbs in its core extensions, like `String#starts_with?` or `Range#overlaps?`)

OTOH, renaming the unfortunately named encoding:
* makes Ruby friendlier (as a mentor, I saw a _lot_ of people confused with `ASCII-8BIT`),
* breaks not a lot of code: while fixing gems wouldn't fix _all_ of its usages, the (minuscule) amount of gems to fix gives a good estimation of how frequently this might be a problem,
* breaks code that is mostly written in the "unexpected" way, so rethinking it might be a good idea anyway.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106291

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------
### Context

I'm now used to it, but something that confused me for years was errors such as:

```ruby
>> "fée" + "\xFF".b
(irb):3:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
```

When you aren't that familiar with Ruby, it's really not evident that `ASCII-8BIT` basically means "no encoding" or "binary".

And even when you know it, if you don't read carefully it's very easily confused with `US-ASCII`.

The `Encoding::BINARY` alias is much more telling IMHO.

### Proposal

Since `Encoding::ASCII_8BIT` has been aliased as `Encoding::BINARY` for years, I think renaming it to `BINARY` and then making `ASCII_8BIT` the alias would significantly improve usability without backward compatibility concerns.

The only concern I could see would be the consistency with a handful of C API functions:

  - `rb_encoding *rb_ascii8bit_encoding(void)`
  - `int rb_ascii8bit_encindex(void)`
  - `VALUE rb_io_ascii8bit_binmode(VALUE io)`

But that's for much more advanced users, so I don't think it's much of a concern.





^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116280] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (33 preceding siblings ...)
  2024-01-17  9:19 ` [ruby-core:116269] " zverok (Victor Shepelev) via ruby-core
@ 2024-01-18  0:58 ` Dan0042 (Daniel DeLorme) via ruby-core
  2024-01-18 15:19 ` [ruby-core:116298] " Eregon (Benoit Daloze) via ruby-core
                   ` (10 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Dan0042 (Daniel DeLorme) via ruby-core @ 2024-01-18  0:58 UTC (permalink / raw)
  To: ruby-core; +Cc: Dan0042 (Daniel DeLorme)

Issue #18576 has been updated by Dan0042 (Daniel DeLorme).





tenderlovemaking (Aaron Patterson) wrote in #note-7:
> I think this example should raise an exception:
> 
> ```ruby
> u = (b = "abcde".force_encoding('ASCII-8BIT')).encode('UTF-8')
> ```



I'm worried about the above misconception. No, this example shouldn't raise an exception, because being ascii-compatible is the entire reason there's "ASCII" in "ASCII-8BIT". If even @tenderlovemaking can have this misconception, I would wager it's a fairly common one. And if the encoding were renamed to "BINARY", it would further encourage the misconception. We'd wind up with a kind of Frankenstein encoding that pretends to be true binary by its name but has the behavior of ascii-compatible encodings. Several people in this thread currently agree that the ascii-compatible behavior should not change, but if the name were changed I can easily predict that some people would call for a change in behavior because the name "binary" has that overtone.
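
To make the ascii-compatible behavior concrete, here is a minimal sketch (the helper name `transcode_to_utf8` is hypothetical, introduced only for illustration). It transcodes a binary-tagged string to UTF-8 and reports the exception class when transcoding fails:

```ruby
# Hypothetical helper: try to transcode to UTF-8 and return either the
# new string or the class of the conversion error that was raised.
def transcode_to_utf8(bytes)
  bytes.encode(Encoding::UTF_8)
rescue Encoding::UndefinedConversionError => e
  e.class
end

# Every byte is below 0x80, so the string is ASCII-only even though
# it is tagged as ASCII-8BIT / BINARY.
ascii_bytes = "abcde".b

# A single byte outside the ASCII range, also tagged as binary.
high_byte = "\xFF".b

puts Encoding::BINARY.ascii_compatible?
puts transcode_to_utf8(ascii_bytes).inspect
puts transcode_to_utf8(high_byte).inspect
```

The ASCII-only string transcodes without complaint, while the string holding a high byte fails, which is the behavior the paragraph above argues should be preserved.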



zverok (Victor Shepelev) wrote in #note-34:
> For better or worse, we break compatibility constantly.
> One of the recent telling examples was the removal of `File.exists?`



I won't say we can never break compatibility, but there's a very big qualitative difference here. If you run into `File.exists?`, the program simply crashes with NoMethodError. If you run into `enc.name == "ASCII-8BIT"` the return value changes from true to false; the program may crash later or not, the bug can remain undetected for a long time, there's a potential for corrupted data. This is 2-3 orders of magnitude harder to debug than NoMethodError. Even if not many people are affected by this, it's a very nasty kind of incompatibility.
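
The kind of check at risk can be sketched as follows (both method names are hypothetical, introduced only for illustration). The first compares the string returned by `Encoding#name` and would silently start returning `false` if that string changed; the second compares `Encoding` objects and does not depend on the name at all:

```ruby
# Fragile: depends on the exact string returned by Encoding#name.
# A rename would turn this into a silent `false`, not an exception.
def binary_by_name?(str)
  str.encoding.name == "ASCII-8BIT"
end

# Sturdier: compares Encoding objects, so it behaves the same
# whichever name the encoding reports.
def binary_by_constant?(str)
  str.encoding == Encoding::BINARY
end

payload = "\xFF\x00".b
puts binary_by_name?(payload)
puts binary_by_constant?(payload)
```

Code already written in the second style would not notice a rename; the concern raised above is about code written in the first style.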



byroot (Jean Boussier) wrote in #note-15:

> We could keep `Encoding#name` as `"ASCII-8BIT"`, but change `Encoding#inspect` and make sure `EncodingError` use the `BINARY` name in its error messages.



I would really like that.



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106302




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116298] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (34 preceding siblings ...)
  2024-01-18  0:58 ` [ruby-core:116280] " Dan0042 (Daniel DeLorme) via ruby-core
@ 2024-01-18 15:19 ` Eregon (Benoit Daloze) via ruby-core
  2024-01-21  9:46 ` [ruby-core:116355] " byroot (Jean Boussier) via ruby-core
                   ` (9 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-01-18 15:19 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).





I think everyone's opinion on the thread is pretty clear and not everyone agrees, that's fine.



@matz Could you decide whether it's OK to experiment with the Encoding#name changing to "BINARY" or not?

If not, is @byroot's proposal in https://bugs.ruby-lang.org/issues/18576#note-33 accepted?



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106320




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116355] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (35 preceding siblings ...)
  2024-01-18 15:19 ` [ruby-core:116298] " Eregon (Benoit Daloze) via ruby-core
@ 2024-01-21  9:46 ` byroot (Jean Boussier) via ruby-core
  2024-01-22 10:15 ` [ruby-core:116363] " Eregon (Benoit Daloze) via ruby-core
                   ` (8 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-01-21  9:46 UTC (permalink / raw)
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #18576 has been updated by byroot (Jean Boussier).





> @byroot's proposal



To clarify, what I'm proposing if the rename is not acceptable is:

```ruby
>> Encoding::BINARY
=> #<Encoding:ASCII-8BIT>
```

becomes:

```ruby
>> Encoding::BINARY
=> #<Encoding:ASCII-8BIT (BINARY)>
```

And:

```ruby
>> "fée" + "fée".b
(irb):8:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
```

Becomes:

```ruby
>> "fée" + "fée".b
(irb):8:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (BINARY) (Encoding::CompatibilityError)
```
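
For readers who want a similar label on a Ruby that does not have such a change, a userland approximation might look like the sketch below. This is not the proposed patch: `encoding_label` is a hypothetical helper, and it only decorates the name for display while leaving `Encoding#name` untouched.

```ruby
# Hypothetical helper: build a display label for an encoding, appending
# "(BINARY)" to the binary encoding's name. Display only; the value
# returned by Encoding#name is not changed.
def encoding_label(enc)
  enc == Encoding::BINARY ? "#{enc.name} (BINARY)" : enc.name
end

puts encoding_label(Encoding::UTF_8)
puts encoding_label("\xFF".b.encoding)
```

Because only the label changes, comparisons against `Encoding#name` elsewhere in a program keep their current results.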







----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106378




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116363] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (36 preceding siblings ...)
  2024-01-21  9:46 ` [ruby-core:116355] " byroot (Jean Boussier) via ruby-core
@ 2024-01-22 10:15 ` Eregon (Benoit Daloze) via ruby-core
  2024-01-24  6:47 ` [ruby-core:116393] " shyouhei (Shyouhei Urabe) via ruby-core
                   ` (7 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-01-22 10:15 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).





I think for that last example, omitting `ASCII-8BIT` would be much clearer; two sets of parens also seem too much.
So:
```
(irb):8:in `+': incompatible character encodings: UTF-8 and BINARY (Encoding::CompatibilityError)
```
Otherwise we would likely still have the confusion that "ASCII" is not compatible with UTF-8 (which is untrue of course).



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106383




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116393] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (37 preceding siblings ...)
  2024-01-22 10:15 ` [ruby-core:116363] " Eregon (Benoit Daloze) via ruby-core
@ 2024-01-24  6:47 ` shyouhei (Shyouhei Urabe) via ruby-core
  2024-02-14  9:32 ` [ruby-core:116738] " naruse (Yui NARUSE) via ruby-core
                   ` (6 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: shyouhei (Shyouhei Urabe) via ruby-core @ 2024-01-24  6:47 UTC (permalink / raw)
  To: ruby-core; +Cc: shyouhei (Shyouhei Urabe)

Issue #18576 has been updated by shyouhei (Shyouhei Urabe).





@naruse is actually positive about changing error messages (see #note-28).  I guess everybody here agrees with @byroot's list of proposed changes in #note-37 (except wording)?





----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106416




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116738] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (38 preceding siblings ...)
  2024-01-24  6:47 ` [ruby-core:116393] " shyouhei (Shyouhei Urabe) via ruby-core
@ 2024-02-14  9:32 ` naruse (Yui NARUSE) via ruby-core
  2024-02-19 12:38 ` [ruby-core:116845] " byroot (Jean Boussier) via ruby-core
                   ` (5 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: naruse (Yui NARUSE) via ruby-core @ 2024-02-14  9:32 UTC (permalink / raw)
  To: ruby-core; +Cc: naruse (Yui NARUSE)

Issue #18576 has been updated by naruse (Yui NARUSE).





byroot (Jean Boussier) wrote in #note-33:
> I'd like to change `Encoding::BINARY.inspect` from `"#<Encoding:ASCII-8BIT>"` to `"#<Encoding:ASCII-8BIT (BINARY)>"`.
> 
> Would that be acceptable?

I agree with the idea.



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106757




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116845] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (39 preceding siblings ...)
  2024-02-14  9:32 ` [ruby-core:116738] " naruse (Yui NARUSE) via ruby-core
@ 2024-02-19 12:38 ` byroot (Jean Boussier) via ruby-core
  2024-02-19 23:02 ` [ruby-core:116855] " Dan0042 (Daniel DeLorme) via ruby-core
                   ` (4 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-02-19 12:38 UTC (permalink / raw)
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #18576 has been updated by byroot (Jean Boussier).





Proposed patch: https://github.com/ruby/ruby/pull/10018



I used my initial suggestion: `ASCII-8BIT (BINARY)`, but if the parentheses are deemed too much, I'm happy to adjust.



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106871




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116855] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (40 preceding siblings ...)
  2024-02-19 12:38 ` [ruby-core:116845] " byroot (Jean Boussier) via ruby-core
@ 2024-02-19 23:02 ` Dan0042 (Daniel DeLorme) via ruby-core
  2024-02-19 23:23 ` [ruby-core:116857] " duerst via ruby-core
                   ` (3 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: Dan0042 (Daniel DeLorme) via ruby-core @ 2024-02-19 23:02 UTC (permalink / raw)
  To: ruby-core; +Cc: Dan0042 (Daniel DeLorme)

Issue #18576 has been updated by Dan0042 (Daniel DeLorme).





I've come to realize something; when an ASCII-8BIT string contains only ascii characters, it behaves exactly like a US-ASCII string and in such a case it feels unnatural to call it "binary" (at least for me). But as soon as there is a non-ascii byte, it becomes incompatible with every other encoding and then truly deserves to be called BINARY. And that's when encoding errors occur. So in error messages, "BINARY" makes perfect sense to me since the error occurs due to the string being in "binary" state rather than "ascii-only" state. The distinction may be irrelevant to others but at least it has helped me put into words and understand why it felt so uncomfortable to change the name to "BINARY". Just my 2¢
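
The two states described above can be observed directly. The following is a small sketch (the helper name `concat_outcome` is hypothetical) that concatenates a UTF-8 string with binary-tagged strings and reports either the result or the exception class:

```ruby
# Hypothetical helper: concatenate two strings and return the result,
# or the exception class if their encodings are incompatible.
def concat_outcome(left, right)
  left + right
rescue Encoding::CompatibilityError => e
  e.class
end

utf8        = "f\u00E9e"   # UTF-8 text containing a non-ASCII character
ascii_bytes = "abc".b      # binary-tagged, but ASCII-only ("ascii-only" state)
high_bytes  = "\xFF".b     # binary-tagged with a non-ASCII byte ("binary" state)

puts ascii_bytes.ascii_only?
puts high_bytes.ascii_only?

# Encoding.compatible? returns the encoding a combination would use,
# or nil when the two strings cannot be combined.
puts Encoding.compatible?(utf8, ascii_bytes).inspect
puts Encoding.compatible?(utf8, high_bytes).inspect

puts concat_outcome(utf8, ascii_bytes).inspect
puts concat_outcome(utf8, high_bytes).inspect
```

Only the string containing a non-ASCII byte triggers `Encoding::CompatibilityError`, which matches the observation that the error appears exactly when the string is in its "binary" state.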



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106887




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116857] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (41 preceding siblings ...)
  2024-02-19 23:02 ` [ruby-core:116855] " Dan0042 (Daniel DeLorme) via ruby-core
@ 2024-02-19 23:23 ` duerst via ruby-core
  2024-02-20  7:51 ` [ruby-core:116868] " byroot (Jean Boussier) via ruby-core
                   ` (2 subsequent siblings)
  45 siblings, 0 replies; 47+ messages in thread
From: duerst via ruby-core @ 2024-02-19 23:23 UTC (permalink / raw)
  To: ruby-core; +Cc: duerst

Issue #18576 has been updated by duerst (Martin Dürst).





What about

```
>> "fée" + "fée".b
(irb):8:in `+': incompatible character encodings: UTF-8 and BINARY (ASCII-8BIT) (Encoding::CompatibilityError)
```



This still leaves "ASCII-8BIT" in (because I think it's important to help people understand that BINARY and ASCII-8BIT are the same).



[It also keeps the wart of consecutive parentheticals, but that can be dealt with separately.]



----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106889




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116868] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (42 preceding siblings ...)
  2024-02-19 23:23 ` [ruby-core:116857] " duerst via ruby-core
@ 2024-02-20  7:51 ` byroot (Jean Boussier) via ruby-core
  2024-02-20 11:16 ` [ruby-core:116875] " Eregon (Benoit Daloze) via ruby-core
  2024-04-13 19:29 ` [ruby-core:117508] " alexander-s (Alexander S) via ruby-core
  45 siblings, 0 replies; 47+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-02-20  7:51 UTC (permalink / raw)
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #18576 has been updated by byroot (Jean Boussier).





```
>> "fée" + "fée".b
(irb):8:in `+': incompatible character encodings: UTF-8 and BINARY (ASCII-8BIT) (Encoding::CompatibilityError)
```

I don't mind `BINARY` being first or last. I'll adjust my PR.

As for the consecutive parentheses, what about:

```
>> "fée" + "fée".b
(irb):8:in `+': incompatible character encodings: UTF-8 and BINARY / ASCII-8BIT (Encoding::CompatibilityError)
```







----------------------------------------

Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`

https://bugs.ruby-lang.org/issues/18576#change-106903




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:116875] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (43 preceding siblings ...)
  2024-02-20  7:51 ` [ruby-core:116868] " byroot (Jean Boussier) via ruby-core
@ 2024-02-20 11:16 ` Eregon (Benoit Daloze) via ruby-core
  2024-04-13 19:29 ` [ruby-core:117508] " alexander-s (Alexander S) via ruby-core
  45 siblings, 0 replies; 47+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-02-20 11:16 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #18576 has been updated by Eregon (Benoit Daloze).

`BINARY (ASCII-8BIT)` seems a good compromise.

The `/` seems potentially confusing for:
`incompatible character encodings: BINARY / ASCII-8BIT and EUC-JP (Encoding::CompatibilityError)`.
So I think using parentheses is OK and clearer than `/`.



----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-106910

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Target version: 3.4
----------------------------------------


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [ruby-core:117508] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`
  2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
                   ` (44 preceding siblings ...)
  2024-02-20 11:16 ` [ruby-core:116875] " Eregon (Benoit Daloze) via ruby-core
@ 2024-04-13 19:29 ` alexander-s (Alexander S) via ruby-core
  45 siblings, 0 replies; 47+ messages in thread
From: alexander-s (Alexander S) via ruby-core @ 2024-04-13 19:29 UTC (permalink / raw)
  To: ruby-core; +Cc: alexander-s (Alexander S)

Issue #18576 has been updated by alexander-s (Alexander S).


matz (Yukihiro Matsumoto) wrote in #note-14:
> I don't object to the proposal itself. But as @ko1 searched, there are so many gems that compare `Encoding#name` and `ASCII-8BIT`.
> So I don't accept the proposal for the sake of compatibility.
> 
> Matz.
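
To make the compatibility concern quoted above concrete, here is a rough sketch of a name-based check next to one that compares `Encoding` objects. Both helper names are hypothetical and not taken from any particular gem; the point is only that the second form keeps working whatever `Encoding#name` returns.

```ruby
# Hypothetical helper: depends on the encoding's name, so it breaks if
# Encoding#name ever returns something other than "ASCII-8BIT".
def binary_by_name?(str)
  str.encoding.name == "ASCII-8BIT"
end

# Hypothetical helper: compares Encoding objects, so it does not depend
# on what the encoding is called.
def binary_by_object?(str)
  str.encoding == Encoding::BINARY
end

data = "\xFF".b
puts binary_by_object?(data)
puts binary_by_object?("abc")   # an ordinary literal, not a binary string

# Name lookup accepts either spelling and returns the same object.
puts Encoding.find("BINARY").equal?(Encoding.find("ASCII-8BIT"))
```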

I've been developing with Ruby for some 10+ years now, and overall I really like the language.

However, I also feel that Ruby sometimes seems too focused on being backwards compatible, to the point where it risks hurting the ecosystem. I think this thread is a good example, because it seems like such a small and benign change, yet it's taken several years and lots of back and forth, and in the end the proposed change wasn't even applied(!?).

At the same time, several parts of the standard library feel outdated (Net::HTTP for example), and others misplaced (OLE-automation anyone?). On the other hand, new "cool features" are sometimes introduced that I don't really see any value in. For example 'endless ranges' and 'single line end-less method definition'. In short, I share much of Bbatsov's (RuboCop author) sentiment from this article (https://metaredux.com/posts/2019/04/02/ruby-s-creed.html).

There is good progress too, I'll happily admit. A few examples that come to mind are 'keyword params', 'unifying Integer/Fixnum', 'UTF-8 encoding by default', the Prism parser and the focus on performance. All these seemed like sensible improvements, in alignment with development in other modern languages.

Others probably have much better ideas about what old stuff could be improved, but it could be for example:

- Remove or deprecate globals
- Update the Rubydoc system (many other languages have better documentation systems)
- Continue cleaning up the stdlib (some progress has been made in recent Ruby releases, which is good)
- Look at popular rules in RuboCop etc, for stuff that people are frequently disabling with linting, and consider deprecating them.
- Take it easy with new syntax, Ruby already has 'many ways to solve the same problem'. Something like end-less method definition seems like a pointless addition. On our team, we just disabled it with linting on day one.

To summarize, obviously backwards compatibility is important. But progress is inevitable and a language that doesn't develop at a reasonable pace will eventually stagnate and die. I don't think Ruby is there yet, but I'd hate to see it go down that path. I also think much of this can be managed with deprecation messages and the like.

----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-107895

* Author: byroot (Jean Boussier)
* Status: Open
* Target version: 3.4
----------------------------------------

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-04-13 19:29 UTC | newest]

Thread overview: 47+ messages
2022-02-08  9:08 [ruby-core:107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` byroot (Jean Boussier)
2022-02-08  9:21 ` [ruby-core:107515] " duerst
2022-02-08  9:28 ` [ruby-core:107516] " byroot (Jean Boussier)
2022-02-08  9:41 ` [ruby-core:107517] " naruse (Yui NARUSE)
2022-02-08 10:15 ` [ruby-core:107518] " Eregon (Benoit Daloze)
2022-02-08 10:21 ` [ruby-core:107519] " Eregon (Benoit Daloze)
2022-02-09  9:51 ` [ruby-core:107527] " naruse (Yui NARUSE)
2022-02-09 16:52 ` [ruby-core:107531] " tenderlovemaking (Aaron Patterson)
2022-02-09 17:34 ` [ruby-core:107532] " Eregon (Benoit Daloze)
2022-02-09 17:49 ` [ruby-core:107533] " jeremyevans0 (Jeremy Evans)
2022-02-09 23:35 ` [ruby-core:107537] " tenderlovemaking (Aaron Patterson)
2022-02-10  7:53 ` [ruby-core:107549] " duerst
2022-02-10  9:11 ` [ruby-core:107550] " byroot (Jean Boussier)
2022-02-10 14:15 ` [ruby-core:107553] " Eregon (Benoit Daloze)
2022-02-17  9:14 ` [ruby-core:107619] " matz (Yukihiro Matsumoto)
2022-02-17  9:16 ` [ruby-core:107620] " byroot (Jean Boussier)
2022-02-17  9:24 ` [ruby-core:107621] " matz (Yukihiro Matsumoto)
2022-02-17  9:27 ` [ruby-core:107622] " byroot (Jean Boussier)
2022-02-17 13:30 ` [ruby-core:107634] " Eregon (Benoit Daloze)
2022-02-17 13:58 ` [ruby-core:107636] " matz (Yukihiro Matsumoto)
2022-02-17 14:00 ` [ruby-core:107637] " byroot (Jean Boussier)
2022-02-17 15:34 ` [ruby-core:107640] " byroot (Jean Boussier)
2022-02-19 10:59 ` [ruby-core:107666] " byroot (Jean Boussier)
2022-02-21  8:23 ` [ruby-core:107680] " byroot (Jean Boussier)
2022-03-17  9:03 ` [ruby-core:107943] " matz (Yukihiro Matsumoto)
2022-03-17 11:08 ` [ruby-core:107944] " Eregon (Benoit Daloze)
2022-03-17 15:06 ` [ruby-core:107956] " larskanis (Lars Kanis)
2023-12-06 12:36 ` [ruby-core:115604] " Eregon (Benoit Daloze) via ruby-core
2023-12-20  8:44 ` [ruby-core:115813] " naruse (Yui NARUSE) via ruby-core
2024-01-11 10:26 ` [ruby-core:116170] " Eregon (Benoit Daloze) via ruby-core
2024-01-11 10:30 ` [ruby-core:116172] " Eregon (Benoit Daloze) via ruby-core
2024-01-11 10:35 ` [ruby-core:116173] " byroot (Jean Boussier) via ruby-core
2024-01-17  8:26 ` [ruby-core:116266] " naruse (Yui NARUSE) via ruby-core
2024-01-17  8:36 ` [ruby-core:116268] " byroot (Jean Boussier) via ruby-core
2024-01-17  9:19 ` [ruby-core:116269] " zverok (Victor Shepelev) via ruby-core
2024-01-18  0:58 ` [ruby-core:116280] " Dan0042 (Daniel DeLorme) via ruby-core
2024-01-18 15:19 ` [ruby-core:116298] " Eregon (Benoit Daloze) via ruby-core
2024-01-21  9:46 ` [ruby-core:116355] " byroot (Jean Boussier) via ruby-core
2024-01-22 10:15 ` [ruby-core:116363] " Eregon (Benoit Daloze) via ruby-core
2024-01-24  6:47 ` [ruby-core:116393] " shyouhei (Shyouhei Urabe) via ruby-core
2024-02-14  9:32 ` [ruby-core:116738] " naruse (Yui NARUSE) via ruby-core
2024-02-19 12:38 ` [ruby-core:116845] " byroot (Jean Boussier) via ruby-core
2024-02-19 23:02 ` [ruby-core:116855] " Dan0042 (Daniel DeLorme) via ruby-core
2024-02-19 23:23 ` [ruby-core:116857] " duerst via ruby-core
2024-02-20  7:51 ` [ruby-core:116868] " byroot (Jean Boussier) via ruby-core
2024-02-20 11:16 ` [ruby-core:116875] " Eregon (Benoit Daloze) via ruby-core
2024-04-13 19:29 ` [ruby-core:117508] " alexander-s (Alexander S) via ruby-core
