ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows
@ 2021-11-20 14:33 koleq
  2021-11-22  6:47 ` [ruby-core:106200] " nobu (Nobuyoshi Nakada)
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: koleq @ 2021-11-20 14:33 UTC (permalink / raw
  To: ruby-core

Issue #18353 has been reported by koleq (Ondřej Kurz).

----------------------------------------
Bug #18353: Czech keyboard input encoding on czech Windows
https://bugs.ruby-lang.org/issues/18353

* Author: koleq (Ondřej Kurz)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x64-mingw32]
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
Inputting Czech characters on Czech Windows does not work unless "`text.force_encoding("CP852")`" is used. I would expect this to work seamlessly, just like it does in Python.

This issue also does not happen in WSL (Windows Subsystem for Linux), where it just works without encoding issues.

To test, you can run this code, then copy the "`ěščřžýáíé`" and paste it;
you will see that the first print works just fine but your input does not.

I do not know if it's reproducible on other language versions of Windows.

**Ruby**

``` ruby
puts("ěščřžýáíé")
text = gets
# text.force_encoding("CP852") # this line fixes the input, but it is probably not the best solution if other Windows language versions use another code page.
puts(text)
```
output:
```
ěščřžýáíé
ěščřžýáíé
����젡�
```

`text.encoding` returns `UTF-8`
`text.bytes.inspect` returns `[216, 231, 159, 253, 167, 236, 160, 161, 130, 10]`
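
The reported bytes can be checked offline. The sketch below is an illustration added to the report, not part of the original session: it rebuilds the string from the byte list above and compares the two interpretations. The constant and variable names are made up for the example. Tagged as UTF-8 the bytes are not a valid sequence, which would explain the replacement characters in the output; tagged as CP852 they decode to the pasted text.

``` ruby
# Bytes reported above for the pasted line (the trailing 10 is "\n").
REPORTED_BYTES = [216, 231, 159, 253, 167, 236, 160, 161, 130, 10].freeze

raw = REPORTED_BYTES.pack("C*")             # binary string with those bytes

as_utf8  = raw.dup.force_encoding("UTF-8")  # the label `gets` put on the input
as_cp852 = raw.dup.force_encoding("CP852")  # the label the workaround applies

puts as_utf8.valid_encoding?   # false: 216 starts a sequence that 231 cannot continue
puts as_cp852.encode("UTF-8")  # ěščřžýáíé
```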

**Python 3**


``` python
print("ěščřžýáíé")
text = input()
print(text)
```
output:
```
ěščřžýáíé
ěščřžýáíé
ěščřžýáíé
```

I don't know how to check the encoding or get the bytes in the current encoding in Python.

I was told on the Ruby Discord that my terminal is misconfigured, but that is not the case: it happens in multiple terminals, and I can't expect users to change their terminal settings.

Other languages like Python or C# do not seem to have this issue.

I wonder what Python does to get around encoding issues on Czech Windows.




-- 
https://bugs.ruby-lang.org/


* [ruby-core:106200] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows
  2021-11-20 14:33 [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows koleq
@ 2021-11-22  6:47 ` nobu (Nobuyoshi Nakada)
  2021-11-22  8:12 ` [ruby-core:106202] " koleq
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: nobu (Nobuyoshi Nakada) @ 2021-11-22  6:47 UTC (permalink / raw
  To: ruby-core

Issue #18353 has been updated by nobu (Nobuyoshi Nakada).


Seems default external encoding doesn't match.
What does `chcp.com` command say?
And what does `ruby -e 'p Encoding.default_encoding, Encoding.default_internal, Encoding.locale_charmap'`?

----------------------------------------
Bug #18353: Czech keyboard input encoding on czech Windows
https://bugs.ruby-lang.org/issues/18353#change-94805


* [ruby-core:106202] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows
  2021-11-20 14:33 [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows koleq
  2021-11-22  6:47 ` [ruby-core:106200] " nobu (Nobuyoshi Nakada)
@ 2021-11-22  8:12 ` koleq
  2021-11-22 11:28 ` [ruby-core:106207] " nobu (Nobuyoshi Nakada)
  2021-11-23  9:38 ` [ruby-core:106221] " koleq
  3 siblings, 0 replies; 5+ messages in thread
From: koleq @ 2021-11-22  8:12 UTC (permalink / raw
  To: ruby-core

Issue #18353 has been updated by koleq (Ondřej Kurz).


nobu (Nobuyoshi Nakada) wrote in #note-1:
> Seems default external encoding doesn't match.
> What does `chcp.com` command say?
> And what does `ruby -e 'p Encoding.default_encoding, Encoding.default_internal, Encoding.locale_charmap'`?

At the time of writing I'm at work, but I also have Ruby here, so here are the results from my work PC.
```
H:\>chcp
Active code page: 852

H:\>ruby -e 'p Encoding.default_encoding, Encoding.default_internal, Encoding.locale_charmap'
-e:1:in `<main>': undefined method `default_encoding' for Encoding:Class (NoMethodError)
Did you mean?  default_internal

H:\>ruby -e 'p Encoding.default_internal, Encoding.locale_charmap'
nil
"CP852"

H:\>ruby -v
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x64-mingw32]
```

----------------------------------------
Bug #18353: Czech keyboard input encoding on czech Windows
https://bugs.ruby-lang.org/issues/18353#change-94808


* [ruby-core:106207] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows
  2021-11-20 14:33 [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows koleq
  2021-11-22  6:47 ` [ruby-core:106200] " nobu (Nobuyoshi Nakada)
  2021-11-22  8:12 ` [ruby-core:106202] " koleq
@ 2021-11-22 11:28 ` nobu (Nobuyoshi Nakada)
  2021-11-23  9:38 ` [ruby-core:106221] " koleq
  3 siblings, 0 replies; 5+ messages in thread
From: nobu (Nobuyoshi Nakada) @ 2021-11-22 11:28 UTC (permalink / raw
  To: ruby-core

Issue #18353 has been updated by nobu (Nobuyoshi Nakada).

Status changed from Open to Feedback

koleq (Ondřej Kurz) wrote in #note-2:
> ```
> H:\>ruby -e 'p Encoding.default_encoding, Encoding.default_internal, Encoding.locale_charmap'
> -e:1:in `<main>': undefined method `default_encoding' for Encoding:Class (NoMethodError)

Sorry, it's a typo, should be `Encoding.default_external`.

And is the environment variable `RUBYOPT` set?
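
For convenience, the values asked for in this thread can be collected in one go. This is only a sketch: `encoding_report` is a hypothetical helper name, and it uses the corrected `Encoding.default_external` together with the other calls already mentioned above.

``` ruby
# Collect the encoding-related settings discussed in this thread.
# `encoding_report` is a made-up helper name for illustration.
def encoding_report
  {
    default_external: Encoding.default_external, # Encoding used for IO by default
    default_internal: Encoding.default_internal, # nil unless transcoding is requested
    locale_charmap:   Encoding.locale_charmap,   # String naming the locale's charmap
    rubyopt:          ENV["RUBYOPT"]             # nil when the variable is not set
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

On Windows, the output can be compared with what `chcp.com` prints for the active console code page.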

----------------------------------------
Bug #18353: Czech keyboard input encoding on czech Windows
https://bugs.ruby-lang.org/issues/18353#change-94814


* [ruby-core:106221] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows
  2021-11-20 14:33 [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows koleq
                   ` (2 preceding siblings ...)
  2021-11-22 11:28 ` [ruby-core:106207] " nobu (Nobuyoshi Nakada)
@ 2021-11-23  9:38 ` koleq
  3 siblings, 0 replies; 5+ messages in thread
From: koleq @ 2021-11-23  9:38 UTC (permalink / raw
  To: ruby-core

Issue #18353 has been updated by koleq (Ondřej Kurz).


nobu (Nobuyoshi Nakada) wrote in #note-3:
> koleq (Ondřej Kurz) wrote in #note-2:
> > ```
> > H:\>ruby -e 'p Encoding.default_encoding, Encoding.default_internal, Encoding.locale_charmap'
> > -e:1:in `<main>': undefined method `default_encoding' for Encoding:Class (NoMethodError)
> 
> Sorry, it's a typo, should be `Encoding.default_external`.
> 
> And is the environment variable `RUBYOPT` set?

```
H:\>ruby -e 'p Encoding.default_external, Encoding.default_internal, Encoding.locale_charmap'
#<Encoding:UTF-8>
nil
"CP852"
```

The RUBYOPT environment variable is not set, unless it was set by RubyInstaller for Windows; I checked my system and it does not seem to be set. I do not even know what it is, or what it should be.
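
Given the values above (`Encoding.default_external` is UTF-8 while `Encoding.locale_charmap` is "CP852"), one possible workaround that avoids hard-coding the code page is to relabel the input with the locale charmap and then transcode it. This is a sketch only, not a fix confirmed in this thread; `decode_console_line` is a hypothetical helper name, and it assumes the console really delivers bytes in the locale's charmap.

``` ruby
# Relabel a line read from the console with the console's encoding, then
# transcode it to UTF-8. `decode_console_line` is a made-up helper name.
# Assumption: the input bytes are in `console_encoding`, which defaults
# to the locale charmap.
def decode_console_line(line, console_encoding = Encoding.locale_charmap)
  line.dup.force_encoding(console_encoding).encode(Encoding::UTF_8)
end

# The first three reported bytes plus the newline, mislabelled as UTF-8.
sample = [216, 231, 159, 10].pack("C*").force_encoding("UTF-8")
puts decode_console_line(sample, "CP852")   # ěšč
```

In the test script this would be used as `text = decode_console_line(gets)`. When the locale charmap is already UTF-8 the call leaves the text unchanged.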




----------------------------------------
Bug #18353: Czech keyboard input encoding on czech Windows
https://bugs.ruby-lang.org/issues/18353#change-94831


end of thread, other threads:[~2021-11-23  9:39 UTC | newest]

Thread overview: 5+ messages
2021-11-20 14:33 [ruby-core:106191] [Ruby master Bug#18353] Czech keyboard input encoding on czech Windows koleq
2021-11-22  6:47 ` [ruby-core:106200] " nobu (Nobuyoshi Nakada)
2021-11-22  8:12 ` [ruby-core:106202] " koleq
2021-11-22 11:28 ` [ruby-core:106207] " nobu (Nobuyoshi Nakada)
2021-11-23  9:38 ` [ruby-core:106221] " koleq
