ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:92520] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase
       [not found] <redmine.issue-15816.20190501222739@ruby-lang.org>
@ 2019-05-01 22:27 ` jonathan
  2019-05-09  3:38 ` [ruby-core:92602] " merch-redmine
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 5+ messages in thread
From: jonathan @ 2019-05-01 22:27 UTC (permalink / raw)
  To: ruby-core

Issue #15816 has been reported by jonathanhefner (Jonathan Hefner).

----------------------------------------
Bug #15816: String#casecmp compares uppercase characters instead of lowercase
https://bugs.ruby-lang.org/issues/15816

* Author: jonathanhefner (Jonathan Hefner)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
The current implementation of `String#casecmp` converts characters to uppercase before comparing them.  However, all references I've found for `strcasecmp` (the C function on which `String#casecmp` is based) indicate characters should be converted to lowercase before being compared.

For example, [this man page](http://manpages.ubuntu.com/manpages/eoan/man3/strcasecmp.3.html) says:

> The POSIX.1-2008 standard says ... shall behave as if the strings had been converted to lowercase and then a byte comparison performed.

The difference in behavior is apparent when comparing / sorting strings containing `[`, `\`, `]`, `^`, `_`, or `` ` `` (the characters that occur between `Z` and `a`).  Converting to lowercase sorts these punctuation characters before `A`-`z` along with most of the other punctuation in ASCII, but converting to uppercase sorts these characters after `A`-`z` instead.
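
To make the ordering difference concrete, here is a minimal sketch that emulates both folding strategies with plain `upcase` / `downcase` and `<=>`, so it does not depend on what `String#casecmp` does in any particular Ruby version. The helper names `cmp_upper` and `cmp_lower` are hypothetical and exist only for illustration.

```ruby
# Hypothetical helpers (not part of Ruby): compare after folding to one case.
def cmp_upper(a, b)
  a.upcase <=> b.upcase     # letters become 65..90, so "[" (91) and "_" (95) sort after them
end

def cmp_lower(a, b)
  a.downcase <=> b.downcase # letters become 97..122, so "[" and "_" sort before them
end

words = ["apple", "_private", "Zebra", "[bracket]"]

by_upper = words.sort { |a, b| cmp_upper(a, b) }
by_lower = words.sort { |a, b| cmp_lower(a, b) }

p by_upper # => ["apple", "Zebra", "[bracket]", "_private"]
p by_lower # => ["[bracket]", "_private", "apple", "Zebra"]
```

With uppercase folding the punctuation between `Z` and `a` lands after the letters; with lowercase folding it lands before them, next to most other ASCII punctuation.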




-- 
https://bugs.ruby-lang.org/


* [ruby-core:92602] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase
       [not found] <redmine.issue-15816.20190501222739@ruby-lang.org>
  2019-05-01 22:27 ` [ruby-core:92520] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase jonathan
@ 2019-05-09  3:38 ` merch-redmine
  2019-05-09  7:35 ` [ruby-core:92610] " mame
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 5+ messages in thread
From: merch-redmine @ 2019-05-09  3:38 UTC (permalink / raw)
  To: ruby-core

Issue #15816 has been updated by jeremyevans0 (Jeremy Evans).

File casecmp-lowercase.patch added

The documentation of `String#casecmp` does not specify how it is implemented, so it seems fair to consider switching.  However, this change is likely to cause backwards compatibility issues.  While it seems unlikely there are many applications relying on the current behavior, I would guess there are at least a few.

Considering that `String#casecmp?` uses lowercase and not uppercase, I think making such a change is reasonable, but we may want to delay making this change until Ruby 3.

Attached is a patch if we want to make this change.

 

----------------------------------------
Bug #15816: String#casecmp compares uppercase characters instead of lowercase
https://bugs.ruby-lang.org/issues/15816#change-77961

* Author: jonathanhefner (Jonathan Hefner)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------


---Files--------------------------------
casecmp-lowercase.patch (1.3 KB)



* [ruby-core:92610] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase
       [not found] <redmine.issue-15816.20190501222739@ruby-lang.org>
  2019-05-01 22:27 ` [ruby-core:92520] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase jonathan
  2019-05-09  3:38 ` [ruby-core:92602] " merch-redmine
@ 2019-05-09  7:35 ` mame
  2019-05-09  8:01 ` [ruby-core:92611] " nobu
  2019-10-02 15:01 ` [ruby-core:95189] [Ruby master " merch-redmine
  4 siblings, 0 replies; 5+ messages in thread
From: mame @ 2019-05-09  7:35 UTC (permalink / raw)
  To: ruby-core

Issue #15816 has been updated by mame (Yusuke Endoh).


Up to and including Ruby 1.8.7, it seems to have used downcase.  It was changed in r14227 to support encodings.  I think the behavior change was not intended, so this is simply a bug?

```
# ./bin/ruby-1.8.7-p374 -e 'p "a".casecmp("[")'
1

# ./bin/ruby-1.9.0-0 -e 'p "a".casecmp("[")'
-1
```

----------------------------------------
Bug #15816: String#casecmp compares uppercase characters instead of lowercase
https://bugs.ruby-lang.org/issues/15816#change-77970

* Author: jonathanhefner (Jonathan Hefner)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------

* [ruby-core:92611] [Ruby trunk Bug#15816] String#casecmp compares uppercase characters instead of lowercase
       [not found] <redmine.issue-15816.20190501222739@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2019-05-09  7:35 ` [ruby-core:92610] " mame
@ 2019-05-09  8:01 ` nobu
  2019-10-02 15:01 ` [ruby-core:95189] [Ruby master " merch-redmine
  4 siblings, 0 replies; 5+ messages in thread
From: nobu @ 2019-05-09  8:01 UTC (permalink / raw)
  To: ruby-core

Issue #15816 has been updated by nobu (Nobuyoshi Nakada).


Indeed, `rb_enc_upper` is used at https://github.com/ruby/ruby/commit/269bd16b28e86d1333969389b7b402f2915e336f#diff-7a2f2c7dfe0bf61d38272aeaf68ac768R1431, whereas the previous `rb_memcicmp` mapped characters to lowercase.
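
For readers who want to see what "maps to lowercase" means in practice, the following is a rough Ruby sketch of a byte-wise, ASCII-only comparison that folds to lowercase. It is not the C code of `rb_memcicmp` or of the linked commit; `fold_lower` and `casecmp_lower_sketch` are hypothetical names, and multibyte encodings are ignored entirely.

```ruby
# Hypothetical sketch: fold ASCII "A".."Z" (65..90) to "a".."z"; leave other bytes alone.
def fold_lower(byte)
  (65..90).cover?(byte) ? byte + 32 : byte
end

# Compare two strings byte by byte after lowercase folding.
# Returns -1, 0, or 1, like String#casecmp.
def casecmp_lower_sketch(a, b)
  xs = a.bytes
  ys = b.bytes
  len = [xs.length, ys.length].min
  len.times do |i|
    x = fold_lower(xs[i])
    y = fold_lower(ys[i])
    return x < y ? -1 : 1 if x != y
  end
  # All shared bytes are equal, so the shorter string sorts first.
  xs.length <=> ys.length
end

p casecmp_lower_sketch("a", "[")     # => 1
p casecmp_lower_sketch("ABC", "abc") # => 0
```

Swapping `fold_lower` for a fold to uppercase would flip the first result to `-1`, which is the difference reported in this issue.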

----------------------------------------
Bug #15816: String#casecmp compares uppercase characters instead of lowercase
https://bugs.ruby-lang.org/issues/15816#change-77971

* Author: jonathanhefner (Jonathan Hefner)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------

* [ruby-core:95189] [Ruby master Bug#15816] String#casecmp compares uppercase characters instead of lowercase
       [not found] <redmine.issue-15816.20190501222739@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2019-05-09  8:01 ` [ruby-core:92611] " nobu
@ 2019-10-02 15:01 ` merch-redmine
  4 siblings, 0 replies; 5+ messages in thread
From: merch-redmine @ 2019-10-02 15:01 UTC (permalink / raw)
  To: ruby-core

Issue #15816 has been updated by jeremyevans0 (Jeremy Evans).

Status changed from Open to Closed

Fixed in commit:082424ef58116db9663a754157d6c441d60fd101.

----------------------------------------
Bug #15816: String#casecmp compares uppercase characters instead of lowercase
https://bugs.ruby-lang.org/issues/15816#change-81825

* Author: jonathanhefner (Jonathan Hefner)
* Status: Closed
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------