ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:107416] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
@ 2022-02-01 19:44 shan (Shannon Skipper)
  2022-02-01 20:27 ` [ruby-core:107419] " mame (Yusuke Endoh)
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: shan (Shannon Skipper) @ 2022-02-01 19:44 UTC
  To: ruby-core

Issue #18563 has been reported by shan (Shannon Skipper).

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563

* Author: shan (Shannon Skipper)
* Status: Open
* Priority: Normal
----------------------------------------
> grapheme sounds like an element in the grapheme cluster. How about each_grapheme_cluster?
> If everyone gets used to the grapheme as an alias of grapheme cluster, we'd love to add an alias each_grapheme.

> Matz.

Languages that have added grapheme cluster support seem to be almost exclusively opting for the shorter "graphemes" alias as a part that stands for the whole.
* JavaScript/TypeScript grapheme-splitter library: `splitGraphemes`
* PHP: `grapheme_extract`
* Zig ziglyph library: `GraphemeIterator`
* Golang uniseg library: `NewGraphemes`
* Matlab: `splitGraphemes`
* Python grapheme library: `graphemes`
* Elixir: `graphemes`
* Crystal uni_text_seg library: `graphemes`
* Nim nim-graphemes library: `graphemes`
* Rust unicode-segmentation library: `graphemes`

Now that some time has passed and the "graphemes" alias for "grapheme clusters" has been fairly widely adopted by languages and libraries, I'd like to go ahead and propose a `graphemes` alias for `grapheme_clusters` and an `each_grapheme` alias for `each_grapheme_cluster`.
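
For illustration only, the proposed names can be tried out locally. This is a minimal sketch that assumes reopening `String` is acceptable in your own code; `graphemes` and `each_grapheme` are the names proposed in this issue and are not defined by Ruby core, while `grapheme_clusters` and `each_grapheme_cluster` are the existing core methods.

```ruby
# Hypothetical, application-local aliases for the names proposed above.
# These are NOT part of Ruby core.
class String
  alias_method :graphemes, :grapheme_clusters
  alias_method :each_grapheme, :each_grapheme_cluster
end

sample = "a\u0300b"   # "a" + combining grave accent, then "b"

p sample.chars.length       # => 3 (one element per code point)
p sample.graphemes.length   # => 2 (one element per grapheme cluster)

# Like the method it aliases, each_grapheme yields each cluster in turn.
sample.each_grapheme { |cluster| p cluster.codepoints.length }
```

Because these are plain aliases, they behave exactly like the original methods, including returning an Enumerator when `each_grapheme` is called without a block.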



-- 
https://bugs.ruby-lang.org/


* [ruby-core:107419] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
  2022-02-01 19:44 [ruby-core:107416] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases shan (Shannon Skipper)
@ 2022-02-01 20:27 ` mame (Yusuke Endoh)
  2022-03-17  8:14 ` [ruby-core:107940] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases nobu (Nobuyoshi Nakada)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: mame (Yusuke Endoh) @ 2022-02-01 20:27 UTC
  To: ruby-core

Issue #18563 has been updated by mame (Yusuke Endoh).

Description updated

(I have added a URL to matz's original statement to the description.)

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96319

* Author: shan (Shannon Skipper)
* Status: Open
* Priority: Normal
----------------------------------------
https://bugs.ruby-lang.org/issues/13780#note-10


-- 
https://bugs.ruby-lang.org/


* [ruby-core:107940] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
  2022-02-01 19:44 [ruby-core:107416] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases shan (Shannon Skipper)
  2022-02-01 20:27 ` [ruby-core:107419] " mame (Yusuke Endoh)
@ 2022-03-17  8:14 ` nobu (Nobuyoshi Nakada)
  2022-03-17  8:56 ` [ruby-core:107942] " matz (Yukihiro Matsumoto)
  2022-03-17 18:18 ` [ruby-core:107958] " Dan0042 (Daniel DeLorme)
  3 siblings, 0 replies; 5+ messages in thread
From: nobu (Nobuyoshi Nakada) @ 2022-03-17  8:14 UTC
  To: ruby-core

Issue #18563 has been updated by nobu (Nobuyoshi Nakada).


How about `letters` and `each_letter`?

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96890

* Author: shan (Shannon Skipper)
* Status: Open
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/


* [ruby-core:107942] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
  2022-02-01 19:44 [ruby-core:107416] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases shan (Shannon Skipper)
  2022-02-01 20:27 ` [ruby-core:107419] " mame (Yusuke Endoh)
  2022-03-17  8:14 ` [ruby-core:107940] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases nobu (Nobuyoshi Nakada)
@ 2022-03-17  8:56 ` matz (Yukihiro Matsumoto)
  2022-03-17 18:18 ` [ruby-core:107958] " Dan0042 (Daniel DeLorme)
  3 siblings, 0 replies; 5+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-03-17  8:56 UTC
  To: ruby-core

Issue #18563 has been updated by matz (Yukihiro Matsumoto).

Status changed from Open to Closed

For the record, "grapheme" and "grapheme cluster" are different concepts. Calling them "graphemes" is a bit like calling "Wikipedia" "Wiki".
Until the Unicode Consortium defines a shorter name for them, or the convention of calling them "graphemes" becomes widely accepted, we won't provide such aliases. So my opinion has not changed since then.

Short answer: "not yet".

Matz.

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96892

* Author: shan (Shannon Skipper)
* Status: Closed
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/


* [ruby-core:107958] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
  2022-02-01 19:44 [ruby-core:107416] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases shan (Shannon Skipper)
                   ` (2 preceding siblings ...)
  2022-03-17  8:56 ` [ruby-core:107942] " matz (Yukihiro Matsumoto)
@ 2022-03-17 18:18 ` Dan0042 (Daniel DeLorme)
  3 siblings, 0 replies; 5+ messages in thread
From: Dan0042 (Daniel DeLorme) @ 2022-03-17 18:18 UTC
  To: ruby-core

Issue #18563 has been updated by Dan0042 (Daniel DeLorme).


nobu (Nobuyoshi Nakada) wrote in #note-4:
> How about `letters` and `each_letter`?

I like the general idea, but to me "letters" means `\p{L}`.
Ideally, what is now a "char" should be called a grapheme (like "a" and "\u0300"), and "grapheme_clusters" should be called chars (like "a" and "a\u0300").

It may sound like a radical idea, but what about having `each_char` output grapheme clusters? The vast majority of the time they are the same thing, and for the few exceptions we probably want `"été".chars` to return 3 characters even if they are encoded as "\u0065\u0301\u0074\u00e9" (i.e. give the "intuitively correct" result even without Unicode normalization).

Or how about `characters` and `each_character`?
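
To make the distinctions in this message concrete, here is a small sketch using the same "été" encoding quoted above. It relies only on existing core methods (`chars`, `grapheme_clusters`, `scan`, `unicode_normalize`); the variable names are illustrative.

```ruby
# "été" written as: e + combining acute accent, t, precomposed é
# (the encoding quoted in the message above).
text = "\u0065\u0301\u0074\u00e9"

code_points = text.chars                          # one element per code point
clusters    = text.grapheme_clusters              # one element per grapheme cluster
letters     = text.scan(/\p{L}/)                  # code points in the Letter category only
normalized  = text.unicode_normalize(:nfc).chars  # chars after NFC composition

p code_points.length  # => 4
p clusters.length     # => 3
p letters             # => ["e", "t", "é"]
p normalized.length   # => 3
```

Note that `\p{L}` skips the combining accent (a Mark, not a Letter), so it yields three elements here but loses the accent on the first "e". That is a different result from both `chars` and `grapheme_clusters`.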

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96908

* Author: shan (Shannon Skipper)
* Status: Closed
* Priority: Normal
----------------------------------------

-- 
https://bugs.ruby-lang.org/
