ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:107304] [Ruby master Bug#18554] Move unicode_normalize to a default gem
@ 2022-01-27 16:49 headius (Charles Nutter)
  2022-01-28  1:03 ` [ruby-core:107309] [Ruby master Feature#18554] " shyouhei (Shyouhei Urabe)
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: headius (Charles Nutter) @ 2022-01-27 16:49 UTC
  To: ruby-core

Issue #18554 has been reported by headius (Charles Nutter).

----------------------------------------
Bug #18554: Move unicode_normalize to a default gem
https://bugs.ruby-lang.org/issues/18554

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Could we move the rest of unicode_normalize to a default gem?

The recent updates were mostly updates to the Unicode tables, which a user might want to be able to update in an existing Ruby installation. Additionally, this is one of the few stdlib libraries we have to copy into JRuby from the CRuby repository; it would be easier for both projects if we just pulled in a default gem.



-- 
https://bugs.ruby-lang.org/


* [ruby-core:107309] [Ruby master Feature#18554] Move unicode_normalize to a default gem
  2022-01-27 16:49 [ruby-core:107304] [Ruby master Bug#18554] Move unicode_normalize to a default gem headius (Charles Nutter)
@ 2022-01-28  1:03 ` shyouhei (Shyouhei Urabe)
  2022-01-28  9:33 ` [ruby-core:107317] " duerst
  2022-01-31 17:51 ` [ruby-core:107397] " headius (Charles Nutter)
  2 siblings, 0 replies; 4+ messages in thread
From: shyouhei (Shyouhei Urabe) @ 2022-01-28  1:03 UTC
  To: ruby-core

Issue #18554 has been updated by shyouhei (Shyouhei Urabe).


Just leaving my :+1: for this idea; I am not sure how difficult it would be, though.

----------------------------------------
Feature #18554: Move unicode_normalize to a default gem
https://bugs.ruby-lang.org/issues/18554#change-96208

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:107317] [Ruby master Feature#18554] Move unicode_normalize to a default gem
  2022-01-27 16:49 [ruby-core:107304] [Ruby master Bug#18554] Move unicode_normalize to a default gem headius (Charles Nutter)
  2022-01-28  1:03 ` [ruby-core:107309] [Ruby master Feature#18554] " shyouhei (Shyouhei Urabe)
@ 2022-01-28  9:33 ` duerst
  2022-01-31 17:51 ` [ruby-core:107397] " headius (Charles Nutter)
  2 siblings, 0 replies; 4+ messages in thread
From: duerst @ 2022-01-28  9:33 UTC
  To: ruby-core

Issue #18554 has been updated by duerst (Martin Dürst).


Just a few comments; I am not sure I have thought everything through completely.

One of the motivations for implementing unicode_normalize in pure Ruby was to make it easy for other Ruby implementations to use this code. From this viewpoint, if moving it helps JRuby, that would be a plus.

However, unlike the libraries that are in gems now, unicode_normalize is part and parcel of the String class and does not need a require. It is placed in lib/ only because there was no other, better place for it. There is already a mechanism for automatic requiring; see the function unicode_normalize_common in 
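
To illustrate the point about not needing a require, here is a minimal sketch (not taken from the thread). The String methods shown are the documented ones; the LazyNormalizer module is a hypothetical stand-in for a load-on-first-use pattern and is not the code CRuby actually uses.

```ruby
# No `require` is needed: these methods are defined on String itself.
decomposed = "e\u0301" # "e" followed by U+0301 COMBINING ACUTE ACCENT
composed   = decomposed.unicode_normalize(:nfc)

puts composed == "\u00E9"           # NFC composes the pair into U+00E9
puts composed.unicode_normalized?   # checks against NFC by default
puts decomposed.unicode_normalized?

# Hypothetical illustration of loading on first use.
# The module and method names are made up for this sketch.
module LazyNormalizer
  @loaded = false

  def self.loaded?
    @loaded
  end

  def self.normalize(string, form = :nfc)
    unless @loaded
      # A real implementation would require the pure-Ruby library here,
      # once, the first time normalization is requested.
      @loaded = true
    end
    string.unicode_normalize(form)
  end
end
```

The sketch only shows the shape of the idea: callers never write a require themselves, and the loading cost is paid on the first call rather than at interpreter startup.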

Regarding Unicode versions, if somebody wants to change to a specific Unicode version different from what a Ruby version offers, then this would apply not only to unicode_normalize but also, and probably much more importantly, to regular expressions. But regular expressions are quite tightly linked with Ruby itself, and it would probably be difficult to disentangle them, because they involve little Ruby and a lot of C.

Also, updating the Unicode version uses the same logic to get the necessary data for both regular expressions and unicode_normalize, so if unicode_normalize were separated into a gem, that part might have to be duplicated, creating additional work on this end.
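
As a rough illustration of the two consumers of Unicode data mentioned above, the following sketch (not from the thread) exercises both normalization and regular expression properties. It assumes the interpreter exposes its Unicode version through the "UNICODE_VERSION" key of RbConfig::CONFIG, as CRuby does; the fallback value "unknown" is an assumption for implementations that do not.

```ruby
require "rbconfig"

# The Unicode version this interpreter was built against, if it exposes one.
unicode_version = RbConfig::CONFIG.fetch("UNICODE_VERSION", "unknown")
puts "Unicode version: #{unicode_version}"

sample = "A\u030A" # "A" followed by U+030A COMBINING RING ABOVE

# 1. Normalization, implemented in Ruby under lib/.
nfc = sample.unicode_normalize(:nfc) # composes the pair into U+00C5

# 2. Regular expression character properties, implemented in C in CRuby.
letters = nfc.scan(/\p{Letter}/)
marks   = sample.scan(/\p{Mark}/)

puts "letters after NFC: #{letters.size}"
puts "marks before NFC:  #{marks.size}"
```

If only one of the two were upgraded to a different Unicode version, they could in principle disagree about newly assigned characters, which is the concern raised above.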

----------------------------------------
Feature #18554: Move unicode_normalize to a default gem
https://bugs.ruby-lang.org/issues/18554#change-96215

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:107397] [Ruby master Feature#18554] Move unicode_normalize to a default gem
  2022-01-27 16:49 [ruby-core:107304] [Ruby master Bug#18554] Move unicode_normalize to a default gem headius (Charles Nutter)
  2022-01-28  1:03 ` [ruby-core:107309] [Ruby master Feature#18554] " shyouhei (Shyouhei Urabe)
  2022-01-28  9:33 ` [ruby-core:107317] " duerst
@ 2022-01-31 17:51 ` headius (Charles Nutter)
  2 siblings, 0 replies; 4+ messages in thread
From: headius (Charles Nutter) @ 2022-01-31 17:51 UTC
  To: ruby-core

Issue #18554 has been updated by headius (Charles Nutter).


@duerst Thank you for spelling that out. I figured there were some additional nuances to this, since the Unicode tables are also in C code and used internally by many parts of Ruby. In that regard, it is at least much easier for us to import the unicode_normalize tables, since that is just a matter of copying Ruby code from CRuby to JRuby.

I would like to understand what would break if a user updated unicode_normalize to a newer (or older) version of Unicode than what is natively supported in CRuby. Is this situation likely to break something?

Along similar lines, could the Unicode tables in C code also be moved out to a default gem and be made upgradable without rebuilding CRuby? If that were the case, we would contribute code to generate the same tables in Java and have full CRuby/JRuby support for upgrading Unicode tables from a gem.

Granted, these tables are probably used at the lowest levels of CRuby, during boot and otherwise, so I am unsure what other minefields lie along this path.

----------------------------------------
Feature #18554: Move unicode_normalize to a default gem
https://bugs.ruby-lang.org/issues/18554#change-96299

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------
