ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️
@ 2020-06-26 20:18 marcandre-ruby-core
  2020-06-27  2:14 ` [ruby-core:98973] " nobu
                   ` (31 more replies)
  0 siblings, 32 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-06-26 20:18 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been reported by marcandre (Marc-Andre Lafortune).

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989

* Author: marcandre (Marc-Andre Lafortune)
* Status: Open
* Priority: Normal
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:98973] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
@ 2020-06-27  2:14 ` nobu
  2020-08-24  4:13 ` [ruby-core:99677] " mame
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: nobu @ 2020-06-27  2:14 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by nobu (Nobuyoshi Nakada).

Assignee set to knu (Akinori MUSHA)
Status changed from Open to Assigned

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-86354

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99677] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
  2020-06-27  2:14 ` [ruby-core:98973] " nobu
@ 2020-08-24  4:13 ` mame
  2020-08-24 12:55 ` [ruby-core:99680] " marcandre-ruby-core
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: mame @ 2020-08-24  4:13 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by mame (Yusuke Endoh).


We discussed this issue at the last-month meeting.

https://github.com/ruby/dev-meeting-log/blob/master/DevelopersMeeting20200720Japan.md

> matz: Positive to introduce Set into core. But we need to first introduce a set literal. { x, y, z } is good, but JavaScript uses it as another meanings, so it would bring confusion. I have no idea.
> knu: As a set library maintainer, I'll first communicate, and try to solve the easy issues that can be sorted out in the library side.

@knu ?

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87164

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99680] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
  2020-06-27  2:14 ` [ruby-core:98973] " nobu
  2020-08-24  4:13 ` [ruby-core:99677] " mame
@ 2020-08-24 12:55 ` marcandre-ruby-core
  2020-08-24 15:12 ` [ruby-core:99683] " daniel
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-08-24 12:55 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


That's great news.

Was there discussion for a *frozen* set literal of symbols / strings as I proposed in #16994? For sets that need to be mutable or containing different types of objects, I find the existing `Set[...]` quite convenient.
Moreover, I would like to avoid having to add a magic comment `# frozen_set_literals: true`...

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87167

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99683] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (2 preceding siblings ...)
  2020-08-24 12:55 ` [ruby-core:99680] " marcandre-ruby-core
@ 2020-08-24 15:12 ` daniel
  2020-09-02  9:52 ` [ruby-core:99834] " eregontp
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: daniel @ 2020-08-24 15:12 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Dan0042 (Daniel DeLorme).


> matz: Positive to introduce Set into core. But we need to first introduce a set literal. { x, y, z } is good, but JavaScript uses it as another meanings, so it would bring confusion.

I have to ask, is [lack of] similarity with javascript really a problem? Apparently Python uses `{ x, y, z }` (and `set()` for an empty set). #5478 has a lot of discussion on Set literals but ultimately nothing beats the simplicity and obviousness of `{ x, y, z }`


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87171

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99834] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (3 preceding siblings ...)
  2020-08-24 15:12 ` [ruby-core:99683] " daniel
@ 2020-09-02  9:52 ` eregontp
  2020-09-02 13:44 ` [ruby-core:99838] " marcandre-ruby-core
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: eregontp @ 2020-09-02  9:52 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Eregon (Benoit Daloze).


IMHO there is no need for a literal Set notation (which would anyway restrict the type of keys, etc).

> Some of the upcoming feature requests would be easier (or only possible) to implement were Sets builtin.

@marcandre Which feature requests except having a literal notation (#16994) require Set to be builtin? I think only #16993.
I think we could make good progress here by addressing the other feature requests first.

Also moving Set to core makes might make it more complicated to backport to older Rubies.
(Unfortunately Set is only a default gem in 3.0, but it's probably easy enough to require some Set additions with a different require name)

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87357

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99838] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (4 preceding siblings ...)
  2020-09-02  9:52 ` [ruby-core:99834] " eregontp
@ 2020-09-02 13:44 ` marcandre-ruby-core
  2020-09-02 14:45 ` [ruby-core:99843] " daniel
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-09-02 13:44 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


Eregon (Benoit Daloze) wrote in #note-11:
> IMHO there is no need for a literal Set notation (which would anyway restrict the type of keys, etc).

I agree. `Set[...]` works very well already for constructing sets at runtime out of any time of elements.

> > Some of the upcoming feature requests would be easier (or only possible) to implement were Sets builtin.
> 
> @marcandre Which feature requests except having a literal notation (#16994) require Set to be builtin? I think only #16993.

Indeed, the static notation requires a change to the Ruby parser.

#16993 would be easier and faster if builtin, although it is implementable with a `transform_values` as I showed in the issue.

A bigger issue (and more important feature imo) is interoperability with Array #16990. If not in core, this would require monkeypatching a bunch of `Array` methods if not in core and would also require looping in Ruby so would not be as efficient.

> I think we could make good progress here by addressing the other feature requests first.
> 
> Also moving Set to core makes might make it more complicated to backport to older Rubies.

If we want to backport to older Rubies, this could still partly be done with a gem, we'll just need some conditional definitions.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87363

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99843] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (5 preceding siblings ...)
  2020-09-02 13:44 ` [ruby-core:99838] " marcandre-ruby-core
@ 2020-09-02 14:45 ` daniel
  2020-09-02 21:00 ` [ruby-core:99853] " eregontp
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: daniel @ 2020-09-02 14:45 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Dan0042 (Daniel DeLorme).


> which would anyway restrict the type of keys

I don't get that part. If you have a literal like `{ x, y, z }` then x/y/z can be any type of object. There's no restriction. But I agree `Set[]` is short and good enough.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87367

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99853] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (6 preceding siblings ...)
  2020-09-02 14:45 ` [ruby-core:99843] " daniel
@ 2020-09-02 21:00 ` eregontp
  2020-09-02 21:01 ` [ruby-core:99854] " eregontp
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: eregontp @ 2020-09-02 21:00 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Eregon (Benoit Daloze).


Also, does moving Set to core mean rewriting it to C?
I think that would be suboptimal, because every implementation would have to maintain its own native implementation then (or reuse the C extension).
It would also hurt readability.
Much of Ruby's performance can be achieved by JIT'ing Ruby code, and Hash is well optimized, so I'm not sure it's a performance gain either.

BTW SortedSet has this `rbtree` usage, that seems problematic to move in core (core should not depend on RubyGems):
https://github.com/ruby/ruby/blob/eada6350332155972f19bad52bd8621f607520a2/lib/set.rb#L708
Only moving Set but not SortedSet in core would avoid that issue though, but be somewhat inconsistent.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87381

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99854] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (7 preceding siblings ...)
  2020-09-02 21:00 ` [ruby-core:99853] " eregontp
@ 2020-09-02 21:01 ` eregontp
  2020-09-02 21:15 ` [ruby-core:99855] " marcandre-ruby-core
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: eregontp @ 2020-09-02 21:01 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Eregon (Benoit Daloze).


Dan0042 (Daniel DeLorme) wrote in #note-13:
> I don't get that part. If you have a literal like `{ x, y, z }` then x/y/z can be any type of object. There's no restriction. But I agree `Set[]` is short and good enough.

Right, I was thinking of #16994.
`Set[]` seems already good enough as a general syntax, and no confusion with JS object literals / potentially future Ruby Hash shorthand literals.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87382

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99855] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (8 preceding siblings ...)
  2020-09-02 21:01 ` [ruby-core:99854] " eregontp
@ 2020-09-02 21:15 ` marcandre-ruby-core
  2020-09-03  1:14 ` [ruby-core:99859] " matz
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-09-02 21:15 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


Eregon (Benoit Daloze) wrote in #note-14:
> Also, does moving Set to core mean rewriting it to C?
> I think that would be suboptimal, because every implementation would have to maintain its own native implementation then (or reuse the C extension).

I'm not sure, but `Set` is quite small, most of the processing uses `Hash`...

> Only moving Set but not SortedSet in core would avoid that issue though, but be somewhat inconsistent.

I should have clarified: while I am convinced that `Set` is a basic fundamental class, I have yet to use `SortedSet` or see it in use. I would keep it out of core (in `set`).


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87383

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99859] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (9 preceding siblings ...)
  2020-09-02 21:15 ` [ruby-core:99855] " marcandre-ruby-core
@ 2020-09-03  1:14 ` matz
  2020-09-03  7:48 ` [ruby-core:99865] " knu
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: matz @ 2020-09-03  1:14 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by matz (Yukihiro Matsumoto).


I agree with some of your proposals (#16990, #16991, #16993, #16995). I want @knu to work on this. If I missed something, he will tell us.

I strongly disagree with #16994. There's no evidence we need frozen sets of strings or symbols that much. Even if we do, I think frozen arrays should come first.

I weakly disagree with #16992. Currently set orders are determined by the internal hash. We may change the implementation in the future to improve performance or memory overhead. Fixing the order could possibly restrict the future implementation choice.

Matz.


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87386

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99865] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (10 preceding siblings ...)
  2020-09-03  1:14 ` [ruby-core:99859] " matz
@ 2020-09-03  7:48 ` knu
  2020-09-03 21:44 ` [ruby-core:99901] " marcandre-ruby-core
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: knu @ 2020-09-03  7:48 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by knu (Akinori MUSHA).


OK, I think it's good for us now to diverge from the original philosophy of Set when I first wrote it and pursue the performance and integrity with other parts of Ruby.  There are many parts in Set where I avoided optimization in order to retain extensibility (like subclassing and stuff), but I'll unlock the bar.

I'm also planning to remove SortedSet and leave it to an external gem because of the partial dependency on rbtree and the fallback implementation which performs quite poorly.

I'm not absolutely sure about introducing literals in the form of `{ a, b, c }`  because I myself is the one who is quite familiar with the shorthand notation introduced in ES6 and would like to have something similar in Ruby. 😆

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87392

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99901] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (11 preceding siblings ...)
  2020-09-03  7:48 ` [ruby-core:99865] " knu
@ 2020-09-03 21:44 ` marcandre-ruby-core
  2020-09-03 21:44 ` [ruby-core:99902] " marcandre-ruby-core
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-09-03 21:44 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


knu (Akinori MUSHA) wrote in #note-18:
> OK, I think it's good for us now to diverge from the original philosophy of Set when I first wrote it and pursue the performance and integrity with other parts of Ruby.  There are many parts in Set where I avoided optimization in order to retain extensibility (like subclassing and stuff), but I'll unlock the bar.
> 
> I'm also planning to remove SortedSet and leave it to an external gem because of the partial dependency on rbtree and the fallback implementation which performs quite poorly.
> 
> I'm not absolutely sure about introducing literals in the form of `{ a, b, c }`  because I myself is the one who is quite familiar with the shorthand notation introduced in ES6 and would like to have something similar in Ruby. 😆

This sounds great! 👍
Let me know if I can help.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87433

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:99902] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (12 preceding siblings ...)
  2020-09-03 21:44 ` [ruby-core:99901] " marcandre-ruby-core
@ 2020-09-03 21:44 ` marcandre-ruby-core
  2020-12-27 10:24 ` [ruby-core:101736] " mame
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2020-09-03 21:44 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


matz (Yukihiro Matsumoto) wrote in #note-17:
> I agree with some of your proposals (#16990, #16991, #16993, #16995). I want @knu to work on this. If I missed something, he will tell us.

Thank you for reviewing these :-)

> I strongly disagree with #16994. There's no evidence we need frozen sets of strings or symbols that much. Even if we do, I think frozen arrays should come first.

Right, I agree that there is a larger discussion about frozen & static literals to be had.

Would a non-frozen notation for Set of strings / symbols have a chance of being accepted? I would like to avoid `Set.new(%w[a list of words])` as can be currently seen in `RuboCop` or in Rails:

https://github.com/rails/rails/blob/master/actionpack/lib/action_dispatch/http/cache.rb#L128
https://github.com/rails/rails/blob/master/actionpack/lib/action_dispatch/http/headers.rb#L25-L44

I think that many gems have simply not really taken care about `Sets`. For example, this file in `Bundler` defines a 40-element array constant, only to call `include?` on it...

https://github.com/rubygems/rubygems/blob/master/bundler/lib/bundler/settings.rb#L9

I feel that a builtin way to write this might help.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-87434

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:101736] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (13 preceding siblings ...)
  2020-09-03 21:44 ` [ruby-core:99902] " marcandre-ruby-core
@ 2020-12-27 10:24 ` mame
  2021-01-12  4:46 ` [ruby-core:102013] " akr
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: mame @ 2020-12-27 10:24 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by mame (Yusuke Endoh).


FYI: SortedSet has been extracted out from set.rb since 3.0. So there is no longer the dependency problem.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89558

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102013] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (14 preceding siblings ...)
  2020-12-27 10:24 ` [ruby-core:101736] " mame
@ 2021-01-12  4:46 ` akr
  2021-01-12  4:57 ` [ruby-core:102014] " knu
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: akr @ 2021-01-12  4:46 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by akr (Akira Tanaka).


I like extending Hash instead of incorporating Set.
For example, #inspect omit hash values.
(Of course, modifying Hash#inspect is clearly not acceptable, I think some compromise is possible)

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89861

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102014] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (15 preceding siblings ...)
  2021-01-12  4:46 ` [ruby-core:102013] " akr
@ 2021-01-12  4:57 ` knu
  2021-01-12 16:05 ` [ruby-core:102033] " marcandre-ruby-core
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: knu @ 2021-01-12  4:57 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by knu (Akinori MUSHA).


akr (Akira Tanaka) wrote in #note-22:
> I like extending Hash instead of incorporating Set.
> For example, #inspect omit hash values.
> (Of course, modifying Hash#inspect is clearly not acceptable, I think some compromise is possible)

How would you implement methods and operators of Set in Hash?  Would they only work for hashes that meet special conditions?

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89862

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102033] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (16 preceding siblings ...)
  2021-01-12  4:57 ` [ruby-core:102014] " knu
@ 2021-01-12 16:05 ` marcandre-ruby-core
  2021-01-12 16:26 ` [ruby-core:102034] " akr
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2021-01-12 16:05 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


akr (Akira Tanaka) wrote in #note-22:
> I like extending Hash instead of incorporating Set.

I don't understand the impacts

One question thing that having builtin set would change:

```ruby
# shareable_constant_value: literal
X = Set[1, 2, 3]  # => error, or acceptable (and frozen)
```

I would like it to be acceptable and frozen.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89887

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102034] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (17 preceding siblings ...)
  2021-01-12 16:05 ` [ruby-core:102033] " marcandre-ruby-core
@ 2021-01-12 16:26 ` akr
  2021-01-12 16:42 ` [ruby-core:102035] " marcandre-ruby-core
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: akr @ 2021-01-12 16:26 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by akr (Akira Tanaka).


knu (Akinori MUSHA) wrote in #note-23:
> akr (Akira Tanaka) wrote in #note-22:
> > I like extending Hash instead of incorporating Set.
> > For example, #inspect omit hash values.
> > (Of course, modifying Hash#inspect is clearly not acceptable, I think some compromise is possible)
> 
> How would you implement methods and operators of Set in Hash?  Would they only work for hashes that meet special conditions?

- define a special constant (Hash::DummyValue = Object.new)
- Hash#inspect doesn't show the value: {1 => Hash::DummyValue}.inspect #=> "{1}"
- `{}` can be used as an empty set
- `hash << v` would be defined as `hash[v] = Hash::DummyValue`
- Hash#each doesn't work well as Set but we can use Hash#each_key.
- Hash#to_a doesn't work well as Set but we can use Hash#keys.


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89888

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102035] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (18 preceding siblings ...)
  2021-01-12 16:26 ` [ruby-core:102034] " akr
@ 2021-01-12 16:42 ` marcandre-ruby-core
  2021-01-13  0:13 ` [ruby-core:102040] " duerst
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2021-01-12 16:42 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


I still don't see the benefit of not simply having `Set` in core.

Moreover, does this make `ary & set` possible? efficient?

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89889

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102040] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (19 preceding siblings ...)
  2021-01-12 16:42 ` [ruby-core:102035] " marcandre-ruby-core
@ 2021-01-13  0:13 ` duerst
  2021-01-13  1:26 ` [ruby-core:102042] " akr
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: duerst @ 2021-01-13  0:13 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by duerst (Martin Dürst).


marcandre (Marc-Andre Lafortune) wrote in #note-26:
> I still don't see the benefit of not simply having `Set` in core.

One direction that distinguishes Ruby from some other programming languages is big classes. A typical example is that there's no `Stack` class, the stack interface is just part of `Array`. I don't know the history, but I think the original idea was to make the set interface also be part of Array. Your `ary & set` example also alludes to that idea. Most of that interface (maybe all) is currently available, but it's not very efficient. Using `Hash` for sets eliminates the efficiency problem, as we know.

> Moreover, does this make `ary & set` possible? efficient?

I haven't checked the details, but it shouldn't be too difficult to implement this, in a reasonably efficient way. If there are any efficiency problems, they would be on the Array side, not on the Hash side.

Another advantage of using Hash may be that we can get to a syntax that is very close to the actual Mathematical set syntax. We would like to write `{1, 2, 3}`, which may even be possible. At the least, it should be possible to move to the following: `{1 => Hash::DummyValue, 2, 3}`, i.e. if the first key has a special value, then the remaining values can be left out. We may want to choose a better name for `Hash::DummyValue`, maybe something like `Hash::SetDummy`.



----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89892

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102042] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (20 preceding siblings ...)
  2021-01-13  0:13 ` [ruby-core:102040] " duerst
@ 2021-01-13  1:26 ` akr
  2021-01-13  1:26 ` [ruby-core:102043] " knu
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: akr @ 2021-01-13  1:26 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by akr (Akira Tanaka).


akr (Akira Tanaka) wrote in #note-25:

> - Hash#each doesn't work well as Set but we can use Hash#each_key.

To be fair, "Hash#each doesn't work well" means Enumerable methods doesn't work well.
`enum_for(:each_key)` can be used as `{1, 2}.enum_for(:each_key).max`, though.

(I forgot to mention that Ruby can interpret `{1, 2}` as `{ 1 => Hash::SetDummy, 2 => Hash::SetDummy }`.  I feel the name SetDummy is better than DummyValue.  Thanks.)

It may be possible to solve this by Hash#each yields `key` instead of `[key, value]` if `value` is `Hash::SetDummy`.
I doubt it will be accepted, though.





----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89894

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102043] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (21 preceding siblings ...)
  2021-01-13  1:26 ` [ruby-core:102042] " akr
@ 2021-01-13  1:26 ` knu
  2021-01-13  1:48 ` [ruby-core:102044] " knu
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: knu @ 2021-01-13  1:26 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by knu (Akinori MUSHA).


I don't like the idea of making `array & hash` possible, as that would go against the trends of making Ruby type-friendly and error-prone.

akr (Akira Tanaka) wrote in #note-25:
> - define a special constant (Hash::DummyValue = Object.new)
> - Hash#inspect doesn't show the value: {1 => Hash::DummyValue}.inspect #=> "{1}"
> - `{}` can be used as an empty set
> - `hash << v` would be defined as `hash[v] = Hash::DummyValue`
> - Hash#each doesn't work well as Set but we can use Hash#each_key.
> - Hash#to_a doesn't work well as Set but we can use Hash#keys.

I'd appreciate the list as implementation details, but if it were like some hashes are used as sets, programs might get less readable.
For example, a program that goes `hash = {}; hash << 1; hash[:key] = :value` would likely be a mistake, but that couldn't be easily detected without explicit type annotation.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89895

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102044] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (22 preceding siblings ...)
  2021-01-13  1:26 ` [ruby-core:102043] " knu
@ 2021-01-13  1:48 ` knu
  2021-01-13  3:31 ` [ruby-core:102046] " marcandre-ruby-core
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: knu @ 2021-01-13  1:48 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by knu (Akinori MUSHA).


knu (Akinori MUSHA) wrote in #note-29:
> For example, a program that goes `hash = {}; hash << 1; hash[:key] = :value` would likely be a mistake, but that couldn't be easily detected without explicit type annotation.

Well, even with type annotation as long as it is a direct Hash instance.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89896

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102046] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (23 preceding siblings ...)
  2021-01-13  1:48 ` [ruby-core:102044] " knu
@ 2021-01-13  3:31 ` marcandre-ruby-core
  2021-01-13  5:42 ` [ruby-core:102047] " matz
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: marcandre-ruby-core @ 2021-01-13  3:31 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by marcandre (Marc-Andre Lafortune).


I'm sorry, I am completely confused by this discussion, I can't make sense of it.

From what I read:

```ruby
set = {:a, :b, :c} # or whatever notation is used
set.class # => Hash ???
set[:a] # => Hash::SetDummy ???
set[:elem] = :something_else # => what does that mean?
set.to_a # => [:a, :b, :c, [:elem, :something_else]] ???
set.transform_values { ... } ???
set + [1, 2, 3] # => ???
set + Set[:d, :e] # => ???
```

Am I understanding correctly?

duerst (Martin Dürst) wrote in #note-27:
> Using `Hash` for sets eliminates the efficiency problem, as we know.

Are you talking about using `Hash` internally, or having Hash's class and API?

> Another advantage of using Hash may be that we can get to a syntax that is very close to the actual Mathematical set syntax. We would like to write `{1, 2, 3}`, which may even be possible.

I don't understand how this might be related in any way to using a Hash? What meaning we decide to give to `{a, b, c}` is completely independent to what `{a: :a, b: :b}` means, or to what `%w{a, b, c}` means, or to `{|a, b| c}` for that matter.

> At the least, it should be possible to move to the following: `{1 => Hash::DummyValue, 2, 3}`, i.e. if the first key has a special value, then the remaining values can be left out. We may want to choose a better name for `Hash::DummyValue`, maybe something like `Hash::SetDummy`.

Are you suggesting that instead of writing `Set[1, 2, 3]` we would literally write `{1 => Hash::SetDummy, 2, 3}`?


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89898

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102047] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (24 preceding siblings ...)
  2021-01-13  3:31 ` [ruby-core:102046] " marcandre-ruby-core
@ 2021-01-13  5:42 ` matz
  2021-02-22 23:48 ` [ruby-core:102574] " blogger
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: matz @ 2021-01-13  5:42 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by matz (Yukihiro Matsumoto).


I agree with adding `Set` by default. But no set literals nor `SortedSet` at the moment.

Matz.


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-89903

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102574] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (25 preceding siblings ...)
  2021-01-13  5:42 ` [ruby-core:102047] " matz
@ 2021-02-22 23:48 ` blogger
  2021-02-23 12:39 ` [ruby-core:102577] " eregontp
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: blogger @ 2021-02-22 23:48 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Student (Nathan Zook).


marcandre (Marc-Andre Lafortune) wrote in #note-31:
> I'm sorry, I am completely confused by this discussion, I can't make sense of it.
> 
> From what I read:
> 
> ```ruby
> set = {:a, :b, :c} # or whatever notation is used
> set.class # => Hash ???
> set[:a] # => Hash::SetDummy ???
> set[:elem] = :something_else # => what does that mean?
> set.to_a # => [:a, :b, :c, [:elem, :something_else]] ???
> set.transform_values { ... } ???
> set + [1, 2, 3] # => ???
> set + Set[:d, :e] # => ???
> ```
> 
> Am I understanding correctly?
> 
> duerst (Martin Dürst) wrote in #note-27:
> > Using `Hash` for sets eliminates the efficiency problem, as we know.
> 
> Are you talking about using `Hash` internally, or having Hash's class and API?
> 
> > Another advantage of using Hash may be that we can get to a syntax that is very close to the actual Mathematical set syntax. We would like to write `{1, 2, 3}`, which may even be possible.
> 
> I don't understand how this might be related in any way to using a Hash? What meaning we decide to give to `{a, b, c}` is completely independent to what `{a: :a, b: :b}` means, or to what `%w{a, b, c}` means, or to `{|a, b| c}` for that matter.
> 
> > At the least, it should be possible to move to the following: `{1 => Hash::DummyValue, 2, 3}`, i.e. if the first key has a special value, then the remaining values can be left out. We may want to choose a better name for `Hash::DummyValue`, maybe something like `Hash::SetDummy`.
> 
> Are you suggesting that instead of writing `Set[1, 2, 3]` we would literally write `{1 => Hash::SetDummy, 2, 3}`?

THIS.

Sets are in no way related to Hashes.  The fact that they are/can be *implemented* as hash keys is convenient, but surfacing such an implementations detail violates programming best practice to a crazy amount.


----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-90558

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:102577] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (26 preceding siblings ...)
  2021-02-22 23:48 ` [ruby-core:102574] " blogger
@ 2021-02-23 12:39 ` eregontp
  2021-04-28  6:05 ` [ruby-core:103638] " grzegorz.jakubiak
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: eregontp @ 2021-02-23 12:39 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by Eregon (Benoit Daloze).


I don't think mixing Hash and Set is good at all. They have fundamentally different APIs.

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-90561

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:103638] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (27 preceding siblings ...)
  2021-02-23 12:39 ` [ruby-core:102577] " eregontp
@ 2021-04-28  6:05 ` grzegorz.jakubiak
  2021-10-23 21:04 ` [ruby-core:105760] " greggzst (Grzegorz Jakubiak)
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: grzegorz.jakubiak @ 2021-04-28  6:05 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by greggzst (Grzegorz Jakubiak).


Is there a chance this feature lands in 3.1?

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-91734

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:105760] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (28 preceding siblings ...)
  2021-04-28  6:05 ` [ruby-core:103638] " grzegorz.jakubiak
@ 2021-10-23 21:04 ` greggzst (Grzegorz Jakubiak)
  2022-02-17 10:12 ` [ruby-core:107623] " knu (Akinori MUSHA)
  2022-02-17 12:13 ` [ruby-core:107628] " matz (Yukihiro Matsumoto)
  31 siblings, 0 replies; 33+ messages in thread
From: greggzst (Grzegorz Jakubiak) @ 2021-10-23 21:04 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by greggzst (Grzegorz Jakubiak).


Any update on that?

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-94269

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:107623] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (29 preceding siblings ...)
  2021-10-23 21:04 ` [ruby-core:105760] " greggzst (Grzegorz Jakubiak)
@ 2022-02-17 10:12 ` knu (Akinori MUSHA)
  2022-02-17 12:13 ` [ruby-core:107628] " matz (Yukihiro Matsumoto)
  31 siblings, 0 replies; 33+ messages in thread
From: knu (Akinori MUSHA) @ 2022-02-17 10:12 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by knu (Akinori MUSHA).


https://github.com/ruby/ruby/pull/5563

----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-96534

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [ruby-core:107628] [Ruby master Feature#16989] Sets: need ♥️
  2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
                   ` (30 preceding siblings ...)
  2022-02-17 10:12 ` [ruby-core:107623] " knu (Akinori MUSHA)
@ 2022-02-17 12:13 ` matz (Yukihiro Matsumoto)
  31 siblings, 0 replies; 33+ messages in thread
From: matz (Yukihiro Matsumoto) @ 2022-02-17 12:13 UTC (permalink / raw)
  To: ruby-core

Issue #16989 has been updated by matz (Yukihiro Matsumoto).


I agree with merging the pull request above, and see how it goes.

Matz.



----------------------------------------
Feature #16989: Sets: need ♥️
https://bugs.ruby-lang.org/issues/16989#change-96539

* Author: marcandre (Marc-Andre Lafortune)
* Status: Assigned
* Priority: Normal
* Assignee: knu (Akinori MUSHA)
----------------------------------------
I am opening a series of feature requests on `Set`, all of them based on this usecase.

The main usecase I have in mind is my recent experience with `RuboCop`. I noticed a big number of frozen arrays being used only to later call `include?` on them. This is `O(n)` instead of `O(1)`.

Trying to convert them to `Set`s causes major compatibility issues, as well as very frustrating situations and some cases that would make them much less efficient.

Because of these incompatibilities, `RuboCop` is in the process of using a custom class based on `Array` with optimized `include?` and `===`. `RuboCop` runs multiple checks on Ruby code. Those checks are called cops. `RuboCop` performance is (IMO) pretty bad and some cops  currently are in `O(n^2)` where n is the size of the code being inspected. Even given these extremely inefficient cops, optimizing the 100+ such arrays (most of which are quite small btw) gave a 5% speed boost.

RuboCop PRs for reference: https://github.com/rubocop-hq/rubocop-ast/pull/29
https://github.com/rubocop-hq/rubocop/pull/8133

My experience tells me that there are many other opportunities to use `Set`s that are missed because `Set`s are not builtin, not known enough and have no shorthand notation.

In this issue I'd like to concentrate the discussion on the following request: `Set`s should be core objects, in the same way that `Complex` were not and are now. Some of the upcoming feature requests would be easier (or only possible) to implement were `Set`s builtin.



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-02-17 12:13 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-26 20:18 [ruby-core:98964] [Ruby master Feature#16989] Sets: need ♥️ marcandre-ruby-core
2020-06-27  2:14 ` [ruby-core:98973] " nobu
2020-08-24  4:13 ` [ruby-core:99677] " mame
2020-08-24 12:55 ` [ruby-core:99680] " marcandre-ruby-core
2020-08-24 15:12 ` [ruby-core:99683] " daniel
2020-09-02  9:52 ` [ruby-core:99834] " eregontp
2020-09-02 13:44 ` [ruby-core:99838] " marcandre-ruby-core
2020-09-02 14:45 ` [ruby-core:99843] " daniel
2020-09-02 21:00 ` [ruby-core:99853] " eregontp
2020-09-02 21:01 ` [ruby-core:99854] " eregontp
2020-09-02 21:15 ` [ruby-core:99855] " marcandre-ruby-core
2020-09-03  1:14 ` [ruby-core:99859] " matz
2020-09-03  7:48 ` [ruby-core:99865] " knu
2020-09-03 21:44 ` [ruby-core:99901] " marcandre-ruby-core
2020-09-03 21:44 ` [ruby-core:99902] " marcandre-ruby-core
2020-12-27 10:24 ` [ruby-core:101736] " mame
2021-01-12  4:46 ` [ruby-core:102013] " akr
2021-01-12  4:57 ` [ruby-core:102014] " knu
2021-01-12 16:05 ` [ruby-core:102033] " marcandre-ruby-core
2021-01-12 16:26 ` [ruby-core:102034] " akr
2021-01-12 16:42 ` [ruby-core:102035] " marcandre-ruby-core
2021-01-13  0:13 ` [ruby-core:102040] " duerst
2021-01-13  1:26 ` [ruby-core:102042] " akr
2021-01-13  1:26 ` [ruby-core:102043] " knu
2021-01-13  1:48 ` [ruby-core:102044] " knu
2021-01-13  3:31 ` [ruby-core:102046] " marcandre-ruby-core
2021-01-13  5:42 ` [ruby-core:102047] " matz
2021-02-22 23:48 ` [ruby-core:102574] " blogger
2021-02-23 12:39 ` [ruby-core:102577] " eregontp
2021-04-28  6:05 ` [ruby-core:103638] " grzegorz.jakubiak
2021-10-23 21:04 ` [ruby-core:105760] " greggzst (Grzegorz Jakubiak)
2022-02-17 10:12 ` [ruby-core:107623] " knu (Akinori MUSHA)
2022-02-17 12:13 ` [ruby-core:107628] " matz (Yukihiro Matsumoto)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).