[ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses

ruby-core@ruby-lang.org archive (unofficial mirror)

* [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses
@ 2023-11-27 20:16 kddnewton (Kevin Newton) via ruby-core
  2023-11-28 10:28 ` [ruby-core:115503] " Eregon (Benoit Daloze) via ruby-core
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: kddnewton (Kevin Newton) via ruby-core @ 2023-11-27 20:16 UTC
  To: ruby-core; +Cc: kddnewton (Kevin Newton)

Issue #20024 has been reported by kddnewton (Kevin Newton).

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024

* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------
There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. To provide richer information, most of them switch on the message of the error being returned, as in:

https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/lib/irb/ruby-lex.rb#L220-L267

Within ruby/spec, specific error messages are required for these kinds of errors. This supports an implicit interface in which syntax errors have a hidden type that is expressed only through their message. For example:

https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/if_spec.rb#L323
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/numbered_parameters_spec.rb#L31
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/pattern_matching_spec.rb#L210
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/rescue_spec.rb#L262
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/yield_spec.rb#L196

It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best.

I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling.
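As a rough sketch of what rescuing a subclass could look like: the class name `NumberedParameterSyntaxError` below is hypothetical (Ruby does not define it), and the helper `classify` exists only for illustration. Code that already rescues `SyntaxError` would keep working, since the subclass is still a `SyntaxError`.

```ruby
# Hypothetical subclass for illustration only; Ruby does not define it.
class NumberedParameterSyntaxError < SyntaxError; end

# Raises the given error and reports which rescue clause caught it.
def classify(error)
  raise error
rescue NumberedParameterSyntaxError
  # Matched by type, so the wording of the message is irrelevant here.
  :numbered_parameter
rescue SyntaxError
  :generic_syntax_error
end

specific = NumberedParameterSyntaxError.new("Numbered parameter is already used in outer scope")
generic  = SyntaxError.new("some other syntax error")

puts classify(specific) # numbered_parameter
puts classify(generic)  # generic_syntax_error
```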

Letting the messages change would also let other tools enhance or replace them - for example by providing recovery information or translating them.

This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing.
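The case-sensitivity problem can be reproduced without any parser. This minimal sketch builds two `SyntaxError` objects by hand, using the two message strings quoted above, and checks them against the spec's pattern. The constant `SPEC_PATTERN` and the helper `numbered_parameter_error?` are names made up for this example.

```ruby
# The pattern required by the spec, as quoted above. It is case-sensitive.
SPEC_PATTERN = /numbered parameter is already used in/

# Message-based detection, in the style tools use today.
def numbered_parameter_error?(error)
  error.is_a?(SyntaxError) && SPEC_PATTERN.match?(error.message)
end

lowercase_error   = SyntaxError.new("numbered parameter is already used in outer scope")
capitalized_error = SyntaxError.new("Numbered parameter is already used in outer scope")

puts numbered_parameter_error?(lowercase_error)   # true
puts numbered_parameter_error?(capitalized_error) # false: only the capital "N" differs
```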

-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ruby-core:115503] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
@ 2023-11-28 10:28 ` Eregon (Benoit Daloze) via ruby-core
  2023-11-28 13:35 ` [ruby-core:115506] " kddnewton (Kevin Newton) via ruby-core
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2023-11-28 10:28 UTC
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #20024 has been updated by Eregon (Benoit Daloze).

There are dozens of different error messages for SyntaxError, so that would mean dozens of subclasses.
That seems like a lot to me.
Also it would be equally hard to name all these subclasses.
And we could not rename them easily either (more easily than changing messages but still).

> Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing.

I think we are testing what is necessary. As you can see, IRB does not allow for case changes there because it does not expect them.
Many other gems expect a very specific exception message (not just for SyntaxError).
So I think it's clear Prism needs to have the exact same messages whenever possible, when used as a Ruby parser for running Ruby code.

For tooling I think having another more verbose/clearer/improved set of messages would make sense (as suggested by Andrii on the Prism issue tracker). I.e. the gem could provide improved messages (maybe as an option?).

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105432


* [ruby-core:115506] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
  2023-11-28 10:28 ` [ruby-core:115503] " Eregon (Benoit Daloze) via ruby-core
@ 2023-11-28 13:35 ` kddnewton (Kevin Newton) via ruby-core
  2023-11-28 13:44 ` [ruby-core:115507] " byroot (Jean Boussier) via ruby-core
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: kddnewton (Kevin Newton) via ruby-core @ 2023-11-28 13:35 UTC
  To: ruby-core; +Cc: kddnewton (Kevin Newton)

Issue #20024 has been updated by kddnewton (Kevin Newton).

I understand the need for compatibility with regard to Prism, but this ticket isn't about Prism specifically. I'm proposing this as a solution to the problem in general, devoid of the Prism context.

Regardless of other parsers, there exists a problem that tools are relying on error messages for behavior. We can fix that by introducing subclasses. Of course it won't fix the immediate situation for anyone, but going forward it will be a much better situation.

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105436


* [ruby-core:115507] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
  2023-11-28 10:28 ` [ruby-core:115503] " Eregon (Benoit Daloze) via ruby-core
  2023-11-28 13:35 ` [ruby-core:115506] " kddnewton (Kevin Newton) via ruby-core
@ 2023-11-28 13:44 ` byroot (Jean Boussier) via ruby-core
  2023-11-28 15:19 ` [ruby-core:115508] " Eregon (Benoit Daloze) via ruby-core
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2023-11-28 13:44 UTC
  To: ruby-core; +Cc: byroot (Jean Boussier)

Issue #20024 has been updated by byroot (Jean Boussier).

Rather than subclasses, which have the annoying issue of having to stick around even if for some reason the case no longer exists, could we simply expose `SyntaxError#issue` (the name isn't very inspired) that would return a symbol?

e.g. `:void_value`, `:number_parameter_used`, etc.
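A minimal sketch of how a tool might consume such a symbol. `SyntaxError#issue` does not exist in Ruby, so the example uses a hypothetical subclass, `TaggedSyntaxError`, to carry the symbol; the helper `describe` and the text it returns are likewise assumptions made for illustration.

```ruby
# Hypothetical stand-in for a SyntaxError that exposes an `issue` symbol.
class TaggedSyntaxError < SyntaxError
  attr_reader :issue

  def initialize(message, issue:)
    super(message)
    @issue = issue
  end
end

# Dispatches on the symbol instead of parsing the message.
def describe(error)
  case error.issue
  when :void_value            then "void value expression"
  when :number_parameter_used then "numbered parameter already in use"
  else                             "unrecognized syntax error"
  end
end

error = TaggedSyntaxError.new("Message wording is free to change", issue: :void_value)
puts describe(error) # void value expression
```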

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105437


* [ruby-core:115508] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (2 preceding siblings ...)
  2023-11-28 13:44 ` [ruby-core:115507] " byroot (Jean Boussier) via ruby-core
@ 2023-11-28 15:19 ` Eregon (Benoit Daloze) via ruby-core
  2023-11-28 15:27 ` [ruby-core:115509] " Dan0042 (Daniel DeLorme) via ruby-core
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2023-11-28 15:19 UTC
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #20024 has been updated by Eregon (Benoit Daloze).

A Symbol per type of SyntaxError seems easier to maintain and wouldn't create so many classes/constants/etc which would be messy in docs.
I think that's a much easier improvement to do than adding a bunch of subclasses.
It's also better memory-wise as classes are not small.

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105438


* [ruby-core:115509] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (3 preceding siblings ...)
  2023-11-28 15:19 ` [ruby-core:115508] " Eregon (Benoit Daloze) via ruby-core
@ 2023-11-28 15:27 ` Dan0042 (Daniel DeLorme) via ruby-core
  2023-11-28 17:11 ` [ruby-core:115510] " kddnewton (Kevin Newton) via ruby-core
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Dan0042 (Daniel DeLorme) via ruby-core @ 2023-11-28 15:27 UTC
  To: ruby-core; +Cc: Dan0042 (Daniel DeLorme)

Issue #20024 has been updated by Dan0042 (Daniel DeLorme).

byroot (Jean Boussier) wrote in #note-3:
> Rather than subclasses, which have the annoying issue of having to stick around even if for some reason the case no longer exists, could we simply expose `SyntaxError#issue` (the name isn't very inspired) that would return a symbol?

That sounds good to me. It's much less cumbersome than dozens of subclasses, and has much better forward-compatibility. If this is introduced in Ruby 3.3 but a "foo" syntax error is added to Ruby 3.4, then `err.is_a?(FooSyntaxError)` would raise a `NameError` (uninitialized constant) in Ruby 3.3 but `err.message_code == :foo` would not.
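The difference can be sketched as follows. `CodedSyntaxError` and its `message_code` reader are hypothetical stand-ins for the proposed API, and `FooSyntaxError` is deliberately left undefined to play the role of a class that only a later Ruby version would provide.

```ruby
# Hypothetical stand-in for a SyntaxError that exposes a symbol code.
class CodedSyntaxError < SyntaxError
  attr_reader :message_code

  def initialize(message, message_code:)
    super(message)
    @message_code = message_code
  end
end

err = CodedSyntaxError.new("illustrative message", message_code: :void_value)

# Comparing against a symbol this Ruby has never heard of is just false.
symbol_check = (err.message_code == :foo)

# Referencing a constant this Ruby does not define raises NameError.
constant_check =
  begin
    err.is_a?(FooSyntaxError)
  rescue NameError => e
    e.class
  end

puts symbol_check.inspect   # false
puts constant_check.inspect # NameError
```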

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105439


* [ruby-core:115510] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (4 preceding siblings ...)
  2023-11-28 15:27 ` [ruby-core:115509] " Dan0042 (Daniel DeLorme) via ruby-core
@ 2023-11-28 17:11 ` kddnewton (Kevin Newton) via ruby-core
  2023-11-28 17:17 ` [ruby-core:115511] " zverok (Victor Shepelev) via ruby-core
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: kddnewton (Kevin Newton) via ruby-core @ 2023-11-28 17:11 UTC
  To: ruby-core; +Cc: kddnewton (Kevin Newton)

Issue #20024 has been updated by kddnewton (Kevin Newton).

I think I am okay with that compromise. I don't like that you won't be able to rescue specific errors, but I understand the concern about forward compatibility.

My other desire for subclasses is that different syntax errors have different information on them. For example, `_1 = 1` is a syntax error that has the additional information that you're assigning to `_1`. Most libraries right now parse that out of the message, but if we had separate objects then we could define the shapes of the errors to have that information as a field. For example, that would be the `name` or `identifier` field. This could probably be addressed with something like a `metadata` hash or something else, if we're just adding fields to SyntaxError.
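A rough sketch of the `metadata` idea, assuming a hypothetical `DetailedSyntaxError` class. The `issue` symbol, the `:name` key, and the message text are all invented for illustration and are not part of any existing Ruby API.

```ruby
# Hypothetical error type carrying structured details alongside the message.
class DetailedSyntaxError < SyntaxError
  attr_reader :issue, :metadata

  def initialize(message, issue:, metadata: {})
    super(message)
    @issue = issue
    # Copy and freeze so callers cannot mutate the error's details later.
    @metadata = metadata.dup.freeze
  end
end

# Roughly what a tool might receive for `_1 = 1` (message text is illustrative).
error = DetailedSyntaxError.new(
  "illustrative message about assigning to _1",
  issue: :numbered_parameter_assignment,
  metadata: { name: :_1 }
)

# The identifier is read from a field rather than parsed out of the message.
puts error.metadata[:name].inspect # :_1
```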

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105440


* [ruby-core:115511] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (5 preceding siblings ...)
  2023-11-28 17:11 ` [ruby-core:115510] " kddnewton (Kevin Newton) via ruby-core
@ 2023-11-28 17:17 ` zverok (Victor Shepelev) via ruby-core
  2023-11-28 17:23 ` [ruby-core:115513] " jeremyevans0 (Jeremy Evans) via ruby-core
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: zverok (Victor Shepelev) via ruby-core @ 2023-11-28 17:17 UTC
  To: ruby-core; +Cc: zverok (Victor Shepelev)

Issue #20024 has been updated by zverok (Victor Shepelev).

>  I don't like that you won't be able to rescue specific errors, but I understand the concern about forward compatibility.

As an aside: in general, the `rescue` clause already looks a lot like pattern matching, and if we continue with this approach (adding causes/details to exceptions), we might one day want to be able to write...

```ruby
rescue SyntaxError(issue: :void_value) => ex
```

Just a thought (but might have many interesting consequences).
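The syntax above is not valid Ruby today. Something close can be approximated with current Ruby, because `rescue` accepts any class or module and tests it with `===`. The sketch below relies on that; `TaggedSyntaxError`, `syntax_error_with`, and `rescue_by_issue` are hypothetical names used only for this example.

```ruby
# Hypothetical stand-in for a SyntaxError that exposes an `issue` symbol.
class TaggedSyntaxError < SyntaxError
  attr_reader :issue

  def initialize(message, issue:)
    super(message)
    @issue = issue
  end
end

# Builds an anonymous module whose === matches SyntaxErrors with the given issue.
# `rescue` only requires a class or module and calls === on it.
def syntax_error_with(issue)
  Module.new do
    define_singleton_method(:===) do |error|
      error.is_a?(SyntaxError) && error.respond_to?(:issue) && error.issue == issue
    end
  end
end

def rescue_by_issue(issue)
  raise TaggedSyntaxError.new("illustrative message", issue: issue)
rescue syntax_error_with(:void_value) => ex
  "void value: #{ex.message}"
rescue SyntaxError => ex
  "other: #{ex.issue}"
end

puts rescue_by_issue(:void_value)            # void value: illustrative message
puts rescue_by_issue(:number_parameter_used) # other: number_parameter_used
```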

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105441


* [ruby-core:115513] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (6 preceding siblings ...)
  2023-11-28 17:17 ` [ruby-core:115511] " zverok (Victor Shepelev) via ruby-core
@ 2023-11-28 17:23 ` jeremyevans0 (Jeremy Evans) via ruby-core
  2023-11-28 17:25 ` [ruby-core:115515] " kddnewton (Kevin Newton) via ruby-core
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: jeremyevans0 (Jeremy Evans) via ruby-core @ 2023-11-28 17:23 UTC (permalink / raw
  To: ruby-core; +Cc: jeremyevans0 (Jeremy Evans)

Issue #20024 has been updated by jeremyevans0 (Jeremy Evans).

zverok (Victor Shepelev) wrote in #note-7:
> An aside note on this: in general, `rescue` clause is already looking very alike pattern-matching, and if we continue this approach (with adding causes/details to exceptions), we might once want to be able to...
> 
> ```ruby
> rescue SyntaxError(issue: :void_value) => ex
> ```

That's already valid syntax, which calls the `SyntaxError` method with an `issue` keyword argument (the method should return a class or module to work correctly).
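
To illustrate, here is a minimal sketch of how that syntax behaves today. The `SyntaxError` method and the `classify` helper are hypothetical, written only for this example. The message check inside `===` is an assumption that stands in for type information `SyntaxError` does not currently carry.

```ruby
# Hypothetical helper: a capitalized method that returns a module.
# `rescue` accepts any class or module and calls `===` on it to decide
# whether the clause matches the raised exception.
def SyntaxError(issue:)
  Module.new do
    define_singleton_method(:===) do |error|
      # Assumption: the issue name, with underscores replaced by spaces,
      # appears somewhere in the error message.
      ::SyntaxError === error && error.message.include?(issue.to_s.tr("_", " "))
    end
  end
end

# Hypothetical helper that reports which rescue clause was taken.
def classify
  yield
  :ok
rescue SyntaxError(issue: :void_value)
  # The parentheses make this a method call, not a constant lookup.
  :void_value
rescue SyntaxError
  :other
end

classify { raise SyntaxError, "void value expression" }  # matched by the first clause
classify { raise SyntaxError, "unexpected end-of-input" } # falls through to the second
```

Because the clause is evaluated as an ordinary method call, giving it pattern-matching semantics would change the meaning of code that already defines such methods.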

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105442


* [ruby-core:115515] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (7 preceding siblings ...)
  2023-11-28 17:23 ` [ruby-core:115513] " jeremyevans0 (Jeremy Evans) via ruby-core
@ 2023-11-28 17:25 ` kddnewton (Kevin Newton) via ruby-core
  2023-11-28 17:31 ` [ruby-core:115516] " zverok (Victor Shepelev) via ruby-core
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: kddnewton (Kevin Newton) via ruby-core @ 2023-11-28 17:25 UTC (permalink / raw
  To: ruby-core; +Cc: kddnewton (Kevin Newton)

Issue #20024 has been updated by kddnewton (Kevin Newton).

@zverok That's an interesting proposal but let's move that discussion to another issue. I don't want this to become a syntax discussion because that could take us in a totally different direction.

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105444


* [ruby-core:115516] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (8 preceding siblings ...)
  2023-11-28 17:25 ` [ruby-core:115515] " kddnewton (Kevin Newton) via ruby-core
@ 2023-11-28 17:31 ` zverok (Victor Shepelev) via ruby-core
  2023-11-28 22:41 ` [ruby-core:115527] " Dan0042 (Daniel DeLorme) via ruby-core
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: zverok (Victor Shepelev) via ruby-core @ 2023-11-28 17:31 UTC (permalink / raw
  To: ruby-core; +Cc: zverok (Victor Shepelev)

Issue #20024 has been updated by zverok (Victor Shepelev).

@jeremyevans0 
> That's already valid syntax

I am aware; I just have a feeling that repurposing it for pattern matching might still be possible (the chance that a lot of code uses this syntax with `UpcasedMethods` to produce exception classes dynamically is relatively low, though, I admit, not non-existent).

Anyway, I think @kddnewton is correct: this is not the best place, and the idea is very rough. I am sorry for hijacking the discussion; I just got a sudden idea.

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105445


* [ruby-core:115527] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (9 preceding siblings ...)
  2023-11-28 17:31 ` [ruby-core:115516] " zverok (Victor Shepelev) via ruby-core
@ 2023-11-28 22:41 ` Dan0042 (Daniel DeLorme) via ruby-core
  2023-11-30  3:55 ` [ruby-core:115533] " yui-knk (Kaneko Yuichiro) via ruby-core
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Dan0042 (Daniel DeLorme) via ruby-core @ 2023-11-28 22:41 UTC (permalink / raw
  To: ruby-core; +Cc: Dan0042 (Daniel DeLorme)

Issue #20024 has been updated by Dan0042 (Daniel DeLorme).

This idea of subclasses/identifier is coming up in the context of Prism and SyntaxError, but I think it applies to any kind of Exception, so I would like to see this #issue added to Exception rather than just SyntaxError. For pretty much any type of Exception I see things like this:

```ruby
begin
  # ... code that may raise SomeError
rescue SomeError => err
  if err.message =~ /some pattern/
    # ignore the error, or do something else
  else
    raise
  end
end
```

Ideally this `SomeError` would have a subclass that allows catching the exact error subtype, but in practice that doesn't always happen, and it is often out of our control. It would be really nice to be able to compare against a Symbol rather than a string message, especially for a message that might be expensive to generate (#18285).
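
As a rough sketch of what comparing against a Symbol could look like: the `issue` attribute, the `issue:` keyword and the `handle` helper below are all hypothetical and not part of any existing exception API.

```ruby
# Hypothetical error class carrying a Symbol identifier next to its message.
class SomeError < StandardError
  attr_reader :issue

  def initialize(message = nil, issue: nil)
    super(message)
    @issue = issue
  end
end

# Hypothetical helper: ignores one specific kind of SomeError, re-raises the rest.
def handle
  yield
  :completed
rescue SomeError => err
  # Compare the Symbol instead of matching the message against a regex.
  raise unless err.issue == :some_pattern
  :ignored
end

handle { raise SomeError.new("some pattern happened", issue: :some_pattern) }
```

The message can then be reworded, translated or generated lazily without affecting the rescue logic.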

---

(PS: @zverok I think your idea is *very much* worth exploring)

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105457


* [ruby-core:115533] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (10 preceding siblings ...)
  2023-11-28 22:41 ` [ruby-core:115527] " Dan0042 (Daniel DeLorme) via ruby-core
@ 2023-11-30  3:55 ` yui-knk (Kaneko Yuichiro) via ruby-core
  2023-11-30  8:50 ` [ruby-core:115544] " ioquatix (Samuel Williams) via ruby-core
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: yui-knk (Kaneko Yuichiro) via ruby-core @ 2023-11-30  3:55 UTC (permalink / raw
  To: ruby-core; +Cc: yui-knk (Kaneko Yuichiro)

Issue #20024 has been updated by yui-knk (Kaneko Yuichiro).

# SyntaxError includes multiple errors

A single `SyntaxError` can contain multiple errors. In the example below, two errors are reported in one `SyntaxError`, so we need to consider how to handle such cases.

```ruby
begin
  eval <<~CODE
    def m
      retry

  CODE
rescue SyntaxError => e
  puts e.message
end

# (eval at test.rb:2):2: Invalid retry without rescue
#   retry
#   ^~~~~
# (eval at test.rb:2):3: syntax error, unexpected end-of-input, expecting `end' or dummy end
```

To give `SyntaxError` users rich information, we need to avoid losing any of it. Therefore the following are not acceptable solutions to this problem:
* Merging multiple errors into one `SyntaxError` subclass, because this is misleading if the errors are of different types.
* Using only a single error and ignoring the following errors, because information is lost.

# irb use case and error tolerance parser

Regarding the [irb use case](https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/lib/irb/ruby-lex.rb#L220-L267), irb categorizes an error that can be fixed by adding tokens to the end of the input as `recoverable_error`, and an error that can only be fixed by deleting tokens as `unrecoverable_error`, so that irb can decide whether to ask for more input.

When irb was created, an error-tolerant parser didn't exist. Therefore irb has to work out how to recover the input by itself, using `SyntaxError#message`. However, recovering from errors is the parser's responsibility.

irb can categorize syntax errors if an error-tolerant parser provides information about how it recovered from them.
If there is only a single error and its recovery requires only token insertions at the end of the input, it is a `recoverable_error`.

By the way, does the parser always raise `SyntaxError` for invalid input? For this use case, it is better not to raise `SyntaxError` and instead let users ask the parser, after parsing, whether there were syntax errors.

# Proposal

Considering these points, my proposal is to define a parser interface:

* The parser can run in a mode that does not raise `SyntaxError`
* The parser provides a method to get the syntax errors
* Each syntax error includes
  * message: "syntax error, unexpected end-of-input, expecting `end' or dummy end"
  * location: (1,0)-(1,1), first/last & line/column
  * operations for recovery:
    * insert / delete
    * location of recovery
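
A minimal sketch of how a tool such as irb might consume this interface follows. Every name in it (`ErrorLocation`, `RecoveryStep`, `ParserError`, `recoverable?`) is hypothetical; no such API exists in Ruby today. It only encodes the rule stated above: a single error whose recovery consists solely of insertions at the end of the input.

```ruby
# Hypothetical data shapes for the proposed interface.
ErrorLocation = Struct.new(:start_line, :start_column, :end_line, :end_column)
RecoveryStep  = Struct.new(:operation, :location) # operation is :insert or :delete
ParserError   = Struct.new(:message, :location, :recoveries)

# Returns true when more input could make the source valid: exactly one error,
# and every recovery step is an insertion at the end of the input.
def recoverable?(errors, end_of_input)
  return false unless errors.size == 1

  steps = errors.first.recoveries
  steps.any? && steps.all? do |step|
    step.operation == :insert && step.location == end_of_input
  end
end

# Example data written by hand; a real parser would produce these objects.
eof = ErrorLocation.new(3, 0, 3, 0)
missing_end = ParserError.new(
  "syntax error, unexpected end-of-input",
  eof,
  [RecoveryStep.new(:insert, eof)]
)

recoverable?([missing_end], eof) # => true
```

With this shape, the decision depends only on structured recovery data, not on the wording of the message.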

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105468


* [ruby-core:115544] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (11 preceding siblings ...)
  2023-11-30  3:55 ` [ruby-core:115533] " yui-knk (Kaneko Yuichiro) via ruby-core
@ 2023-11-30  8:50 ` ioquatix (Samuel Williams) via ruby-core
  2023-12-04 17:23 ` [ruby-core:115586] " kddnewton (Kevin Newton) via ruby-core
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ioquatix (Samuel Williams) via ruby-core @ 2023-11-30  8:50 UTC (permalink / raw
  To: ruby-core; +Cc: ioquatix (Samuel Williams)

Issue #20024 has been updated by ioquatix (Samuel Williams).

I'm interested in this topic and proposed a while ago adding `SyntaxError#diagnostics`, which would provide more detailed information about parse errors and how to rectify them. I'd be interested in discussing this in more detail.

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105479


* [ruby-core:115586] [Ruby master Feature#20024] SyntaxError subclasses
  2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
                   ` (12 preceding siblings ...)
  2023-11-30  8:50 ` [ruby-core:115544] " ioquatix (Samuel Williams) via ruby-core
@ 2023-12-04 17:23 ` kddnewton (Kevin Newton) via ruby-core
  2023-12-07  2:58 ` [ruby-core:115615] " yui-knk (Kaneko Yuichiro) via ruby-core
  2024-01-10 22:46 ` [ruby-core:116162] " palkan (Vladimir Dementyev) via ruby-core
  15 siblings, 0 replies; 17+ messages in thread
From: kddnewton (Kevin Newton) via ruby-core @ 2023-12-04 17:23 UTC (permalink / raw
  To: ruby-core; +Cc: kddnewton (Kevin Newton)

Issue #20024 has been updated by kddnewton (Kevin Newton).

@yui-knk your proposal matches how prism works today. Prism never raises a syntax error; it returns a list of errors that each have (exactly as you have said) a type, a message, and a location.

I agree that a single raised error is not great because it loses information. I like the idea that @ioquatix has proposed of `SyntaxError#diagnostics`. With such a proposal, we could attach all syntax errors in the file to a single raised error, and you could query them appropriately.

In this case we would need to design the shape of a `SyntaxError::Diagnostic` class. I would imagine it would have the same fields that @yui-knk is proposing, something like:

```ruby
class SyntaxError::Location
  attr_reader :start_line, :end_line # 1-indexed
  attr_reader :start_column, :end_column # in bytes
end

class SyntaxError::Diagnostic
  attr_reader :location # SyntaxError::Location
  attr_reader :message # String
  attr_reader :type # Symbol
end

class SyntaxError
  attr_reader :diagnostics # Array[SyntaxError::Diagnostic]
end
```

Would that be amenable?
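
For illustration, a tool might query such diagnostics roughly as follows. This is only a sketch under assumptions: `DiagnosticLocation`, `Diagnostic`, `DiagnosticSyntaxError` and `diagnostic_types` are stand-ins for the proposed classes, and the type symbols are invented for the example.

```ruby
# Hypothetical stand-ins for the proposed classes; nothing here exists in Ruby.
DiagnosticLocation = Struct.new(:start_line, :start_column, :end_line, :end_column)
Diagnostic = Struct.new(:location, :message, :type)

# A SyntaxError subclass is used so that the built-in class is left untouched.
class DiagnosticSyntaxError < SyntaxError
  attr_reader :diagnostics

  def initialize(diagnostics)
    @diagnostics = diagnostics
    # The combined message keeps every individual message, one per line.
    super(diagnostics.map(&:message).join("\n"))
  end
end

# Returns the types of all diagnostics attached to a raised error,
# or an empty array when the block raises nothing.
def diagnostic_types
  yield
  []
rescue DiagnosticSyntaxError => error
  error.diagnostics.map(&:type)
end

error = DiagnosticSyntaxError.new([
  Diagnostic.new(DiagnosticLocation.new(2, 2, 2, 7), "Invalid retry without rescue", :invalid_retry),
  Diagnostic.new(DiagnosticLocation.new(3, 0, 3, 0), "unexpected end-of-input", :unexpected_end_of_input)
])

diagnostic_types { raise error }
```

A single raised error then still carries every individual diagnostic, so no information is lost.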

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105519

* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------
There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in:

https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/lib/irb/ruby-lex.rb#L220-L267

Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example:

https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/if_spec.rb#L323
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/numbered_parameters_spec.rb#L31
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/pattern_matching_spec.rb#L210
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/rescue_spec.rb#L262
https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/language/yield_spec.rb#L196

It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best.

I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling.

Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them.

This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing.

-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

* [ruby-core:115615] [Ruby master Feature#20024] SyntaxError subclasses
From: yui-knk (Kaneko Yuichiro) via ruby-core @ 2023-12-07  2:58 UTC
  To: ruby-core; +Cc: yui-knk (Kaneko Yuichiro)

Issue #20024 has been updated by yui-knk (Kaneko Yuichiro).

For the [irb use cases](https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/lib/irb/ruby-lex.rb#L220-L267), irb wants to know whether additional input could turn the current invalid input into valid code.
For example, `end; def m` is not recoverable with additional input, because no tokens added afterwards can fix the leading `end;`. In that case irb raises a "syntax error" for the input.
On the other hand, `def m` is recoverable with additional input, so irb waits for the next line.
To remove irb's dependency on syntax error messages, the parser needs to provide information about how it could recover from each error.
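
A rough sketch of what this could look like from the REPL's side, assuming the parser attached a per-error flag. `ParseError`, its `recoverable` field, `next_action`, and the message strings are all hypothetical placeholders, not an existing irb or parser API.

```ruby
# Hypothetical shape of the information a parser could attach to each error.
ParseError = Struct.new(:message, :recoverable, keyword_init: true)

# Decide what the REPL should do from the flags alone, never from message text.
def next_action(errors)
  return :evaluate if errors.empty?

  errors.all?(&:recoverable) ? :read_more_input : :report_syntax_error
end

# `def m`: the only problem is the missing `end`, which more input can supply.
open_def = [ParseError.new(message: "placeholder: missing end", recoverable: true)]

# `end; def m`: the stray `end` cannot be fixed by appending tokens.
stray_end = [
  ParseError.new(message: "placeholder: unexpected end", recoverable: false),
  ParseError.new(message: "placeholder: missing end", recoverable: true)
]

puts next_action(open_def)
puts next_action(stray_end)
```

With a flag like this, the message text becomes purely informational and can be reworded freely.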

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-105546

* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------


* [ruby-core:116162] [Ruby master Feature#20024] SyntaxError subclasses
From: palkan (Vladimir Dementyev) via ruby-core @ 2024-01-10 22:46 UTC
  To: ruby-core; +Cc: palkan (Vladimir Dementyev)

Issue #20024 has been updated by palkan (Vladimir Dementyev).

kddnewton (Kevin Newton) wrote in #note-14:

> In this case we would need to design the shape of a `SyntaxError::Diagnostic` class. I would imagine it would have the same fields that @yui-knk is proposing, something like:
>  ...

I think we can take some inspiration from Parser: [diagnostic](https://github.com/ruby-next/parser/blob/master/lib/parser/diagnostic.rb) ([messages](https://github.com/ruby-next/parser/blob/master/lib/parser/messages.rb), and there are a lot of them, btw). For example, it also includes _level_ information (error, fatal, warn), which is similar to the "recoverable" property proposed above but more granular.
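
To make the "more granular" point concrete, here is a minimal sketch of a diagnostic carrying a level. It is not Parser's implementation: `SketchDiagnostic`, the severity ordering of the levels, and the placeholder messages are assumptions made for illustration.

```ruby
# Hypothetical diagnostic object; neither the class nor its fields exist in Ruby today.
class SketchDiagnostic
  # Assumed ordering, from least to most severe.
  LEVELS = %i[warn error fatal].freeze

  attr_reader :level, :message

  def initialize(level, message)
    raise ArgumentError, "unknown level: #{level.inspect}" unless LEVELS.include?(level)

    @level = level
    @message = message
  end

  # True when this diagnostic is at least as severe as the given threshold level.
  def at_least?(threshold)
    LEVELS.index(level) >= LEVELS.index(threshold)
  end
end

diagnostics = [
  SketchDiagnostic.new(:warn, "placeholder warning"),
  SketchDiagnostic.new(:error, "placeholder error")
]

# A tool could continue past warnings and stop only on errors or worse.
blocking = diagnostics.select { |d| d.at_least?(:error) }
puts blocking.map(&:level).inspect
```

A boolean "recoverable" flag would then be one possible threshold over levels like these rather than a separate concept.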

----------------------------------------
Feature #20024: SyntaxError subclasses
https://bugs.ruby-lang.org/issues/20024#change-106168

* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------


Thread overview: 17+ messages
2023-11-27 20:16 [ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses kddnewton (Kevin Newton) via ruby-core
2023-11-28 10:28 ` [ruby-core:115503] " Eregon (Benoit Daloze) via ruby-core
2023-11-28 13:35 ` [ruby-core:115506] " kddnewton (Kevin Newton) via ruby-core
2023-11-28 13:44 ` [ruby-core:115507] " byroot (Jean Boussier) via ruby-core
2023-11-28 15:19 ` [ruby-core:115508] " Eregon (Benoit Daloze) via ruby-core
2023-11-28 15:27 ` [ruby-core:115509] " Dan0042 (Daniel DeLorme) via ruby-core
2023-11-28 17:11 ` [ruby-core:115510] " kddnewton (Kevin Newton) via ruby-core
2023-11-28 17:17 ` [ruby-core:115511] " zverok (Victor Shepelev) via ruby-core
2023-11-28 17:23 ` [ruby-core:115513] " jeremyevans0 (Jeremy Evans) via ruby-core
2023-11-28 17:25 ` [ruby-core:115515] " kddnewton (Kevin Newton) via ruby-core
2023-11-28 17:31 ` [ruby-core:115516] " zverok (Victor Shepelev) via ruby-core
2023-11-28 22:41 ` [ruby-core:115527] " Dan0042 (Daniel DeLorme) via ruby-core
2023-11-30  3:55 ` [ruby-core:115533] " yui-knk (Kaneko Yuichiro) via ruby-core
2023-11-30  8:50 ` [ruby-core:115544] " ioquatix (Samuel Williams) via ruby-core
2023-12-04 17:23 ` [ruby-core:115586] " kddnewton (Kevin Newton) via ruby-core
2023-12-07  2:58 ` [ruby-core:115615] " yui-knk (Kaneko Yuichiro) via ruby-core
2024-01-10 22:46 ` [ruby-core:116162] " palkan (Vladimir Dementyev) via ruby-core
