ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
@ 2024-02-15  8:45 usa (Usaku NAKAMURA) via ruby-core
  2024-02-15  9:05 ` [ruby-core:116771] " mrkn (Kenta Murata) via ruby-core
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: usa (Usaku NAKAMURA) via ruby-core @ 2024-02-15  8:45 UTC
  To: ruby-core; +Cc: usa (Usaku NAKAMURA)

Issue #20266 has been reported by usa (Usaku NAKAMURA).

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266

* Author: usa (Usaku NAKAMURA)
* Status: Open
* Priority: Normal
----------------------------------------
# Premise

When a string is embedded in a Regexp literal, it is interpreted as part of the Regexp.

```ruby
foo = "[a-z]"
p /#{foo}/ #=> /[a-z]/
```
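As a small illustration of what "interpreted as part of the Regexp" means in practice, the sketch below anchors the pattern so the effect is easy to see. The helper name `anchored` is hypothetical and exists only for this example.

```ruby
# Hypothetical helper for illustration: embeds str without any escaping.
def anchored(str)
  /\A#{str}\z/               # str is parsed as regexp syntax
end

foo = "[a-z]"
p anchored(foo).match?("q")       # => true  (foo acts as a character class)
p anchored(foo).match?("[a-z]")   # => false (the literal text does not match)
```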

So, currently we often have to escape embedded strings.

```ruby
foo = "[a-z]"
p /#{Regexp.quote(foo)}/ #=> /\[a\-z\]/
```

This is very long and painful to write every time.
So, I propose a new syntax to escape embedded strings automatically.

# Proposal

Add a new token `#{=` in Regexp literals:

```ruby
foo = "[a-z]"
p /#{=foo}/ #=> /\[a\-z\]/
```

When `#{=` is used instead of `#{`, Ruby calls `Regexp.quote` internally.
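Since `#{=` does not parse on current Ruby, the intended behavior can only be approximated today. The following is a minimal sketch, assuming the proposal is equivalent to wrapping the embedded expression in `Regexp.quote`; the helper names `q` and `versioned` are hypothetical and not part of the proposal. It shows that only the embedded part is escaped, while the rest of the literal keeps its regexp meaning.

```ruby
# Hypothetical shorthand standing in for the proposed `#{=expr}`.
def q(str)
  Regexp.quote(str)
end

# Hypothetical example: a literal prefix followed by one or more digits.
def versioned(prefix)
  /\A#{q(prefix)}\d+\z/      # prefix is literal, \d+ is still regexp syntax
end

p versioned("v1.").source            # the dot in the prefix is escaped
p versioned("v1.").match?("v1.23")   # => true
p versioned("v1.").match?("v1x23")   # => false
```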

# Compatibility

Current Ruby raises a syntax error when `#{=` is used, so there is no incompatibility.

# Out of scope of this proposal

I do not propose `#{=` for other literals.  They are out of scope of this proposal.



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/


* [ruby-core:116771] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
@ 2024-02-15  9:05 ` mrkn (Kenta Murata) via ruby-core
  2024-02-15  9:32 ` [ruby-core:116774] " nobu (Nobuyoshi Nakada) via ruby-core
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: mrkn (Kenta Murata) via ruby-core @ 2024-02-15  9:05 UTC
  To: ruby-core; +Cc: mrkn (Kenta Murata)

Issue #20266 has been updated by mrkn (Kenta Murata).


I agree with this proposal.  Even if Ruby enabled the `\Q` and `\E` features in Onigmo, they would not work as expected if the embedded string contains `\E`.  Therefore, it would be better for Ruby to have a short syntax for `#{Regexp.quote(str)}`.
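To make the `\E` problem concrete, here is a rough sketch. Ruby does not interpret `\Q` and `\E` by default, so the code only builds and inspects the pattern string that an engine with those escapes enabled would receive. The helpers `q_wrap` and `after_first_e` are hypothetical names for this illustration.

```ruby
# Hypothetical helper: wraps text in \Q...\E without inspecting it.
def q_wrap(str)
  "\\Q" + str + "\\E"
end

# Returns what follows the first \E in the pattern, i.e. the part that
# an engine honoring \Q...\E would parse as ordinary regexp syntax again.
def after_first_e(pattern)
  pattern.split("\\E", 2)[1]
end

user = 'a\Eb.c'                    # embedded text that itself contains \E
p after_first_e(q_wrap(user))      # "b.c" plus the trailing \E is left unquoted
p Regexp.quote(user)               # the backslash and the dot are both escaped
p Regexp.new(Regexp.quote(user)).match?(user)   # => true
```

`Regexp.quote` avoids the issue because it escapes the backslash itself, so no terminator sequence can appear in the escaped text.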

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106794


* [ruby-core:116774] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
  2024-02-15  9:05 ` [ruby-core:116771] " mrkn (Kenta Murata) via ruby-core
@ 2024-02-15  9:32 ` nobu (Nobuyoshi Nakada) via ruby-core
  2024-02-15  9:45 ` [ruby-core:116775] " knu (Akinori MUSHA) via ruby-core
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2024-02-15  9:32 UTC
  To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)

Issue #20266 has been updated by nobu (Nobuyoshi Nakada).


https://github.com/nobu/ruby/tree/quoting-interpolation

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106797


* [ruby-core:116775] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
  2024-02-15  9:05 ` [ruby-core:116771] " mrkn (Kenta Murata) via ruby-core
  2024-02-15  9:32 ` [ruby-core:116774] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2024-02-15  9:45 ` knu (Akinori MUSHA) via ruby-core
  2024-02-15 23:05 ` [ruby-core:116786] " shan (Shannon Skipper) via ruby-core
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: knu (Akinori MUSHA) via ruby-core @ 2024-02-15  9:45 UTC
  To: ruby-core; +Cc: knu (Akinori MUSHA)

Issue #20266 has been updated by knu (Akinori MUSHA).


I was also part of the discussion circle regarding this idea.  The lack of support for easily escaping a string for regular expressions has led users to often omit it when it seems obvious that a string does not need escaping (for example, when it is alphanumeric) or when it "looks" practically okay to do so.  However, omitting escaping for something like a domain name could potentially create a vulnerability since the dot is a meta character.

Consider the scenario where the variable `hostname` is set to `"example.co.jp"`.  In the expression `%r{\Ahttps://#{hostname}/}.match?(callback_url)`, where the necessary escaping is omitted, it unintentionally matches `"https://example.co/jp/…"`, which is a URL under a completely different domain.
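The scenario can be reproduced with a short sketch. The method names `unsafe_callback?` and `safe_callback?` and the concrete URLs are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical check that omits escaping: each "." matches any character.
def unsafe_callback?(hostname, url)
  %r{\Ahttps://#{hostname}/}.match?(url)
end

# Same check with the hostname escaped, so "." only matches a literal dot.
def safe_callback?(hostname, url)
  %r{\Ahttps://#{Regexp.quote(hostname)}/}.match?(url)
end

hostname = "example.co.jp"
good = "https://example.co.jp/callback"   # hypothetical URL on the intended host
evil = "https://example.co/jp/callback"   # hypothetical URL on a different host

p unsafe_callback?(hostname, good)   # => true
p unsafe_callback?(hostname, evil)   # => true  (second "." matched "/")
p safe_callback?(hostname, good)     # => true
p safe_callback?(hostname, evil)     # => false
```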

That's why I believe it is necessary for Ruby to provide an easy and readable way to escape a string in interpolation.  It would help code reviewers and reviewees a lot if escaping cost just one character, whereas "Add Regexp.quote() here and here" can look scary and pedantic.

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106799


* [ruby-core:116786] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
                   ` (2 preceding siblings ...)
  2024-02-15  9:45 ` [ruby-core:116775] " knu (Akinori MUSHA) via ruby-core
@ 2024-02-15 23:05 ` shan (Shannon Skipper) via ruby-core
  2024-02-16  5:35 ` [ruby-core:116789] " rubyFeedback (robert heiler) via ruby-core
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: shan (Shannon Skipper) via ruby-core @ 2024-02-15 23:05 UTC
  To: ruby-core; +Cc: shan (Shannon Skipper)

Issue #20266 has been updated by shan (Shannon Skipper).


I wonder if `#{^foo}` might be a passable alternative for `#{=foo}`, since "pinning" *almost* makes sense and an uptick is less likely to actually be intended than an equals sign to start a quoted interpolation?

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106810


* [ruby-core:116789] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
                   ` (3 preceding siblings ...)
  2024-02-15 23:05 ` [ruby-core:116786] " shan (Shannon Skipper) via ruby-core
@ 2024-02-16  5:35 ` rubyFeedback (robert heiler) via ruby-core
  2024-02-16 19:36 ` [ruby-core:116806] " Dan0042 (Daniel DeLorme) via ruby-core
  2024-02-16 20:51 ` [ruby-core:116808] " matheusrich (Matheus Richard) via ruby-core
  6 siblings, 0 replies; 8+ messages in thread
From: rubyFeedback (robert heiler) via ruby-core @ 2024-02-16  5:35 UTC
  To: ruby-core; +Cc: rubyFeedback (robert heiler)

Issue #20266 has been updated by rubyFeedback (robert heiler).


I don't have any pro or con opinion on the feature itself; in regards to ^foo versus
=foo, I think users may wonder about both: ^ specifically because many regexes may
have it, such as /^foobar/, and with = they may assume some assignment to be made.
At the least that was my first impression when seeing it, perhaps inspired by erb.

I guess **IF** the rationale is that Regexp.quote(i) is too long to type, which seems a
reasonable statement, then it makes sense to use a shorthand syntax. But probably all
shorthand syntaxes here may not be "perfect". Remember the perl-inspired $ variables;
not everyone can remember them easily. (Unfortunately the longer $ named variables 
weren't a big improvement either.)

Just for the sake of completeness, as this was already discussed, could someone show
alternative syntax suggestions, if they were made? Just so we can more easily compare
the preferred variant over the other variants, e.g. two so far, even if one may be
"unofficial" by shan:

    #{^foo}
    #{=foo}

I'll also try shan's suggestion via the first one, as "side-by-side" comparison:

    foo = "[a-z]"
    p /#{=foo}/

    foo = "[a-z]"
    p /#{^foo}/

Hmm. And my initial thought of the second one used with leading ^

    p /^#{^foo}/

And for comparison the other one also with leading ^:

    p /^#{=foo}/

I think none of them will win any beauty contest, but it could still be
interesting for a comparison.

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106813


* [ruby-core:116806] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
                   ` (4 preceding siblings ...)
  2024-02-16  5:35 ` [ruby-core:116789] " rubyFeedback (robert heiler) via ruby-core
@ 2024-02-16 19:36 ` Dan0042 (Daniel DeLorme) via ruby-core
  2024-02-16 20:51 ` [ruby-core:116808] " matheusrich (Matheus Richard) via ruby-core
  6 siblings, 0 replies; 8+ messages in thread
From: Dan0042 (Daniel DeLorme) via ruby-core @ 2024-02-16 19:36 UTC
  To: ruby-core; +Cc: Dan0042 (Daniel DeLorme)

Issue #20266 has been updated by Dan0042 (Daniel DeLorme).


TBH I'm not entirely sure it's worth new syntax, but I've definitely felt the verbosity of `Regexp.escape` before, and I like how `#{= expr}` has similarity with erb's `<%= expr %>`.

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106829


* [ruby-core:116808] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal
  2024-02-15  8:45 [ruby-core:116769] [Ruby master Feature#20266] New syntax to escape embed strings in Regexp literal usa (Usaku NAKAMURA) via ruby-core
                   ` (5 preceding siblings ...)
  2024-02-16 19:36 ` [ruby-core:116806] " Dan0042 (Daniel DeLorme) via ruby-core
@ 2024-02-16 20:51 ` matheusrich (Matheus Richard) via ruby-core
  6 siblings, 0 replies; 8+ messages in thread
From: matheusrich (Matheus Richard) via ruby-core @ 2024-02-16 20:51 UTC
  To: ruby-core; +Cc: matheusrich (Matheus Richard)

Issue #20266 has been updated by matheusrich (Matheus Richard).


I wonder if this new syntax would open the doors to adding some kind of similar behavior to normal string interpolation too.

----------------------------------------
Feature #20266: New syntax to escape embed strings in Regexp literal
https://bugs.ruby-lang.org/issues/20266#change-106831

