ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~
@ 2021-04-01 18:08 headius
  2021-04-01 18:13 ` [ruby-core:103154] " headius
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: headius @ 2021-04-01 18:08 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been reported by headius (Charles Nutter).

----------------------------------------
Bug #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
I am working on making $~ more thread-safe in JRuby and came across this unexpected behavior:

```
$ rvm ruby-3.0 do ruby -e '"foo".start_with?(/foo/); p $~'
#<MatchData "foo">
```

The `start_with?` method was added 11 years ago in https://bugs.ruby-lang.org/issues/3388, but I do not think setting $~ was an intended feature. The `start_with?` method could be much faster and more thread-safe if it did not use the frame-local backref slot and did not allocate a MatchData.

Compare with `match?`, which was added specifically to provide a fast way to check whether a Regexp matches, without allocating a MatchData or setting the backref.

I propose that `start_with?` stop constructing MatchData, stop setting backref, and provide only its boolean result in the same way as `match?`.
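
To make the contrast concrete, here is a minimal sketch. The helper names are made up for illustration; each helper runs in its own method so that `$~` starts out as nil in that frame. What the second call prints depends on the Ruby version in use; on the ruby-3.0 run shown above it would be a MatchData.

```ruby
# Hypothetical helpers, only for illustration.
# $~ is frame-local, so it is nil when each method starts.

def backref_after_match_p(str, re)
  str.match?(re)        # documented not to update $~
  $~
end

def backref_after_start_with(str, re)
  str.start_with?(re)   # the behavior under discussion: may set $~
  $~
end

p backref_after_match_p("foo", /foo/)     # prints nil
p backref_after_start_with("foo", /foo/)  # version-dependent, see above
```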



-- 
https://bugs.ruby-lang.org/


* [ruby-core:103154] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
@ 2021-04-01 18:13 ` headius
  2021-04-01 18:25 ` [ruby-core:103155] " headius
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: headius @ 2021-04-01 18:13 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by headius (Charles Nutter).


I will also point out that this method, like many others, will *not* always set $~. If you pass a String, $~ remains whatever it was before:

```
$ rvm ruby-3.0 do ruby -e '"foo".start_with?("foo"); p $~'
nil
```

Avoiding the use of $~ would make this behavior consistent.
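
A small sketch of the "remains whatever it was before" case. The method name is hypothetical; it first performs an unrelated regexp match and then calls `start_with?` with a String argument.

```ruby
# Hypothetical helper, only for illustration.
def backref_after_string_prefix
  "bar" =~ /ba/             # a regexp match: sets $~ in this frame
  "foo".start_with?("foo")  # String argument: no regexp match is performed
  $~                        # still the MatchData from the earlier match
end

m = backref_after_string_prefix
p m[0]  # prints "ba"
```

So after the call, `$~` describes a match that has nothing to do with the `start_with?` check, which is easy to misread in code that accepts either argument type.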

----------------------------------------
Bug #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91226

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------

* [ruby-core:103155] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
  2021-04-01 18:13 ` [ruby-core:103154] " headius
@ 2021-04-01 18:25 ` headius
  2021-04-01 19:04 ` [ruby-core:103156] [Ruby master Feature#17771] " tom.enebo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: headius @ 2021-04-01 18:25 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by headius (Charles Nutter).


I see this behavior was explicitly blessed by matz in #13712 but I still believe this is not the best choice.

Around the same time as that discussion, another boolean query method `match?` was added that explicitly does *not* set the last match frame variable.

I feel this is inconsistent; the boolean query methods that accept a Regexp should be as fast as possible. If you want a MatchData, use methods that provide it.

----------------------------------------
Bug #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91227

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------

* [ruby-core:103156] [Ruby master Feature#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
  2021-04-01 18:13 ` [ruby-core:103154] " headius
  2021-04-01 18:25 ` [ruby-core:103155] " headius
@ 2021-04-01 19:04 ` tom.enebo
  2021-04-01 19:06 ` [ruby-core:103157] " headius
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: tom.enebo @ 2021-04-01 19:04 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by enebo (Thomas Enebo).


It really feels like an unintended side effect of the method. If you call this method with a variable, then depending on the type of that variable there either is some MatchData (MD) as a side effect or there isn't. This is inconsistent. If you want to use the MD explicitly, you have to know what you are supplying. If you know it is a regexp, then just writing `str =~ /^my_pat/` is what you want.

----------------------------------------
Feature #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91229

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:103157] [Ruby master Feature#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
                   ` (2 preceding siblings ...)
  2021-04-01 19:04 ` [ruby-core:103156] [Ruby master Feature#17771] " tom.enebo
@ 2021-04-01 19:06 ` headius
  2021-04-02 10:28 ` [ruby-core:103176] " eregontp
  2021-04-02 14:49 ` [ruby-core:103187] " marcandre-ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: headius @ 2021-04-01 19:06 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by headius (Charles Nutter).


An alternative to using `str =~ /^pat/` for a `start_with?` that provides a MatchData would be to add a `start_with` that is not a boolean query method.
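
A rough sketch of what such a non-boolean method could look like, written as a plain helper. The name `prefix_match` is an assumption for illustration and not an existing or proposed core API. It anchors the pattern with `\A` and returns the MatchData directly instead of going through `$~`.

```ruby
# Hypothetical helper, not part of Ruby core.
# Returns a MatchData if `pattern` matches at the very start of `str`,
# otherwise nil.
def prefix_match(str, pattern)
  anchored = /\A#{pattern}/  # interpolating a Regexp keeps its options
  anchored.match(str)
end

p prefix_match("foobar", /fo+/)[0]  # prints "foo"
p prefix_match("xfoo", /fo+/)       # prints nil
```

Callers that want the match get it as a return value, and callers that only want true/false can keep using a query method that has no side effect.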

----------------------------------------
Feature #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91230

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:103176] [Ruby master Feature#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
                   ` (3 preceding siblings ...)
  2021-04-01 19:06 ` [ruby-core:103157] " headius
@ 2021-04-02 10:28 ` eregontp
  2021-04-02 14:49 ` [ruby-core:103187] " marcandre-ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: eregontp @ 2021-04-02 10:28 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by Eregon (Benoit Daloze).


I don't think there is a rule that predicate methods only return a boolean and never set `$~`.
It is the case for `String#match` vs `String#match?`, but it doesn't mean it holds for other Regexp methods.
I see it a bit like the use of `!`, which in the core library is generally only used if there is also a non-`!` variant (e.g., `Array#delete`).

`String#start_with?` makes it possible to match a regexp at the start of a string without manually building another regexp like `/\A#{regexp}/` (from the user's point of view; there might be internal caching depending on the regexp engine), so I think using `start_with?` and accessing the MatchData afterwards is a valid use case.

StringScanner has a similar functionality for matching a regexp from the start, as if there was a `\A`, but does not expose `$~` directly:
`ruby -rstrscan -e 's = StringScanner.new("test string"); s.scan(/(\w)\w+/); p s[1]'` => `"t"`.
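
The same comparison spelled out as a short script. This is only a sketch: the variable names are made up, and it shows the anchored behavior of `StringScanner#scan` next to the manual `/\A#{regexp}/` form.

```ruby
require "strscan"

re = /(\w)\w+/

# StringScanner matches at the scan pointer, as if the regexp began with \A.
scanner = StringScanner.new("test string")
p scanner.scan(re)  # prints "test"
p scanner[1]        # prints "t" (first capture group of the last scan)

# A pattern that only matches later in the string does not match here.
p StringScanner.new("test string").scan(/string/)  # prints nil

# The manual equivalent: build an anchored regexp and use the MatchData.
md = /\A#{re}/.match("test string")
p md[1]             # prints "t"
```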

That said, I'm not against no longer setting $~ for String#start_with?, but I do worry about the compatibility issue here, especially since it might be quite hard to debug why $~ is suddenly `nil` or the previous MatchData in the Ruby version changing this behavior.

----------------------------------------
Feature #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91251

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------

* [ruby-core:103187] [Ruby master Feature#17771] String#start_with? should not construct MatchData or set $~
  2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
                   ` (4 preceding siblings ...)
  2021-04-02 10:28 ` [ruby-core:103176] " eregontp
@ 2021-04-02 14:49 ` marcandre-ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: marcandre-ruby-core @ 2021-04-02 14:49 UTC (permalink / raw)
  To: ruby-core

Issue #17771 has been updated by marcandre (Marc-Andre Lafortune).


I also believe it is unintended behavior and should be removed. 

----------------------------------------
Feature #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91262

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------
