ruby-core@ruby-lang.org archive (unofficial mirror)
From: eregontp@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:103176] [Ruby master Feature#17771] String#start_with? should not construct MatchData or set $~
Date: Fri, 02 Apr 2021 10:28:02 +0000 (UTC)
Message-ID: <redmine.journal-91251.20210402102802.286@ruby-lang.org>
In-Reply-To: redmine.issue-17771.20210401180802.286@ruby-lang.org

Issue #17771 has been updated by Eregon (Benoit Daloze).


I don't think there is a rule that predicate methods only return a boolean and never set `$~`.
It is the case for `String#match` vs `String#match?`, but it doesn't mean it holds for other Regexp methods.
I see it a bit like the use of `!`, which in the core library is generally only used when there is also a non-`!` variant (e.g., `Array#delete` mutates its receiver but has no `!`, since there is no non-mutating `delete`).

`String#start_with?` makes it possible to match a regexp at the start of a string without manually building another regexp like `/\A#{regexp}/` (from the user's point of view; internally there might be caching, depending on the regexp engine), so I think using `start_with?` and accessing the MatchData afterwards is a valid use case.
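
A minimal sketch of the two approaches being compared, in case it helps. `anchored_match` is a hypothetical helper name, not a core method: it builds the `\A`-anchored regexp by hand and returns the MatchData explicitly, so it does not depend on whether `start_with?` sets `$~`.

```ruby
# Hypothetical helper (not part of core): anchor an arbitrary regexp at the
# start of the string and return the MatchData, or nil if there is no match.
def anchored_match(string, regexp)
  /\A#{regexp}/.match(string)
end

re  = /(\w)\w+/
str = "test string"

# Manual approach: interpolating a Regexp keeps its capture groups intact.
m = anchored_match(str, re)
first_char = m[1]

# start_with? approach: same anchoring, no second regexp to build.
starts = str.start_with?(re)

# According to this thread, Ruby 3.0 also leaves the MatchData in $~ here,
# which is the behavior under discussion; other versions may differ.
backref = $~
```

Code that needs the captures on every Ruby version can keep the MatchData returned by the helper instead of reading `$~` after `start_with?`.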

StringScanner has similar functionality for matching a regexp from the start, as if there were a `\A`, but does not expose `$~` directly:
`ruby -rstrscan -e 's = StringScanner.new("test string"); s.scan(/(\w)\w+/); p s[1]'` => `"t"`.
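
The one-liner above can be expanded into a short script. This is only a sketch of the anchoring behavior: `scan` matches at the current scan position and does not search ahead, and the captures are read from the scanner itself rather than from `$~`.

```ruby
require "strscan"

s = StringScanner.new("test string")

word  = s.scan(/(\w)\w+/)  # matches only at the current position (0)
first = s[1]               # capture group 1 of the last match
pos   = s.pos              # scan pointer after the match

# The scan pointer now sits on the space, so the same pattern fails here:
# scan never skips ahead to find a later match.
miss = s.scan(/(\w)\w+/)

s.skip(/\s+/)              # advance past the whitespace
word2  = s.scan(/(\w)\w+/)
first2 = s[1]
```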

That said, I'm not against no longer setting `$~` in `String#start_with?`, but I do worry about compatibility, especially since it might be quite hard to debug why `$~` is suddenly `nil`, or still holds the previous MatchData, on the Ruby version that changes this behavior.

----------------------------------------
Feature #17771: String#start_with? should not construct MatchData or set $~
https://bugs.ruby-lang.org/issues/17771#change-91251

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
----------------------------------------
I am working on making $~ more thread-safe in JRuby and came across this unexpected behavior:

```ruby
$ rvm ruby-3.0 do ruby -e '"foo".start_with?(/foo/); p $~'
#<MatchData "foo">
```

The `start_with?` method was added 11 years ago in https://bugs.ruby-lang.org/issues/3388, but I do not think setting $~ was an intended feature. The `start_with?` method could be much faster and more thread-safe if it did not use the frame-local backref slot and did not allocate a MatchData.

Compare with `match?` which was added specifically (without MatchData or backref setting) to provide a fast way to check if a Regexp matches.
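
For reference, a small sketch of the `match?` behavior referred to here. The two helper method names are made up for illustration; each runs in its own method frame, so `$~` starts out as `nil` in both.

```ruby
# Illustrative helpers (names are hypothetical). $~ is frame-local, so each
# method call begins with $~ == nil.
def backref_after_predicate
  "foo".match?(/foo/)  # boolean result only, no MatchData allocated
  $~
end

def backref_after_match
  "foo".match(/foo/)   # allocates a MatchData and stores it in $~
  $~
end

p backref_after_predicate
p backref_after_match
```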

I propose that `start_with?` stop constructing MatchData, stop setting backref, and provide only its boolean result in the same way as `match?`.



-- 
https://bugs.ruby-lang.org/

Thread overview: 7+ messages
2021-04-01 18:08 [ruby-core:103153] [Ruby master Bug#17771] String#start_with? should not construct MatchData or set $~ headius
2021-04-01 18:13 ` [ruby-core:103154] " headius
2021-04-01 18:25 ` [ruby-core:103155] " headius
2021-04-01 19:04 ` [ruby-core:103156] [Ruby master Feature#17771] " tom.enebo
2021-04-01 19:06 ` [ruby-core:103157] " headius
2021-04-02 10:28 ` eregontp [this message]
2021-04-02 14:49 ` [ruby-core:103187] " marcandre-ruby-core
