ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:92972] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
@ 2019-06-05  7:27 ` kimmo.lehto
  2019-06-05  8:02 ` [ruby-core:92973] " sawadatsuyoshi
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: kimmo.lehto @ 2019-06-05  7:27 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been reported by kke (Kimmo Lehto).

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899

* Author: kke (Kimmo Lehto)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
There seem to be no methods for getting a substring before or after a marker.

Too often I see and have to resort to variations of:

``` ruby
str[/(.+?);/, 1]
str.split(';').first
substr, _ = str.split(';', 2)
str.sub(/.*;/, '')
str[0...str.index(';')]
```

These create intermediate objects and/or are ugly.

`String#delete_suffix` and `String#delete_prefix` do not accept regexps and thus can only be used if you first figure out the full prefix or suffix.

For this reason, I suggest something like:

``` ruby
> str = 'application/json; charset=utf-8'
> str.before(';')
=> "application/json"
> str.after(';')
=> " charset=utf-8"
```

What should happen if the marker isn't found? In my opinion, `before` should return the full string and `after` an empty string. 
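
A minimal sketch of the proposed semantics, written as plain module functions rather than a patch to `String`. The module name `StringMarkers` is hypothetical, and the not-found behavior follows the opinion stated above (`before` returns the whole string, `after` an empty string); none of this is an existing Ruby API.

```ruby
# Hypothetical helpers; Ruby has no String#before / String#after.
module StringMarkers
  module_function

  # Text preceding the first match of marker (String or Regexp).
  # Returns a copy of the whole string when the marker is absent.
  def before(str, marker)
    i = str.index(marker)
    i ? str[0, i] : str.dup
  end

  # Text following the first match of marker.
  # Returns an empty string when the marker is absent.
  def after(str, marker)
    if marker.is_a?(Regexp)
      m = str.match(marker)
      m ? m.post_match : ''
    else
      i = str.index(marker)
      i ? str[(i + marker.length)..-1] : ''
    end
  end
end

str = 'application/json; charset=utf-8'
StringMarkers.before(str, ';')   # => "application/json"
StringMarkers.after(str, ';')    # => " charset=utf-8"
StringMarkers.after(str, /;\s*/) # => "charset=utf-8"
StringMarkers.before(str, '|')   # => "application/json; charset=utf-8"
StringMarkers.after(str, '|')    # => ""
```

Each call allocates only the returned string, which is the point of the proposal; the Regexp branch additionally creates a `MatchData`.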





-- 
https://bugs.ruby-lang.org/


* [ruby-core:92973] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
  2019-06-05  7:27 ` [ruby-core:92972] [Ruby trunk Feature#15899] String#before and String#after kimmo.lehto
@ 2019-06-05  8:02 ` sawadatsuyoshi
  2019-06-05  8:06 ` [ruby-core:92974] " sawadatsuyoshi
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-06-05  8:02 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by sawa (Tsuyoshi Sawada).


Since you mention as a weak point that `String#delete_suffix` and `String#delete_prefix` do not accept regexps, it would be better to use regexps in the examples illustrating your proposal.

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78350


* [ruby-core:92974] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
  2019-06-05  7:27 ` [ruby-core:92972] [Ruby trunk Feature#15899] String#before and String#after kimmo.lehto
  2019-06-05  8:02 ` [ruby-core:92973] " sawadatsuyoshi
@ 2019-06-05  8:06 ` sawadatsuyoshi
  2019-06-05  9:12 ` [ruby-core:92976] " shevegen
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-06-05  8:06 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by sawa (Tsuyoshi Sawada).


Using `partition` looks reasonable, and it can accept regexes.

```ruby
str = 'application/json; charset=utf-8'
before, _, after = str.partition(/; /)
before # => "application/json"
after # => "charset=utf-8"
```

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78351


* [ruby-core:92976] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2019-06-05  8:06 ` [ruby-core:92974] " sawadatsuyoshi
@ 2019-06-05  9:12 ` shevegen
  2019-06-06  7:00 ` [ruby-core:92995] " kimmo.lehto
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: shevegen @ 2019-06-05  9:12 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by shevegen (Robert A. Heiler).


I can see where it may be useful, since it could shorten code like this:
    
    first_part = "hello world!".split(' ').first

To:

    first_part = "hello world!".before(' ')

It is not a huge improvement in my opinion, though. (My comment here has
not yet addressed the other part about using regexes - see a bit later for
that.)

I am not a big fan of the names, though. I somehow associate #before and #after
more with time-based operations; and rack/sinatra middleware (route) filters.

I do not have a better or alternative suggestion, although since we already have
delete_prefix, perhaps we could have some methods that return the desired prefix
instead (or suffix).

As for lack of regex support, I think sawa already pointed out that it may be
better to reason for changing delete_prefix and delete_suffix instead. That way
your demonstrated use case could be simplified as well.

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78353


* [ruby-core:92995] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2019-06-05  9:12 ` [ruby-core:92976] " shevegen
@ 2019-06-06  7:00 ` kimmo.lehto
  2019-06-14  7:30 ` [ruby-core:93132] " kimmo.lehto
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: kimmo.lehto @ 2019-06-06  7:00 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by kke (Kimmo Lehto).


> Using partition looks reasonable, and it can accept regexes.

It also has the problem of creating extra objects that you need to discard with `_` or assign and just leave unused.

> I am not a big fan of the names, though. I somehow associate #before and #after
> more with time-based operations; and rack/sinatra middleware (route) filters.

How about `str.preceding(';')` and `str.following(';')`? 

Perhaps `str.prior_to(';')` and `str.behind(';')`?

Possibility of opposite reading direction can make these problematic.

`str.left_from(';')`, `str.right_from(';')`? Sounds a bit clunky.

Head and tail could be the unixy choice and more versatile for other use cases.

```ruby
class String
  def head(count = 10, separator = "\n")
    ...
  end

  def tail(count = 10, separator = "\n")
    ...
  end
end
```

For my example use case, it would become:


```ruby
str = "application/json; charset=utf-8"
mime = str.head(1, ';')
labels = str.tail(1, ';')
```

And to emulate something like `$ curl xttp://x.example.com | head` you would use `response.body.head`
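
A rough sketch of how such `head` and `tail` could behave, again as standalone functions under a hypothetical `HeadTail` module. It assumes the separator is a String and favors clarity over the allocation savings this proposal is after, since it goes through `split` and `join`.

```ruby
# Hypothetical helpers modelled on the Unix head/tail idea; not an existing API.
module HeadTail
  module_function

  # First `count` separator-delimited fields, rejoined with the separator.
  def head(str, count = 10, separator = "\n")
    str.split(separator).first(count).join(separator)
  end

  # Last `count` separator-delimited fields, rejoined with the separator.
  def tail(str, count = 10, separator = "\n")
    str.split(separator).last(count).join(separator)
  end
end

str = 'application/json; charset=utf-8'
HeadTail.head(str, 1, ';')  # => "application/json"
HeadTail.tail(str, 1, ';')  # => " charset=utf-8"
HeadTail.head("a\nb\nc", 2) # => "a\nb"
```

Note that `split` drops trailing empty fields and treats a single-space separator as "any whitespace", so a real implementation would need to decide whether to keep those behaviors.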


----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78373


* [ruby-core:93132] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2019-06-06  7:00 ` [ruby-core:92995] " kimmo.lehto
@ 2019-06-14  7:30 ` kimmo.lehto
  2019-06-14 14:54 ` [ruby-core:93143] " ruby-core
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: kimmo.lehto @ 2019-06-14  7:30 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by kke (Kimmo Lehto).


How about `first` and `last`?

```ruby
'hello world'.first(2)
 => 'he'
'hello world'.last(2)
 => 'ld'
'hello world'.first
 => 'h'
'hello world'.last
 => 'd'
'hello world'.first(1, ' ')
 => 'hello'
'hello world'.last(1, ' ')
 => 'world'
'application/json; charset=utf-8'.first(1, ';')
 => 'application/json'
```


----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78561


* [ruby-core:93143] [Ruby trunk Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2019-06-14  7:30 ` [ruby-core:93132] " kimmo.lehto
@ 2019-06-14 14:54 ` ruby-core
  2019-07-09 18:34 ` [ruby-core:93645] [Ruby master " eddm
  2019-11-04 20:57 ` [ruby-core:95677] " jonathan
  8 siblings, 0 replies; 9+ messages in thread
From: ruby-core @ 2019-06-14 14:54 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by marcandre (Marc-Andre Lafortune).


sawa is right. Just use `partition` and `rpartition`.
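
For reference, a short illustration of how the built-in `partition` and `rpartition` cover the `before`/`after` cases, including what they return when the separator is missing. The sample strings are illustrative only.

```ruby
str = 'application/json; charset=utf-8'

# partition splits around the FIRST match: [before, match, after].
head, _sep, tail = str.partition(';')
head # => "application/json"
tail # => " charset=utf-8"

# rpartition splits around the LAST match.
dir, _sep, file = 'a/b/c.txt'.rpartition('/')
dir  # => "a/b"
file # => "c.txt"

# When the separator is absent, partition keeps the string in the first slot
# and rpartition keeps it in the last slot.
'abc'.partition('x')  # => ["abc", "", ""]
'abc'.rpartition('x') # => ["", "", "abc"]
```

The not-found result of `partition` happens to match the semantics suggested in the original report: the "before" part is the whole string and the "after" part is empty.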

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-78571


* [ruby-core:93645] [Ruby master Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2019-06-14 14:54 ` [ruby-core:93143] " ruby-core
@ 2019-07-09 18:34 ` eddm
  2019-11-04 20:57 ` [ruby-core:95677] " jonathan
  8 siblings, 0 replies; 9+ messages in thread
From: eddm @ 2019-07-09 18:34 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by edd314159 (Edd Morgan).

File 2269.diff added
File test.rb added
File test_mem.rb added

I'd like to add my +1 to this idea. Splitting a string by a substring (and only caring about the first result) is a use case I run into all the time. In fact, the example given by @kke of splitting a `Content-Type` HTTP header by the semicolon is the one I needed it for most recently.

It's true, `partition` and `rpartition` can absolutely achieve the same thing. But they have the side effect of returning (and, of course, allocating) extra String objects that are frequently discarded. This not only negatively impacts performance, but results in less readable code: we have to resort to the convention of prefixing the throwaway variable name with an underscore. This underscore is a convention agreed upon, informally, by humans to indicate the irrelevance of the variable, and I'm sure many Ruby programmers are unaware of the convention, or simply forget about it.

I have suggested an implementation in PR #2269 on GitHub: https://github.com/ruby/ruby/pull/2269

I also attach the following benchmark to show that when these new methods are used for this use case, performance is ~30% improved for splitting by a String (and more so when splitting by Regex):

```
eddmorgan@eddbook ~/Projects/rubydev/build → make run

../ruby/revision.h unchanged
./miniruby -I../ruby/lib -I. -I.ext/common   ../ruby/test.rb
                       user     system      total        real
String#before      0.182367   0.000587   0.182954 (  0.183625)
String#partition   0.303105   0.000877   0.303982 (  0.304961)
                       user     system      total        real
String#after       0.199295   0.000672   0.199967 (  0.200794)
String#partition   0.302300   0.001409   0.303709 (  0.305278)
```
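
The allocation difference can also be observed on an unpatched Ruby. The sketch below is not the attached `test.rb` or `test_mem.rb`; it is a hypothetical comparison that emulates `before` with `index` plus a slice and counts allocated objects through CRuby's `GC.stat`. All method and constant names are made up for illustration.

```ruby
# Counts objects allocated while the block runs (CRuby-specific GC.stat key).
def count_allocations
  start = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - start
end

SAMPLE = 'application/json; charset=utf-8'.freeze
ITERATIONS = 10_000

# partition builds a 3-element array plus three strings on every call.
def before_via_partition(str, sep)
  str.partition(sep).first
end

# index + slice builds only the returned substring.
def before_via_index(str, sep)
  i = str.index(sep)
  i ? str[0, i] : str
end

partition_allocs = count_allocations { ITERATIONS.times { before_via_partition(SAMPLE, ';') } }
index_allocs     = count_allocations { ITERATIONS.times { before_via_index(SAMPLE, ';') } }

puts "partition: #{partition_allocs} objects"
puts "index:     #{index_allocs} objects"
```

Exact counts depend on the Ruby version and on whether string literals are frozen, so only the relative difference is meaningful.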

----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-79253


---Files--------------------------------
test.rb (712 Bytes)
test_mem.rb (326 Bytes)
2269.diff (3.77 KB)



* [ruby-core:95677] [Ruby master Feature#15899] String#before and String#after
       [not found] <redmine.issue-15899.20190605072723@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2019-07-09 18:34 ` [ruby-core:93645] [Ruby master " eddm
@ 2019-11-04 20:57 ` jonathan
  8 siblings, 0 replies; 9+ messages in thread
From: jonathan @ 2019-11-04 20:57 UTC (permalink / raw)
  To: ruby-core

Issue #15899 has been updated by jonathanhefner (Jonathan Hefner).


I use monkey-patched versions of these in many of my Ruby scripts.  They have a few benefits vs. the alternatives:

* vs. `split` + `first` / `last`
  * using `split` can cause an unintended result when the delimiter is not present, e.g. `"abc".split("x", 2).last == "abc"`
* vs. `partition`
  * `before` and `after` can be chained, and can result in fewer object allocations
* vs. regex + capture group
  * `before` and `after` are easier to read (and write)

I've also found [`before_last`](https://www.rubydoc.info/gems/casual_support/String:before_last) and [`after_last`](https://www.rubydoc.info/gems/casual_support/String:after_last) helpful for similar reasons.

kke (Kimmo Lehto) wrote:

> What should happen if the marker isn't found? In my opinion, `before` should return the full string and `after` an empty string.

Regarding `before`, I agree.

Regarding `after`, I originally wrote my monkey-patched `after` to return an empty string, but eventually changed it to return nil.  I was hesitant because a nil result can be an unexpected "gotcha", but an empty string seems wrong because it throws away information.  For example, if `str.after("x") == ""`, it might be because the delimiter wasn't found, or because the delimiter was at the end of the string.  (Compared to `str.before("x") == str`, which always means the delimiter wasn't found.)
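
To make the ambiguity concrete, here is a small sketch of the nil-on-miss variant. The helper name `after_or_nil` is hypothetical and it assumes a String delimiter; it is not the code from my scripts.

```ruby
# Hypothetical helper illustrating the nil-on-miss variant; not an existing API.
def after_or_nil(str, marker)
  i = str.index(marker)
  i && str[(i + marker.length)..-1]
end

after_or_nil('abx', 'x') # => ""  (delimiter found at the very end)
after_or_nil('ab', 'x')  # => nil (delimiter not found)

# With the empty-string convention the two cases become indistinguishable:
(after_or_nil('abx', 'x') || '') == (after_or_nil('ab', 'x') || '') # => true
```

Callers who prefer the empty-string behavior can still get it with `|| ''`, while the reverse conversion is not possible once the information has been discarded.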


----------------------------------------
Feature #15899: String#before and String#after
https://bugs.ruby-lang.org/issues/15899#change-82464

