ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:91256] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
@ 2019-01-25  7:19 ` sawadatsuyoshi
  2019-01-25  9:04 ` [ruby-core:91257] " zn
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-01-25  7:19 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been reported by sawa (Tsuyoshi Sawada).

----------------------------------------
Feature #15562: `String#split` option to suppress the initial empty substring
https://bugs.ruby-lang.org/issues/15562

* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
`String#split` returns an empty substring at the beginning of the result when the original string starts with the separator, even though it does not return an empty substring at the end of the original string:

```ruby
"aba".split("a") # => ["", "b"]
```

This behavior is probably inherited from Perl or AWK, and may have some use cases, but in some (if not most) use cases it looks asymmetric, and the initial empty string is unnatural. I propose adding an option to `String#split` to suppress it, perhaps like this (with `true` being the default):

```ruby
"aba".split("a", initial_empty_string: false) # => ["b"]
"aba".split("a", initial_empty_string: true) # => ["", "b"]
"aba".split("ba", initial_empty_string: true) # => ["a"]
```

This does not mean to suppress empty strings in the middle. So it should work like this:

```ruby
"aaaba".split("a", initial_empty_string: false) # => ["", "", "b"]
"aaaba".split("a", initial_empty_string: true) # => ["", "", "", "b"]
```

Or maybe we can go even further and control both the initial and the final ones, like this (with `:initial` being the default):

```ruby
"aba".split("a", terminal_empty_string: :none) # => ["b"]
"aba".split("a", terminal_empty_string: :initial) # => ["", "b"]
"aba".split("a", terminal_empty_string: :final) # => ["b", ""]
"aba".split("a", terminal_empty_string: :both) # => ["", "b", ""]
```
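
As a rough illustration of the four modes above, the proposed behavior can be emulated on top of the existing `String#split`. This is only a sketch: `split_terminal` is a hypothetical helper name, not an existing or planned API, and it assumes that `:none` and `:final` drop only the single empty string at the very beginning, as in the `"aaaba"` example.

```ruby
# Hypothetical helper, not part of Ruby. It emulates the proposed
# `terminal_empty_string:` option using the existing String#split.
def split_terminal(str, sep, terminal_empty_string: :initial)
  parts = str.split(sep, -1) # a negative limit keeps trailing empty strings
  keep_initial = %i[initial both].include?(terminal_empty_string)
  keep_final   = %i[final both].include?(terminal_empty_string)

  # Drop all trailing empty strings unless the mode asks to keep them.
  parts.pop while !keep_final && parts.last == ""
  # Drop only the one empty string at the very beginning (assumption).
  parts.shift if !keep_initial && parts.first == ""
  parts
end

p split_terminal("aba", "a", terminal_empty_string: :none)    # => ["b"]
p split_terminal("aba", "a", terminal_empty_string: :initial) # => ["", "b"]
p split_terminal("aba", "a", terminal_empty_string: :final)   # => ["b", ""]
p split_terminal("aba", "a", terminal_empty_string: :both)    # => ["", "b", ""]
```

With the default `:initial`, the helper returns the same result as a plain `str.split(sep)`.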




-- 
https://bugs.ruby-lang.org/


* [ruby-core:91257] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
  2019-01-25  7:19 ` [ruby-core:91256] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring sawadatsuyoshi
@ 2019-01-25  9:04 ` zn
  2019-01-25  9:51 ` [ruby-core:91258] " sawadatsuyoshi
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: zn @ 2019-01-25  9:04 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by znz (Kazuhiro NISHIYAMA).


`String#split` with `-1` does not remove empty strings.

```
>> "aba".split("a", -1)
=> ["", "b", ""]
>> "abaa".split("a", -1)
=> ["", "b", "", ""]
```
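
For comparison, a short sketch of how the limit argument affects trailing empty strings in current Ruby. Neither form removes the initial empty string.

```ruby
default_split = "aba".split("a")     # trailing empty strings are dropped
keep_all      = "aba".split("a", -1) # a negative limit keeps them

p default_split # => ["", "b"]
p keep_all      # => ["", "b", ""]
```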

----------------------------------------
Feature #15562: `String#split` option to suppress the initial empty substring
https://bugs.ruby-lang.org/issues/15562#change-76505

* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
`String#split` returns an empty substring at the beginning of the result when the original string starts with the separator, even though it does not return an empty substring at the end of the original string:

```ruby
"aba".split("a") # => ["", "b"]
```

This behavior is probably inherited from Perl or AWK, and may have some use cases, but in some (if not most) use cases it looks asymmetric, and the initial empty string is unnatural and often requires some additional code to remove it. I propose adding an option to `String#split` to suppress it, perhaps like this (with `true` being the default):

```ruby
"aba".split("a", initial_empty_string: false) # => ["b"]
"aba".split("a", initial_empty_string: true) # => ["", "b"]
"aba".split("ba", initial_empty_string: true) # => ["a"]
```

This does not mean to suppress empty strings in the middle. So it should work like this:

```ruby
"aaaba".split("a", initial_empty_string: false) # => ["", "", "b"]
"aaaba".split("a", initial_empty_string: true) # => ["", "", "", "b"]
```

Or maybe we can go even further and control both the initial and the final ones, like this (with `:initial` being the default):

```ruby
"aba".split("a", terminal_empty_string: :none) # => ["b"]
"aba".split("a", terminal_empty_string: :initial) # => ["", "b"]
"aba".split("a", terminal_empty_string: :final) # => ["b", ""]
"aba".split("a", terminal_empty_string: :both) # => ["", "b", ""]
```




-- 
https://bugs.ruby-lang.org/


* [ruby-core:91258] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
  2019-01-25  7:19 ` [ruby-core:91256] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring sawadatsuyoshi
  2019-01-25  9:04 ` [ruby-core:91257] " zn
@ 2019-01-25  9:51 ` sawadatsuyoshi
  2019-01-25 11:08 ` [ruby-core:91259] " sawadatsuyoshi
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-01-25  9:51 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by sawa (Tsuyoshi Sawada).


znz (Kazuhiro NISHIYAMA) wrote:
> `String#split` with `-1` does not remove empty strings.
> 
> ```
> >> "aba".split("a", -1)
> => ["", "b", ""]
> >> "abaa".split("a", -1)
> => ["", "b", "", ""]
> ```

I want `["b"]`.



* [ruby-core:91259] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (2 preceding siblings ...)
  2019-01-25  9:51 ` [ruby-core:91258] " sawadatsuyoshi
@ 2019-01-25 11:08 ` sawadatsuyoshi
  2019-01-25 11:33 ` [ruby-core:91260] " knu
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-01-25 11:08 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by sawa (Tsuyoshi Sawada).


A frequent use case for `split("a", initial_empty_string: false)` is a text like `text` below, from which we want to extract the paragraphs that follow each `SECTION`:

```ruby
text = <<~_
  SECTION
  Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam in massa eget mauris lobortis fermentum non in risus. Etiam sit amet dui et velit laoreet pulvinar. Donec convallis, nisi ut lobortis volutpat, est sapien bibendum ante, ac laoreet enim neque at nulla. Aliquam ex urna, porttitor nec mi vitae, suscipit lacinia diam. Maecenas semper, enim id eleifend viverra, lorem velit facilisis tellus, sit amet efficitur nulla nibh sit amet eros. Cras erat mauris, rutrum id mattis nec, auctor eu diam. Aenean mattis at nisl sit amet aliquam. Proin euismod hendrerit eros, quis rhoncus ipsum.

  SECTION
  Curabitur eget quam quis nulla lacinia dapibus ut quis mauris. Maecenas volutpat molestie pulvinar. Mauris porttitor semper arcu. Fusce congue tempor urna in suscipit. Duis a neque lacinia, consectetur elit id, ullamcorper neque. Morbi sit amet eleifend ipsum, sit amet porta libero. Mauris euismod ipsum sit amet ante porttitor consequat. Suspendisse malesuada nunc quis orci posuere dapibus. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Nulla quis massa ut tortor pulvinar egestas in ut nunc. Aenean vitae malesuada elit, nec posuere massa. Nullam risus ipsum, fermentum at fringilla eget, tincidunt nec ante. Pellentesque malesuada pulvinar bibendum. Cras massa erat, tristique vitae vehicula et, aliquet vestibulum magna.
_

text.split(/^SECTION\n/, initial_empty_string: false).map(&:strip)
```
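
Until such an option exists, the same result can be obtained in current Ruby by dropping the leading empty string after the split. A minimal sketch, using a shortened stand-in text (the `sample` variable and its content are made up for illustration):

```ruby
sample = <<~TEXT
  SECTION
  First paragraph.

  SECTION
  Second paragraph.
TEXT

# The text starts with the separator, so the raw split yields a leading "".
raw = sample.split(/^SECTION\n/)
paragraphs = raw.drop_while(&:empty?).map(&:strip)

p raw        # => ["", "First paragraph.\n\n", "Second paragraph.\n"]
p paragraphs # => ["First paragraph.", "Second paragraph."]
```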


* [ruby-core:91260] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (3 preceding siblings ...)
  2019-01-25 11:08 ` [ruby-core:91259] " sawadatsuyoshi
@ 2019-01-25 11:33 ` knu
  2019-01-25 12:06 ` [ruby-core:91261] " shevegen
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: knu @ 2019-01-25 11:33 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by knu (Akinori MUSHA).


Isn't the new option name too long?  I'd use `.drop_while(&:empty?)`.
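
A small sketch of that alternative in current Ruby. Note one difference from the proposal: `drop_while(&:empty?)` removes every leading empty string, whereas the proposed option, judging by the `"aaaba"` example in the issue, would remove only the first one.

```ruby
one  = "aba".split("a").drop_while(&:empty?)
many = "aaaba".split("a").drop_while(&:empty?)

p one  # => ["b"]
p many # => ["b"]  (the proposal would give ["", "", "b"])
```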


* [ruby-core:91261] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (4 preceding siblings ...)
  2019-01-25 11:33 ` [ruby-core:91260] " knu
@ 2019-01-25 12:06 ` shevegen
  2019-01-25 12:07 ` [ruby-core:91262] " shevegen
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: shevegen @ 2019-01-25 12:06 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by shevegen (Robert A. Heiler).


This reminds me a bit of Dir['*'] versus Dir.entries(Dir.pwd). The latter
also has . and .. entries such as:

    => ["foobar.md", "..", "."]

To me the . and .. entries were never useful. I ended up switching to Dir[]
consistently anyway so I don't see . or .., but I am bringing this example
because I agree with the statement by sawa about empty strings not being
too terribly useful as a result, if you may wish to work with it. Perhaps
it may be useful if you wish to .join on it again, but if you are only 
interested in non-empty results (or non-empty strings) then I think it may
be ok to have an additional way to return only the entries you are interested
in. Of course you can process the result on your own as-is, via .reject or
.select (or .filter), but it may be more convenient to simply pass in another
option to .split as second argument.

So from this point of view I agree with sawa, even though I personally probably
don't need this much at all (oddly enough I think almost all of the use cases
I personally have had, were left to pass only one argument to .split()).

The only adaptation I would suggest is that I think the proposed syntax is too
long.

    "aba".split("a", terminal_empty_string: :none) # => ["b"]
    "aba".split("a", terminal_empty_string: :initial) # => ["", "b"]

I understand that, I assume, sawa proposes flexibility, which is fine,
but it is a bit clumsy and long, IMO. Perhaps something simpler?

    ignore_empty: true

Can't think of many more. Rails/Active* has .blank? which I do not like
as a name, but from a conceptual point of view, being able to have a
short way to refer to something like the following, may be nice to 
have in general:

    "ruby, please ignore nil and empty strings as results, as I need
    the alternative only".

In my own code I (mis)use symbols a lot, so I may propose
:ignore_empty_string too. :)

(It's actually almost as long as sawa's suggestion, but when I just tried
it, making this shorter was not easy, since we lose a bit of meaning what
we try to convey here. That is also one reason why it may be useful to 
somehow refer to situations where we could easily filter away nil and
'' empty strings, via a single word/command. Even .blank? may become a
bit more verbose if you try to use it via the API above, such as
ignore_blanks: true - or something like that. Good API design is hard...)


* [ruby-core:91262] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (5 preceding siblings ...)
  2019-01-25 12:06 ` [ruby-core:91261] " shevegen
@ 2019-01-25 12:07 ` shevegen
  2019-01-28  2:46 ` [ruby-core:91304] " sawadatsuyoshi
  2019-01-28  3:32 ` [ruby-core:91305] " knu
  8 siblings, 0 replies; 9+ messages in thread
From: shevegen @ 2019-01-25 12:07 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by shevegen (Robert A. Heiler).


> Isn't the new option name too long? I'd use .drop_while(&:empty?).

I personally agree with your observation here; but I think that
.drop_while(&:empty?) is also not ideal. I'd actually prefer
sawa's longer variant over the combined drop_while(&:empty?)
syntax. :)


* [ruby-core:91304] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (6 preceding siblings ...)
  2019-01-25 12:07 ` [ruby-core:91262] " shevegen
@ 2019-01-28  2:46 ` sawadatsuyoshi
  2019-01-28  3:32 ` [ruby-core:91305] " knu
  8 siblings, 0 replies; 9+ messages in thread
From: sawadatsuyoshi @ 2019-01-28  2:46 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by sawa (Tsuyoshi Sawada).


> (shevegen (Robert A. Heiler):) This reminds me a bit of Dir['*'] versus Dir.entries(Dir.pwd). The latter
> also has . and .. entries

Actually, I had the same thing in mind. I have never found the initial `""` in `String#split` useful (nor the `.` and `..` in `Dir.entries`). They are along the same lines to me.

And I agree with knu that the name for the option was too long. I had felt that too. So I came up with a different name. What about `leader`?

```ruby
"aba".split("a", leader: false) # => ["b"]
```




* [ruby-core:91305] [Ruby trunk Feature#15562] `String#split` option to suppress the initial empty substring
       [not found] <redmine.issue-15562.20190125071912@ruby-lang.org>
                   ` (7 preceding siblings ...)
  2019-01-28  2:46 ` [ruby-core:91304] " sawadatsuyoshi
@ 2019-01-28  3:32 ` knu
  8 siblings, 0 replies; 9+ messages in thread
From: knu @ 2019-01-28  3:32 UTC (permalink / raw
  To: ruby-core

Issue #15562 has been updated by knu (Akinori MUSHA).


I believe an initial empty string should often be useful and significant, so it is a reasonable default to include one.  String#split is used for splitting strings like `key=value`, `/path/components`, not to mention CSV, where `key=` and `=value` need to be differentiated and `elements.join('/')` should round-trip.
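
The round-trip and `key=value` cases can be checked in current Ruby. A brief sketch: the sample strings are illustrative, and the limit of 2 is an assumption added here so that a trailing empty value is kept.

```ruby
path = "/usr/local/bin"
components = path.split("/")
p components           # => ["", "usr", "local", "bin"]
p components.join("/") # => "/usr/local/bin"

# With a limit of 2, an empty key and an empty value stay distinguishable.
empty_key   = "=value".split("=", 2)
empty_value = "key=".split("=", 2)
p empty_key   # => ["", "value"]
p empty_value # => ["key", ""]
```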


