ruby-core@ruby-lang.org archive (unofficial mirror)
From: "Dan0042 (Daniel DeLorme) via ruby-core" <ruby-core@ml.ruby-lang.org>
To: ruby-core@ml.ruby-lang.org
Cc: "Dan0042 (Daniel DeLorme)" <noreply@ruby-lang.org>
Subject: [ruby-core:117367] [Ruby master Feature#20394] Add an offset parameter to `String#to_i`
Date: Thu, 28 Mar 2024 19:13:22 +0000 (UTC)
Message-ID: <redmine.journal-107526.20240328191322.7941@ruby-lang.org>
In-Reply-To: redmine.issue-20394.20240326105747.7941@ruby-lang.org

Issue #20394 has been updated by Dan0042 (Daniel DeLorme).


byroot (Jean Boussier) wrote in #note-10:
> `StringIO` isn't as convenient as you make it out to be. Maybe it could become that, but it isn't today.

Hmm, it's not like it matters very much, but I get the weird feeling you misunderstood something in what I said. It's not like we'd ever limit ourselves to just the StringIO interface; my point was that StringIO provides a byte-oriented cursor interface ON TOP of String. Since we can still use the underlying String buffer, that means StringIO+String is a strict superset of String. There's no way that can be any *less* convenient than just String. Same thing for IO::Buffer afaict.
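To make the "superset" point concrete, here is a minimal sketch (the buffer contents are just an illustrative RESP3 fragment, not code from the thread). It assumes only the standard `stringio` library and shows the cursor API and plain byte-level `String` methods being used on the same buffer:

```ruby
require "stringio"

buffer = "$3\r\nfoo\r\n"
io = StringIO.new(buffer)

io.getc # consume the "$" type prefix through the cursor interface
io.gets # consume the length line "3\r\n"

# The cursor position is exposed, and io.string is the very same String
# object, so String methods remain usable at that position.
p io.pos                         # => 4
p io.string.equal?(buffer)       # => true
p io.string.getbyte(io.pos)      # => 102, the byte for "f"
p io.string.byteslice(io.pos, 3) # => "foo"
```

Whether that combination is convenient enough in a hot parsing loop is exactly what is being debated above; the sketch only shows that both interfaces are reachable.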

----------------------------------------
Feature #20394: Add an offset parameter to `String#to_i`
https://bugs.ruby-lang.org/issues/20394#change-107526

* Author: byroot (Jean Boussier)
* Status: Closed
----------------------------------------
### Context

I maintain the `redis-client` gem, which comes with an optional, swappable implementation in C that binds the `hiredis` C client, [which used to perform up to 5 times faster in some cases](https://github.com/redis-rb/redis-client/commit/9fabd57c6786a03fe0c6021eab5b181d9316d9d7).

I recently paired with @tenderlovemaking to try to close this gap, or even make the pure Ruby version faster, and we came up with several optimizations that now put both versions almost on par (assuming YJIT is enabled).

An important source of performance loss is that the Redis protocol is line-based, and parsing it in Ruby requires slicing a lot of small strings from the buffer. To give an example, here's how an Array with two Strings (`["foo", "plop"]`) is serialized in RESP3 (the Redis protocol):

```
*2\r\n
$3\r\n
foo\r\n
$4\r\n
plop\r\n
```

From this you can understand that a big hotspot in the parser is essentially `Integer(gets)`.
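As an illustration of that hotspot, here is a deliberately naive reader for the two RESP3 types shown above. It is a sketch, not the `redis-client` parser: `read_value` is a hypothetical name, and only arrays and bulk strings are handled. Every length prefix allocates a short-lived String via `gets` before `Integer` can parse it:

```ruby
require "stringio"

# Hypothetical naive RESP3 reader (arrays and bulk strings only).
def read_value(io)
  type = io.getc
  case type
  when "*" # array: "*<count>\r\n" followed by <count> values
    count = Integer(io.gets.chomp) # allocates a String just to read a number
    Array.new(count) { read_value(io) }
  when "$" # bulk string: "$<bytesize>\r\n<payload>\r\n"
    size = Integer(io.gets.chomp)  # same allocation again
    payload = io.read(size)
    io.read(2) # skip the trailing "\r\n"
    payload
  else
    raise ArgumentError, "unsupported type prefix: #{type.inspect}"
  end
end

p read_value(StringIO.new("*2\r\n$3\r\nfoo\r\n$4\r\nplop\r\n"))
# => ["foo", "plop"]
```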

With @tenderlovemaking we managed to get [a fairly significant perf boost](https://github.com/redis-rb/redis-client/commit/41b3abe94243d2598211d448c4e457a3585ff9d5#diff-a8b5ce23fb9396492f56bf0bd23090910918a488416cfb488cef8b5b34877328) by avoiding these string allocations using `String#getbyte` and [basically implementing a rudimentary `String#to_i(offset: )` in Ruby](https://github.com/redis-rb/redis-client/commit/41b3abe94243d2598211d448c4e457a3585ff9d5#diff-5f15c6483e788ee14f367f65fb951800d52341726f528bcddff1e2cd3e62cab9R105-R115).

But while the gains are huge with YJIT enabled, they are much tamer with the interpreter. And it feels a bit wrong to have to implement this sort of thing for performance reasons.
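For readers who don't want to open the linked commit, the general shape of such a `getbyte` loop can be sketched as follows. This is not the gem's actual code: `parse_int_at` is a hypothetical helper that handles only non-negative decimal integers and returns the offset where parsing stopped.

```ruby
# Hypothetical helper: parse a non-negative decimal integer starting at
# byte `offset`, stopping at the first non-digit byte (or end of string).
# Returns [value, offset_after_digits] without allocating any String.
def parse_int_at(buffer, offset)
  value = 0
  # 48..57 are the byte values of "0".."9"; getbyte returns nil past the end.
  while (byte = buffer.getbyte(offset)) && byte >= 48 && byte <= 57
    value = value * 10 + (byte - 48)
    offset += 1
  end
  [value, offset]
end

buffer = "*2\r\n$3\r\nfoo\r\n$4\r\nplop\r\n"
p parse_int_at(buffer, 1) # => [2, 2]
p parse_int_at(buffer, 5) # => [3, 6]
```

The loop runs entirely in Ruby, which is consistent with the observation above that it benefits far more from YJIT than from the interpreter.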

### `String#to_i(offset: )`

Similar to `String#unpack(offset:)` ([Feature #18254]), I believe `String#to_i(offset: )` would be useful.
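For comparison, this is how the existing `offset:` keyword on `String#unpack` / `String#unpack1` behaves (available since Ruby 3.1), next to what decimal parsing needs today. The sample data is made up for illustration, and the proposed `to_i(offset:)` call appears only in a comment because it is the proposal itself, not an API this sketch relies on:

```ruby
# Two 32-bit big-endian integers packed back to back.
data = [1, 42].pack("NN")

# `offset:` starts decoding at byte 4 without slicing `data` first.
p data.unpack1("N", offset: 4) # => 42
p data.unpack("N", offset: 4)  # => [42]

# Decimal text currently needs an intermediate String before `to_i`:
line = "$3\r\nfoo\r\n"
p line[1..].to_i               # => 3
# Under the proposal this could be spelled `line.to_i(offset: 1)`.
```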

### Alternative new `String#unpack` format

Another possibility would be to add a new format to `Array#pack` / `String#unpack` for decimal numbers. It sounds a bit weird at first, but given it supports things like Base64 and hexadecimal, perhaps it's not that much of a stretch?
