ruby-core@ruby-lang.org archive (unofficial mirror)
From: mame@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:94640] [Ruby master Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup
Date: Thu, 29 Aug 2019 04:29:21 +0000 (UTC)
Message-ID: <redmine.journal-81237.20190829042920.ffc4dba8d9530372@ruby-lang.org>
In-Reply-To: redmine.issue-15806.20190427234134@ruby-lang.org

Issue #15806 has been updated by mame (Yusuke Endoh).

Assignee set to nobu (Nobuyoshi Nakada)
Status changed from Closed to Assigned

usa said it might not work on Windows when the install path includes non-ASCII characters. Please check it out, @nobu.

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806#change-81237

* Author: methodmissing (Lourens Naudé)
* Status: Assigned
* Priority: Normal
* Assignee: nobu (Nobuyoshi Nakada)
----------------------------------------
References Github PR https://github.com/ruby/ruby/pull/2128

I noticed that the encoding table is initialised at startup even for plain `miniruby` (the minimal viable interpreter use case), as this backtrace captured during Ruby setup shows:

```
/home/lourens/src/ruby/ruby/miniruby(rb_enc_init+0x12) [0x56197b0c0c72] encoding.c:587
/home/lourens/src/ruby/ruby/miniruby(rb_usascii_encoding+0x1a) [0x56197b0c948a] encoding.c:1357
/home/lourens/src/ruby/ruby/miniruby(Init_sym+0x7a) [0x56197b24810a] symbol.c:42
/home/lourens/src/ruby/ruby/miniruby(rb_call_inits+0x1d) [0x56197b11afed] inits.c:25
/home/lourens/src/ruby/ruby/miniruby(ruby_setup+0xf6) [0x56197b0ec9d6] eval.c:74
/home/lourens/src/ruby/ruby/miniruby(ruby_init+0x9) [0x56197b0eca39] eval.c:91
/home/lourens/src/ruby/ruby/miniruby(main+0x5a) [0x56197b051a2a] ./main.c:41
```

Therefore I think it makes sense to initialize encodings explicitly just before symbol init, which is currently the first point in interpreter startup that triggers `rb_enc_init`, and to remove the initialization-check branches from the various lookup functions.
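To make the proposal concrete, here is a minimal sketch of the two lookup styles. It is not the code from the PR: every `sketch_*` name, the three-entry table and its ordering are hypothetical stand-ins for the real structures in `encoding.c`, and the `ENC_INDEX_MASK` handling is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the encoding table in encoding.c. */
typedef struct { const char *name; } sketch_encoding;
struct sketch_enc_entry { sketch_encoding *enc; };

static struct {
    struct sketch_enc_entry *list;  /* NULL until the table is initialised */
    int count;
} sketch_table;

/* Illustrative built-in entries; the order is an assumption of this sketch. */
static sketch_encoding sketch_builtin[] = {
    {"ASCII-8BIT"}, {"UTF-8"}, {"US-ASCII"}
};
static struct sketch_enc_entry sketch_entries[3];

/* Populate the table; plays the role of rb_enc_init() in this sketch. */
void sketch_enc_init(void)
{
    for (int i = 0; i < 3; i++) {
        sketch_entries[i].enc = &sketch_builtin[i];
    }
    sketch_table.list = sketch_entries;
    sketch_table.count = 3;
}

/* Current style: every lookup pays for an "is it initialised?" branch. */
sketch_encoding *sketch_from_index_lazy(int index)
{
    if (!sketch_table.list) {
        sketch_enc_init();
    }
    if (index < 0 || sketch_table.count <= index) {
        return NULL;
    }
    return sketch_table.list[index].enc;
}

/* Proposed style: relies on boot having initialised the table already.
 * The assert documents that precondition and compiles out with NDEBUG. */
sketch_encoding *sketch_from_index_eager(int index)
{
    assert(sketch_table.list != NULL);
    if (index < 0 || sketch_table.count <= index) {
        return NULL;
    }
    return sketch_table.list[index].enc;
}

/* Boot sequence: initialise encodings explicitly before the first consumer
 * (symbol init in the backtrace above) gets a chance to run. */
void sketch_boot(void)
{
    sketch_enc_init();
    /* ... symbol init and the remaining initialisers would follow here ... */
}
```

After `sketch_boot()` has run, `sketch_from_index_eager(1)` returns the second table entry without testing `sketch_table.list`, whereas the lazy variant executes that test on every call.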

Some of the branches that collapse are shown in the `cachegrind` output below. The columns are `Ir Bc Bcm Bi Bim`; `Ir` (instructions executed), `Bc` (conditional branches executed) and `Bcm` (conditional branches mispredicted) are the relevant ones here, as there are no indirect branches (function pointers etc.):

(hot function: many instructions executed, and many conditional branches executed and mispredicted)
```
         .          .       .          .       .  rb_encoding *
         .          .       .          .       .  rb_enc_from_index(int index)
   835,669          0       0          0       0  {
13,133,536  6,337,652  50,267          0       0      if (!enc_table.list) {
         3          0       0          0       0  	rb_enc_init();
         .          .       .          .       .      }
23,499,349  8,006,202 293,161          0       0      if (index < 0 || enc_table.count <= (index &= ENC_INDEX_MASK)) {
         .          .       .          .       .  	return 0;
         .          .       .          .       .      }
30,024,494          0       0          0       0      return enc_table.list[index].enc;
 1,671,338          0       0          0       0  }
```

(cold function, also roughly representative of the UTF-8 variant)

```
         .          .       .          .       .  rb_encoding *
         .          .       .          .       .  rb_ascii8bit_encoding(void)
         .          .       .          .       .  {
    27,702      9,235     955          0       0      if (!enc_table.list) {
         .          .       .          .       .  	rb_enc_init();
         .          .       .          .       .      }
     9,238          0       0          0       0      return enc_table.list[ENCINDEX_ASCII].enc;
     9,232          0       0          0       0  }
```
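As a rough aid for reading the two listings, the misprediction rate of a branch is `Bcm / Bc`. The helper below is only an illustrative sketch (the function name is made up here); the figures in the comment are copied from the initialization-check lines above.

```c
#include <assert.h>

/* Fraction of executed conditional branches that were mispredicted,
 * i.e. Bcm / Bc from one cachegrind annotation line.
 * Returns 0.0 for a line on which no conditional branch was executed. */
double sketch_mispredict_rate(unsigned long long bc, unsigned long long bcm)
{
    assert(bcm <= bc);  /* mispredictions cannot exceed executed branches */
    if (bc == 0) {
        return 0.0;
    }
    return (double)bcm / (double)bc;
}

/* Initialization-check lines from the listings above:
 *   rb_enc_from_index:      Bc = 6,337,652  Bcm = 50,267
 *   rb_ascii8bit_encoding:  Bc = 9,235      Bcm = 955
 */
```

For the check in `rb_enc_from_index` this gives a rate of a little under 1%, and for `rb_ascii8bit_encoding` roughly 10%. The hot path therefore pays mostly in executed instructions and branches, while the cold path mispredicts far more often per call.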

I think lazily loading individual encodings and populating the table on demand is fine, but initializing the table itself can be done explicitly during the boot process.




Thread overview: 2+ messages
     [not found] <redmine.issue-15806.20190427234134@ruby-lang.org>
2019-04-27 23:41 ` [ruby-core:92452] [Ruby trunk Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup lourens
2019-08-29  4:29 ` mame [this message]