ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:92452] [Ruby trunk Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup
       [not found] <redmine.issue-15806.20190427234134@ruby-lang.org>
@ 2019-04-27 23:41 ` lourens
  2019-08-29  4:29 ` [ruby-core:94640] [Ruby master " mame
  1 sibling, 0 replies; 2+ messages in thread
From: lourens @ 2019-04-27 23:41 UTC (permalink / raw)
  To: ruby-core

Issue #15806 has been reported by methodmissing (Lourens Naudé).

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806

* Author: methodmissing (Lourens Naudé)
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
References GitHub PR https://github.com/ruby/ruby/pull/2128

I noticed that the encoding table is initialized on startup even for `miniruby` (the minimal viable interpreter use case), as this backtrace taken during Ruby setup shows:

```
/home/lourens/src/ruby/ruby/miniruby(rb_enc_init+0x12) [0x56197b0c0c72] encoding.c:587
/home/lourens/src/ruby/ruby/miniruby(rb_usascii_encoding+0x1a) [0x56197b0c948a] encoding.c:1357
/home/lourens/src/ruby/ruby/miniruby(Init_sym+0x7a) [0x56197b24810a] symbol.c:42
/home/lourens/src/ruby/ruby/miniruby(rb_call_inits+0x1d) [0x56197b11afed] inits.c:25
/home/lourens/src/ruby/ruby/miniruby(ruby_setup+0xf6) [0x56197b0ec9d6] eval.c:74
/home/lourens/src/ruby/ruby/miniruby(ruby_init+0x9) [0x56197b0eca39] eval.c:91
/home/lourens/src/ruby/ruby/miniruby(main+0x5a) [0x56197b051a2a] ./main.c:41
```

I therefore think it makes sense to initialize encodings explicitly just before symbol init, which is the first point in interpreter startup that currently triggers `rb_enc_init`, and to remove the initialization-check branches from the various lookup functions.
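To make the proposal concrete, here is a minimal sketch of the two styles side by side. It is not the code from `encoding.c` or from the PR: every `demo_*` name is a hypothetical stand-in, and the table holds only three made-up entries. The lazy variant pays for an "is the table set up?" branch on every lookup; the explicit variant assumes the boot sequence already ran the initializer and only keeps a debug assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the encoding table; not Ruby's real types. */
typedef struct { const char *name; } demo_encoding;
struct demo_entry { demo_encoding *enc; };

static demo_encoding demo_builtin[] = { {"ASCII-8BIT"}, {"UTF-8"}, {"US-ASCII"} };
static struct demo_entry demo_entries[3];
static struct { struct demo_entry *list; int count; } demo_table;

/* One-time setup: populate the table and publish it. */
void demo_enc_init(void)
{
    for (int i = 0; i < 3; i++) {
        demo_entries[i].enc = &demo_builtin[i];
    }
    demo_table.count = 3;
    demo_table.list = demo_entries;
}

/* Lazy style: every lookup first checks whether the table exists. */
demo_encoding *demo_enc_from_index_lazy(int index)
{
    if (!demo_table.list) {
        demo_enc_init();
    }
    if (index < 0 || demo_table.count <= index) {
        return NULL;
    }
    return demo_table.list[index].enc;
}

/* Explicit style: the caller guarantees demo_enc_init() ran during boot,
 * so the check is reduced to an assertion that vanishes under NDEBUG. */
demo_encoding *demo_enc_from_index(int index)
{
    assert(demo_table.list != NULL);
    if (index < 0 || demo_table.count <= index) {
        return NULL;
    }
    return demo_table.list[index].enc;
}
```

In this sketch the boot code would call `demo_enc_init()` once before anything that performs a lookup, after which `demo_enc_from_index(1)` and `demo_enc_from_index_lazy(1)` return the same entry.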

Below is `cachegrind` output for some of the branches this removes. The columns are `Ir Bc Bcm Bi Bim`; `Ir` (instructions executed), `Bc` (conditional branches executed) and `Bcm` (conditional branches mispredicted) are the relevant ones here, as there are no indirect branches (function pointers etc.):

(hot function: many instructions executed, many conditional branches executed and mispredicted)
```
         .          .       .          .       .  rb_encoding *
         .          .       .          .       .  rb_enc_from_index(int index)
   835,669          0       0          0       0  {
13,133,536  6,337,652  50,267          0       0      if (!enc_table.list) {
         3          0       0          0       0  	rb_enc_init();
         .          .       .          .       .      }
23,499,349  8,006,202 293,161          0       0      if (index < 0 || enc_table.count <= (index &= ENC_INDEX_MASK)) {
         .          .       .          .       .  	return 0;
         .          .       .          .       .      }
30,024,494          0       0          0       0      return enc_table.list[index].enc;
 1,671,338          0       0          0       0  }
```

(cold function, representative of the utf8 variant more or less too)

```
         .          .       .          .       .  rb_encoding *
         .          .       .          .       .  rb_ascii8bit_encoding(void)
         .          .       .          .       .  {
    27,702      9,235     955          0       0      if (!enc_table.list) {
         .          .       .          .       .  	rb_enc_init();
         .          .       .          .       .      }
     9,238          0       0          0       0      return enc_table.list[ENCINDEX_ASCII].enc;
     9,232          0       0          0       0  }
```
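To put the counts above in proportion, the mispredict rate of each check can be computed as `Bcm / Bc`. The sketch below does only that arithmetic; the helper name and the `samples` table are illustrative, and the numbers are copied from the annotated lines above.

```c
#include <assert.h>

/* Share of executed conditional branches that were mispredicted (Bcm / Bc). */
double mispredict_rate(long bcm, long bc)
{
    assert(bc > 0);
    return (double)bcm / (double)bc;
}

/* Counts copied from the cachegrind output above. */
struct branch_sample { const char *where; long bc; long bcm; };

const struct branch_sample samples[] = {
    { "rb_enc_from_index: init check",     6337652,  50267 },
    { "rb_enc_from_index: bounds check",   8006202, 293161 },
    { "rb_ascii8bit_encoding: init check",    9235,    955 },
};
```

With these inputs the rates come out at roughly 0.8% for the init check in `rb_enc_from_index`, 3.7% for its bounds check, and 10.3% for the init check in `rb_ascii8bit_encoding`. One possible reading is that the hot-path init check is mostly well predicted, so the gain from removing it would come more from the instructions it no longer executes than from avoided mispredictions.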

I think lazily loading encodings and populating the table is fine, but the table itself can be initialized explicitly during the boot process.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>


* [ruby-core:94640] [Ruby master Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup
       [not found] <redmine.issue-15806.20190427234134@ruby-lang.org>
  2019-04-27 23:41 ` [ruby-core:92452] [Ruby trunk Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup lourens
@ 2019-08-29  4:29 ` mame
  1 sibling, 0 replies; 2+ messages in thread
From: mame @ 2019-08-29  4:29 UTC (permalink / raw)
  To: ruby-core

Issue #15806 has been updated by mame (Yusuke Endoh).

Assignee set to nobu (Nobuyoshi Nakada)
Status changed from Closed to Assigned

usa said it might not work on Windows when the install path includes non-ASCII characters. Please check it out, @nobu.

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806#change-81237

* Author: methodmissing (Lourens Naudé)
* Status: Assigned
* Priority: Normal
* Assignee: nobu (Nobuyoshi Nakada)
----------------------------------------

