From mboxrd@z Thu Jan  1 00:00:00 1970
From: lourens@bearmetal.eu
To: ruby-core@ruby-lang.org
Date: Sat, 27 Apr 2019 23:41:35 +0000 (UTC)
Subject: [ruby-core:92452] [Ruby trunk Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup

Issue #15806 has been reported by methodmissing (Lourens Naudé).

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806

* Author: methodmissing (Lourens Naudé)
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
References Github PR https://github.com/ruby/ruby/pull/2128

I noticed that the encoding table is loaded on startup of even just `miniruby` (minimal viable interpreter use case) through this backtrace during ruby setup:

```
/home/lourens/src/ruby/ruby/miniruby(rb_enc_init+0x12) [0x56197b0c0c72] encoding.c:587
/home/lourens/src/ruby/ruby/miniruby(rb_usascii_encoding+0x1a) [0x56197b0c948a] encoding.c:1357
/home/lourens/src/ruby/ruby/miniruby(Init_sym+0x7a) [0x56197b24810a] symbol.c:42
/home/lourens/src/ruby/ruby/miniruby(rb_call_inits+0x1d) [0x56197b11afed] inits.c:25
/home/lourens/src/ruby/ruby/miniruby(ruby_setup+0xf6) [0x56197b0ec9d6] eval.c:74
/home/lourens/src/ruby/ruby/miniruby(ruby_init+0x9) [0x56197b0eca39] eval.c:91
/home/lourens/src/ruby/ruby/miniruby(main+0x5a) [0x56197b051a2a] ./main.c:41
```

Therefore I think it makes sense to instead initialize encodings explicitly just prior to symbol init, which is the first entry point into the interpreter loading that currently triggers
`rb_enc_init`, and to remove the initialization check branches from the various lookup functions.

Some of the branches collapsed are shown in the `cachegrind` output below. The columns are `Ir Bc Bcm Bi Bim`; `Ir` (instructions executed), `Bc` (conditional branches executed) and `Bcm` (conditional branches mispredicted) are the relevant ones here, as there are no indirect branches (function pointers etc.):

(hot function: many instructions executed, many branches executed and mispredicted)

```
         .          .        .  .  .  rb_encoding *
         .          .        .  .  .  rb_enc_from_index(int index)
   835,669          0        0  0  0  {
13,133,536  6,337,652   50,267  0  0      if (!enc_table.list) {
         3          0        0  0  0          rb_enc_init();
         .          .        .  .  .      }
23,499,349  8,006,202  293,161  0  0      if (index < 0 || enc_table.count <= (index &= ENC_INDEX_MASK)) {
         .          .        .  .  .          return 0;
         .          .        .  .  .      }
30,024,494          0        0  0  0      return enc_table.list[index].enc;
 1,671,338          0        0  0  0  }
```

(cold function, more or less representative of the UTF-8 variant too)

```
     .      .    .  .  .  rb_encoding *
     .      .    .  .  .  rb_ascii8bit_encoding(void)
     .      .    .  .  .  {
27,702  9,235  955  0  0      if (!enc_table.list) {
     .      .    .  .  .          rb_enc_init();
     .      .    .  .  .      }
 9,238      0    0  0  0      return enc_table.list[ENCINDEX_ASCII].enc;
 9,232      0    0  0  0  }
```

I think lazy loading encodings and populating the table is fine, but initializing it can be done more explicitly in the boot process.
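
To make the proposal concrete, here is a minimal C sketch of the two lookup styles discussed above. It is an illustration only, not the code from the PR: every `demo_` name is a hypothetical stand-in, the table holds three hard-coded entries, and the `ENC_INDEX_MASK` masking from the real `rb_enc_from_index` is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real structures from encoding.c. */
typedef struct {
    const char *name;
} demo_encoding;

#define DEMO_ENC_COUNT 3

static demo_encoding demo_builtin[DEMO_ENC_COUNT] = {
    {"ASCII-8BIT"}, {"UTF-8"}, {"US-ASCII"}
};

static struct {
    demo_encoding *list; /* NULL until the table has been initialised */
    int count;           /* 0 until the table has been initialised */
} demo_enc_table;

/* Explicit one-time initialisation, meant to be called from the boot
 * sequence before any lookup happens. */
void demo_enc_table_init(void)
{
    assert(demo_enc_table.list == NULL); /* boot code calls this exactly once */
    demo_enc_table.list = demo_builtin;
    demo_enc_table.count = DEMO_ENC_COUNT;
}

/* Lazy style: every lookup pays for the "is the table there yet?" branch. */
const demo_encoding *demo_enc_from_index_lazy(int index)
{
    if (!demo_enc_table.list) {
        demo_enc_table_init();
    }
    if (index < 0 || demo_enc_table.count <= index) {
        return NULL;
    }
    return &demo_enc_table.list[index];
}

/* Eager style: assumes demo_enc_table_init() already ran during boot,
 * so only the bounds check remains on the lookup path. */
const demo_encoding *demo_enc_from_index(int index)
{
    if (index < 0 || demo_enc_table.count <= index) {
        return NULL;
    }
    return &demo_enc_table.list[index];
}
```

In this sketch the eager lookup stays safe even if called too early, because `count` is still 0 and the bounds check rejects every index. Whether the real change keeps an equivalent guarantee is not something this example claims.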