Date: Thu, 29 Aug 2019 04:29:21 +0000 (UTC)
From: mame@ruby-lang.org
To: ruby-core@ruby-lang.org
Subject: [ruby-core:94640] [Ruby master Misc#15806] Explicitly initialise encodings on init to remove branches on encoding lookup

Issue #15806 has been updated by mame (Yusuke Endoh).

Assignee set to nobu (Nobuyoshi Nakada)
Status changed from Closed to Assigned

usa said it might not work on Windows when the install path includes non-ASCII characters. Please check it out, @nobu.

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806#change-81237

* Author: methodmissing (Lourens Naudé)
* Status: Assigned
* Priority: Normal
* Assignee: nobu (Nobuyoshi Nakada)
----------------------------------------
References GitHub PR https://github.com/ruby/ruby/pull/2128

I noticed that the encoding table is loaded on startup even for just `miniruby` (the minimal viable interpreter use case), as this backtrace taken during ruby setup shows:

```
/home/lourens/src/ruby/ruby/miniruby(rb_enc_init+0x12) [0x56197b0c0c72] encoding.c:587
/home/lourens/src/ruby/ruby/miniruby(rb_usascii_encoding+0x1a) [0x56197b0c948a] encoding.c:1357
/home/lourens/src/ruby/ruby/miniruby(Init_sym+0x7a) [0x56197b24810a] symbol.c:42
/home/lourens/src/ruby/ruby/miniruby(rb_call_inits+0x1d) [0x56197b11afed] inits.c:25
/home/lourens/src/ruby/ruby/miniruby(ruby_setup+0xf6) [0x56197b0ec9d6] eval.c:74
/home/lourens/src/ruby/ruby/miniruby(ruby_init+0x9) [0x56197b0eca39] eval.c:91
/home/lourens/src/ruby/ruby/miniruby(main+0x5a) [0x56197b051a2a] ./main.c:41
```

Therefore I think it makes sense to initialize encodings explicitly just prior to symbol init, which is the first entry point in interpreter loading that currently triggers `rb_enc_init`, and to remove the initialization check branches from the various lookup methods.

Some of the branches this collapses are shown in the `cachegrind` output below. The columns are `Ir Bc Bcm Bi Bim`; `Ir` (instructions executed), `Bc` (conditional branches executed) and `Bcm` (conditional branches mispredicted) are the relevant ones here, as there are no indirect branches (function pointers etc.):

(hot function, many instructions executed and many branches executed and mispredicted)

```
         .          .        .  .  . rb_encoding *
         .          .        .  .  . rb_enc_from_index(int index)
   835,669          0        0  0  0 {
13,133,536  6,337,652   50,267  0  0     if (!enc_table.list) {
         3          0        0  0  0         rb_enc_init();
         .          .        .  .  .     }
23,499,349  8,006,202  293,161  0  0     if (index < 0 || enc_table.count <= (index &= ENC_INDEX_MASK)) {
         .          .        .  .  .         return 0;
         .          .        .  .  .     }
30,024,494          0        0  0  0     return enc_table.list[index].enc;
 1,671,338          0        0  0  0 }
```

(cold function, more or less representative of the utf8 variant too)

```
     .      .    .  .  . rb_encoding *
     .      .    .  .  . rb_ascii8bit_encoding(void)
     .      .    .  .  . {
27,702  9,235  955  0  0     if (!enc_table.list) {
     .      .    .  .  .         rb_enc_init();
     .      .    .  .  .     }
 9,238      0    0  0  0     return enc_table.list[ENCINDEX_ASCII].enc;
 9,232      0    0  0  0 }
```

I think lazily loading encodings and populating the table is fine, but initializing the table itself can be done more explicitly in the boot process.
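For illustration, the shape of the proposed change can be sketched roughly as follows. This is a minimal stand-alone C sketch under stated assumptions, not the code from the PR: every `demo_*` name is hypothetical, the table is a simplified stand-in for the real `enc_table` in `encoding.c`, and the `ENC_INDEX_MASK` step is left out. It only contrasts a lookup that re-checks initialization on every call with one that relies on an explicit init call made earlier during boot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real encoding.c types. */
typedef struct { const char *name; } demo_encoding;
struct demo_entry { demo_encoding *enc; };

static struct {
    struct demo_entry *list;
    int count;
} demo_table;

static demo_encoding demo_builtin[] = {
    { "ASCII-8BIT" }, { "UTF-8" }, { "US-ASCII" }
};
static struct demo_entry demo_entries[3];

/* Fills the table. In the eager scheme this is called once, explicitly,
 * early in boot (the issue suggests just before symbol init). */
void demo_enc_init(void)
{
    for (int i = 0; i < 3; i++) {
        demo_entries[i].enc = &demo_builtin[i];
    }
    demo_table.list = demo_entries;
    demo_table.count = 3;
}

/* Lazy variant: every lookup pays for the "initialized yet?" branch. */
demo_encoding *demo_from_index_lazy(int index)
{
    if (!demo_table.list) {
        demo_enc_init();
    }
    if (index < 0 || demo_table.count <= index) {
        return NULL;
    }
    return demo_table.list[index].enc;
}

/* Eager variant: assumes demo_enc_init() has already run, so only the
 * bounds check remains. The assert documents that assumption and is
 * compiled out when NDEBUG is defined. */
demo_encoding *demo_from_index_eager(int index)
{
    assert(demo_table.list != NULL);
    if (index < 0 || demo_table.count <= index) {
        return NULL;
    }
    return demo_table.list[index].enc;
}
```

In this sketch a caller would invoke `demo_enc_init()` once at startup and use `demo_from_index_eager()` afterwards. Whether the saved branch matters in practice depends on how hot the lookup is, which is what the `cachegrind` numbers above are meant to show.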