From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id B392E1FB09
	for ; Wed, 23 Dec 2020 08:38:54 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 09/10] config: config_fh_parse: micro-optimize
Date: Wed, 23 Dec 2020 08:38:52 +0000
Message-Id: <20201223083853.30721-10-e@80x24.org>
In-Reply-To: <20201223083853.30721-1-e@80x24.org>
References: <20201223083853.30721-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

We can avoid a slow regexp capture and instead rely on
rindex + substr to extract the section from the config file.

Then we use the defined-or-assignment (//=) operator combined
with the documented return value of `push' to ensure
@section_order is unique without repeating a hash lookup.

Finally, we avoid short-lived variables inside the loop and
declare them subroutine-wide to knock off a teeny bit of
allocation time.

Combined, these optimizations bring the ~1.22s
PublicInbox::Config->new time down to ~0.98s with 50K inboxes.
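
For illustration only, here is a minimal standalone sketch of the
rindex + substr extraction and the `//=' + `push' idiom described
above. The sample keys in @keys are hypothetical and not taken from
any real config; the variable names mirror the ones in the patch.

```perl
use strict;
use warnings;
use v5.10; # the defined-or assignment operator (//=) needs Perl 5.10+

# Hypothetical sample keys (assumed to always contain a '.')
my @keys = qw(
	publicinbox.meta.address
	publicinbox.meta.url
	publicinbox.test.address
	publicinboxmda.spamcheck
);

my (%section_seen, @section_order, $section);
for my $k (@keys) {
	# everything before the last '.' is the section
	$section = substr($k, 0, rindex($k, '.'));

	# push returns the new element count (>= 1, so always defined);
	# once that count is stored, //= skips the push on later sightings
	$section_seen{$section} //= push(@section_order, $section);
}

print join("\n", @section_order), "\n";
```

Each section should appear once in @section_order, in first-seen
order. One difference from the regexp version: for a key without a
'.', rindex returns -1, so substr drops the final character rather
than leaving $section undefined. The sketch assumes such keys never
occur.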
---
 lib/PublicInbox/Config.pm | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 4d143c6e..60107d45 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -132,20 +132,15 @@ sub default_file {
 sub config_fh_parse ($$$) {
 	my ($fh, $rs, $fs) = @_;
-	my %rv;
-	my (%section_seen, @section_order);
+	my (%rv, %section_seen, @section_order, $line, $k, $v, $section, $cur);
 	local $/ = $rs;
-	while (defined(my $line = <$fh>)) {
+	while (defined($line = <$fh>)) { # performance critical with giant configs
 		chomp $line;
-		my ($k, $v) = split($fs, $line, 2);
-		my ($section) = ($k =~ /\A(\S+)\.[^\.]+\z/);
-		unless (defined $section_seen{$section}) {
-			$section_seen{$section} = 1;
-			push @section_order, $section;
-		}
+		($k, $v) = split($fs, $line, 2);
+		$section = substr($k, 0, rindex($k, '.'));
+		$section_seen{$section} //= push(@section_order, $section);
 
-		my $cur = $rv{$k};
-		if (defined $cur) {
+		if (defined($cur = $rv{$k})) {
 			if (ref($cur) eq "ARRAY") {
 				push @$cur, $v;
 			} else {