From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-3.7 required=3.0 tests=AWL,BAYES_00,BODY_8BITS, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, NICE_REPLY_A,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 8D08D1F8C8 for ; Tue, 5 Oct 2021 17:17:43 +0000 (UTC) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F139F385AC29 for ; Tue, 5 Oct 2021 17:17:40 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org F139F385AC29 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1633454261; bh=/D769s2GBDWon+W4i0vRUB6UmqUMy4CdbapdEXj6YXI=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=vkEhYERWVcTuJ5kkr4GSNizMBUqGJ/H0phi0zdd7R11wkyU7qmElTd+rx5nakzY/k bHku/X2Gv+wwbys9qrKjYsu/vEct8KIFBw9df9IiLUAl7mnfs4urb6My/Sg4h7ivLS La4P0YJfm2TLiAAB+0FiVm82er9C9fGK2mLmT+G8= Received: from mail-qv1-xf32.google.com (mail-qv1-xf32.google.com [IPv6:2607:f8b0:4864:20::f32]) by sourceware.org (Postfix) with ESMTPS id 38013385BF9F for ; Tue, 5 Oct 2021 17:15:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 38013385BF9F Received: by mail-qv1-xf32.google.com with SMTP id k3so86773qve.10 for ; Tue, 05 Oct 2021 10:15:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=/D769s2GBDWon+W4i0vRUB6UmqUMy4CdbapdEXj6YXI=; b=YtJCllJnQ7Eg6zCxvxOSBbSAeS3Y1YdkFpTUqc5Q30iXBbS1m8NIgeXns403qfsOH9 TOaDbTx/oA8W9wkuecP7uR9rMCAYcIuk16zumX1bNu0w9/IfUw4mXEdKoTCKGzGhW86j BXOj836ngKxEoDHk9G3LC3D6N5JvdP6Y3flMxoBgvodIoRLWYJ+y3gmbJRtjaTDLCHlT /cBEOIzprAkWNTRQKS/JTj1qrknlAXYyd1umY1F0M8p2klxShkNiZ7vP43R/g8BvVy5f SX9Xuukb8sgT0wupE4saecTBcxDisTG8Gl5L2NubDGyNBvBFF6RS0MuyPdzlkq4r95w0 NJzA== X-Gm-Message-State: AOAM531d9l5YMTeIk5ZplqCKymPnFno2PgPXYvgGvmdgeZkq4g9YmZaX iRzcwajXA+jnnqjoa9EHUIoV+MgQ2FRjCw== X-Google-Smtp-Source: ABdhPJxbLyrOpm+lpjBq1MxrbzZHhjRf4ciMY3kQnNH7dTNCk1CN9dqwvwQE+IzzZzb4fuL4PISzfQ== X-Received: by 2002:ad4:5445:: with SMTP id h5mr12900375qvt.64.1633454154748; Tue, 05 Oct 2021 10:15:54 -0700 (PDT) Received: from ?IPv6:2804:431:c7cb:807a:9735:fcf:f170:1c20? ([2804:431:c7cb:807a:9735:fcf:f170:1c20]) by smtp.gmail.com with ESMTPSA id x18sm9830706qkx.94.2021.10.05.10.15.53 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 05 Oct 2021 10:15:54 -0700 (PDT) Subject: Re: [PATCH v6 2/2] BZ #17645, fix slow DSO sorting behavior in dynamic loader -- Algorithm changes To: Chung-Lin Tang , GNU C Library , Florian Weimer References: <30aa9fbd-fc08-9a6f-d34b-accb47abd8af@codesourcery.com> <334c6848-e805-4666-add5-e611f1d7c255@linaro.org> <4fefc9b5-64fa-1384-075c-4b3363d95941@codesourcery.com> Message-ID: Date: Tue, 5 Oct 2021 14:15:52 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <4fefc9b5-64fa-1384-075c-4b3363d95941@codesourcery.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Adhemerval Zanella via Libc-alpha Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+e=80x24.org@sourceware.org Sender: "Libc-alpha" On 05/10/2021 11:27, Chung-Lin Tang wrote: > Hi Adhemerval, > > On 2021/8/11 5:08 AM, Adhemerval Zanella wrote: >>> +void >>> +_dl_sort_maps (struct link_map **maps, unsigned int nmaps, >>> +           unsigned int skip, bool for_fini) >>> +{ >>> +  /* Index code for sorting algorithm currently in use.  */ >>> +  static int32_t algorithm = 0; >>> +  if (__glibc_unlikely (algorithm == 0)) >>> +    algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, >>> +                 int32_t, NULL); >> This is not race free, a better alternative would be to add a new member on >> the GLRO struct and a init function to setup the value after tunables >> initialization.  Also, but using a enumeration to define to possible values, >> there is no need to add the __builtin_unreachable() below. > > I'm not sure why there can be a race with TUNABLE_GET(), since isn't all tunables > set only once during initialization? We are trying to avoid even 'benign' data races, where a static variable might be concurrent set from a constant value. One pattern that we used to have is to check if a syscall was present and set a static variable, similar to what you do. Another way to do it would be use relaxed atomic, such as the way mips64 getdents does [1]. But I still think it would be better organized if we move the selected dl_maps sorting algorithm to a GLRO variable. [1] sysdeps/unix/sysv/linux/mips/mips64/getdents64.c > > That said, I agree that placing the algorithm code initializing separately in > _dl_sort_maps_init looks better. > > > > The changes to this part (relative to my v6 patch) I see you have on the > azanella/dso-opt branch: > >  void > +_dl_sort_maps_init (void) > +{ > +  int32_t algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, int32_t, NULL); > +  GLRO(dl_dso_sort_algo) = algorithm == 1 ? dso_sort_algorithm_original > +                                         : dso_sort_algorithm_dfs; > +} > + > +void >  _dl_sort_maps (struct link_map **maps, unsigned int nmaps, >                unsigned int skip, bool for_fini) >  { > -  /* Index code for sorting algorithm currently in use.  */ > -  static int32_t algorithm = 0; > -  if (__glibc_unlikely (algorithm == 0)) > -    algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, > -                            int32_t, NULL); > - >    /* It can be tempting to use a static function pointer to store and call >       the current selected sorting algorithm routine, but experimentation >       shows that current processors still do not handle indirect branches > @@ -291,12 +293,10 @@ >       PTR_MANGLE/DEMANGLE, further impairing performance of small, common >       input cases. A simple if-case with direct function calls appears to >       be the fastest.  */ > -  if (__glibc_likely (algorithm == 1)) > +  if (GLRO(dl_dso_sort_algo) == dso_sort_algorithm_original) >      _dl_sort_maps_original (maps, nmaps, skip, for_fini); > -  else if (algorithm == 2) > -    _dl_sort_maps_dfs (maps, nmaps, skip, for_fini); >    else > -    __builtin_unreachable (); > +    _dl_sort_maps_dfs (maps, nmaps, skip, for_fini); >  } > > > I'm not sure if defining 'enum dso_sort_algorithm' is really needed? > Is __builtin_unreachable() that undesirable? The tunables code should already sanitize the input and returns the possible algorithms as a type, instead of a number. I think it is clear to work with predefined type, than hint the compiler explicitly. > > Also, the final code should probably add __glibc_likely() to the default > algorithm case. Sometime I think we abuse the __glibc_{un}likely, specially on short code like that. But please reinstate it if you see that it does improve code generation. > > Thanks, > Chung-Lin