Subject: Re: [PATCH v3 3/7] stdlib: Optimization qsort{_r} swap implementation (BZ #19305)
From: Adhemerval Zanella via Libc-alpha
Reply-To: Adhemerval Zanella
To: Noah Goldstein
Cc: GNU C Library
Date: Fri, 15 Oct 2021 10:12:04 -0300
Message-ID: <4eee0432-9e35-06a2-56ba-d5589805cc00@linaro.org>
References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> <20210903171144.952737-4-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list

On 13/10/2021 00:29, Noah Goldstein wrote:
> +static void
> +swap_bytes (void * restrict a, void * restrict b, size_t n)
> +{
> +  /* Use multiple small memcpys with constant size to enable inlining
> +     on most targets.  */
> +  enum { SWAP_GENERIC_SIZE = 32 };
> +  unsigned char tmp[SWAP_GENERIC_SIZE];
> +  while (n > SWAP_GENERIC_SIZE)
> +    {
> +      memcpy (tmp, a, SWAP_GENERIC_SIZE);
> +      a = memcpy (a, b, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE;
> +      b = memcpy (b, tmp, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE;
> +      n -= SWAP_GENERIC_SIZE;
> +    }
> +  memcpy (tmp, a, n);
> +  memcpy (a, b, n);
> +  memcpy (b, tmp, n);
> +}
> +
> +/* Replace the indirect call with a serie of if statements.  It should help
> +   the branch predictor.  */
>
> 1) Really? On Intel at least an indirect call that is always going to the same place
> is certainly going to be predicted as well if not better than 2/3 branches + direct call.
>

I shamelessly copied the strategy the Linux kernel uses in its lib/sort.c
(8fb583c4258d08f0).  Maybe the kernel's internal usage of its qsort() leads
to more predictable branches.  For this change I would prefer something that
works well across different architectures rather than assuming a specific
one.

> 2) If you're going to just test which swap function to use, why bother initializing
> swap_func? Why not just use an int?

Indeed, there is not much gain in it for glibc usage.  The kernel provides an
API that accepts a user-defined swap_func, which is not our case.
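
For reference, the idea in point 2 (select the swap strategy once, keep it as
a plain integer tag, and dispatch with a short chain of if statements instead
of an indirect call) could look roughly like the sketch below.  It is only an
illustration: the names swap_type, select_swap_type and do_swap are
hypothetical and are not taken from the patch, and the generic fallback is a
simple byte loop rather than the swap_bytes quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag describing the swap strategy; these names are
   illustrative only and do not come from the patch under review.  */
enum swap_type
{
  SWAP_U32,
  SWAP_U64,
  SWAP_BYTES_GENERIC
};

/* Pick the strategy once per sort call, based only on the element size.  */
static enum swap_type
select_swap_type (size_t size)
{
  if (size == sizeof (uint32_t))
    return SWAP_U32;
  if (size == sizeof (uint64_t))
    return SWAP_U64;
  return SWAP_BYTES_GENERIC;
}

/* Swap two non-overlapping 4-byte objects.  Fixed-size memcpy calls let
   the compiler emit plain loads and stores without alignment concerns.  */
static void
swap_u32 (void *a, void *b)
{
  uint32_t t;
  memcpy (&t, a, sizeof t);
  memcpy (a, b, sizeof t);
  memcpy (b, &t, sizeof t);
}

/* Same as swap_u32, for 8-byte objects.  */
static void
swap_u64 (void *a, void *b)
{
  uint64_t t;
  memcpy (&t, a, sizeof t);
  memcpy (a, b, sizeof t);
  memcpy (b, &t, sizeof t);
}

/* Fallback for any other element size: swap one byte at a time.  */
static void
swap_generic (void *a, void *b, size_t n)
{
  unsigned char *pa = a;
  unsigned char *pb = b;
  while (n-- > 0)
    {
      unsigned char t = *pa;
      *pa++ = *pb;
      *pb++ = t;
    }
}

/* Dispatch on the integer tag with direct calls; no function pointer is
   stored or called.  */
static void
do_swap (void *a, void *b, size_t size, enum swap_type type)
{
  if (type == SWAP_U32)
    swap_u32 (a, b);
  else if (type == SWAP_U64)
    swap_u64 (a, b);
  else
    swap_generic (a, b, size);
}
```

A sort loop would call select_swap_type once up front and pass the resulting
tag to do_swap for every exchange.  Whether this is actually faster than a
well-predicted indirect call is target dependent, which is the open question
in point 1.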