From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 15 Oct 2021 11:45:28 -0500
From: Noah Goldstein via Libc-alpha
Reply-To: Noah Goldstein
To: Adhemerval Zanella
Cc: GNU C Library
Subject: Re: [PATCH v3 3/7] stdlib: Optimization qsort{_r} swap implementation (BZ #19305)
In-Reply-To: <4eee0432-9e35-06a2-56ba-d5589805cc00@linaro.org>
References: <20210903171144.952737-1-adhemerval.zanella@linaro.org>
 <20210903171144.952737-4-adhemerval.zanella@linaro.org>
 <4eee0432-9e35-06a2-56ba-d5589805cc00@linaro.org>
List-Id: Libc-alpha mailing list
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Fri, Oct 15, 2021 at 8:12 AM Adhemerval Zanella wrote:
>
> On 13/10/2021 00:29, Noah Goldstein wrote:
> > +static void
> > +swap_bytes (void * restrict a, void * restrict b, size_t n)
> > +{
> > +  /* Use multiple small memcpys with constant size to enable inlining
> > +     on most targets.  */
> > +  enum { SWAP_GENERIC_SIZE = 32 };
> > +  unsigned char tmp[SWAP_GENERIC_SIZE];
> > +  while (n > SWAP_GENERIC_SIZE)
> > +    {
> > +      memcpy (tmp, a, SWAP_GENERIC_SIZE);
> > +      a = memcpy (a, b, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE;
> > +      b = memcpy (b, tmp, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE;
> > +      n -= SWAP_GENERIC_SIZE;
> > +    }
> > +  memcpy (tmp, a, n);
> > +  memcpy (a, b, n);
> > +  memcpy (b, tmp, n);
> > +}
> > +
> > +/* Replace the indirect call with a series of if statements.  It should help
> > +   the branch predictor.  */
> >
> > 1) Really? On Intel at least, an indirect call that always goes to the same place
> > is going to be predicted as well as, if not better than, 2/3 branches + a direct call.
> >
>
> I shamelessly copied the same strategy the Linux kernel uses in its lib/sort.c
> (8fb583c4258d08f0). Maybe the kernel's internal usage of its qsort() leads to
> more predictable branches, and for this change I would rather have something that
> works well across different architectures than assume a specific one.

If it's running in the kernel, Spectre mitigations can make indirect jumps
particularly expensive. I don't think we have the same issue targeting userland.

>
> > 2) If you're going to just test which swap function to use, why bother initializing
> > swap_func? Why not just use an int?
>
> Indeed, there is not much gain for glibc's usage. The kernel provides an API for a
> user-defined swap_func, which is not our case.
>
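For anyone following along, here is a minimal sketch of the two dispatch styles
being compared; the names (swap_kind, do_swap, etc.) are mine for illustration
and are not taken from the patch. The point is that a small if chain calls the
swap routines directly, so the compiler can inline them, whereas a stored
function pointer forces an indirect call it cannot see through.

#include <stddef.h>
#include <stdint.h>

enum swap_kind { SWAP_WORDS_64, SWAP_BYTES };

static void
swap_words_64 (void *a, void *b, size_t n)
{
  uint64_t *ua = a, *ub = b;
  for (size_t i = 0; i < n / sizeof (uint64_t); i++)
    {
      uint64_t t = ua[i];
      ua[i] = ub[i];
      ub[i] = t;
    }
}

static void
swap_bytes_generic (void *a, void *b, size_t n)
{
  unsigned char *ua = a, *ub = b;
  for (size_t i = 0; i < n; i++)
    {
      unsigned char t = ua[i];
      ua[i] = ub[i];
      ub[i] = t;
    }
}

/* Branch-based dispatch: one cheap, well-predicted compare selects a
   direct call that can be inlined into the sort loop.  */
static void
do_swap (void *a, void *b, size_t size, enum swap_kind kind)
{
  if (kind == SWAP_WORDS_64)
    swap_words_64 (a, b, size);
  else
    swap_bytes_generic (a, b, size);
}

/* Indirect-call alternative: the callee is only known at run time, so
   the compiler cannot inline it, and the call relies on the indirect
   branch predictor (or retpolines, in a kernel with Spectre
   mitigations).  */
typedef void (*swap_func_t) (void *, void *, size_t);

static void
do_swap_indirect (void *a, void *b, size_t size, swap_func_t swap_func)
{
  swap_func (a, b, size);
}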