From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-4.4 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A, RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 6DE9E1F8C6 for ; Fri, 3 Sep 2021 19:19:47 +0000 (UTC) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 78EE338460B1 for ; Fri, 3 Sep 2021 19:19:45 +0000 (GMT) Received: from zimbra.cs.ucla.edu (zimbra.cs.ucla.edu [131.179.128.68]) by sourceware.org (Postfix) with ESMTPS id 51088384383F for ; Fri, 3 Sep 2021 19:18:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 51088384383F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=cs.ucla.edu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=cs.ucla.edu Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 78096160171; Fri, 3 Sep 2021 12:18:48 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id WNZlG9x6D8cm; Fri, 3 Sep 2021 12:18:44 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 040C6160172; Fri, 3 Sep 2021 12:18:44 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id g-GosEWg5IYv; Fri, 3 Sep 2021 12:18:43 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id D7867160171; Fri, 3 Sep 2021 12:18:43 -0700 (PDT) To: Adhemerval Zanella References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> From: Paul Eggert Organization: UCLA Computer Science Department Subject: Re: [PATCH v3 0/7] Use introsort for qsort Message-ID: <3d2c0890-7a7e-518e-d39e-e3b0df85dd94@cs.ucla.edu> Date: Fri, 3 Sep 2021 12:18:43 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: quoted-printable X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: libc-alpha@sourceware.org Errors-To: libc-alpha-bounces+e=80x24.org@sourceware.org Sender: "Libc-alpha" On 9/3/21 10:11 AM, Adhemerval Zanella wrote: > The performance difference with the provided benchmark clearly show the > tradeoff of using a instrosort over a mergesort. Zhang, Meng and Liang report that=20 introsort is particularly bad with reverse-sorted input; that case=20 should be added to the benchmarks. The proposed change would appear to hurt performance significantly in=20 the common case of sorting pointers. My guess is that typical glibc=20 users would prefer speed to hardening against rare signal-handling or=20 low-memory cases, since portable code will have to assume that qsort=20 isn't hardened anyway. If hardening is a goal, perhaps we should consider an alternate API=20 instead of changing qsort's implementation. If we do that, we can=20 improve the API as well as changing the function's name. The new API=20 could promise to not call malloc, or more generally to be=20 async-signal-safe if the comparison function is. The comparison function=20 could take a third void* argument so it can in effect be a closure (this=20 is qsort's greatest failing), the sorting function could take a flag=20 argument for options, and it could return an error indication (in case=20 it detects a comparison function gone haywire, say). That sort of thing. > The improvements over sizes larger > than 8 is mostly due the optimization done on swap operations. A similar optimization could be done to the existing code independently,=20 no? I expect that an apples-to-apples performance comparison would be=20 even less favorable to the proposed change.