From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-5.5 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI,NICE_REPLY_A, RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id 84C7F1F8C6 for ; Tue, 7 Sep 2021 14:32:56 +0000 (UTC) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7A5F73857420 for ; Tue, 7 Sep 2021 14:32:54 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7A5F73857420 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1631025174; bh=5vQvBIeaNF9GLApzF6g45JkfDF1w2pS1zyWDuo5HgPs=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=EEvq42nZRHhDJo833KFPXXaFPy/y2gvM6/pI22tQ7oqoB104mq8uoa6Q6Z/a2NVPX 2KsWoFM++iWJjeciqkIr+FchZX8l7isSDX1qNIi2KYwKLjHjMKcbmuvwyT6jSUXwF3 qei9pbYJecWKg76GSB7megl79Ge4h/Gwnc3wXJwI= Received: from mail-qv1-xf31.google.com (mail-qv1-xf31.google.com [IPv6:2607:f8b0:4864:20::f31]) by sourceware.org (Postfix) with ESMTPS id 7F2F63858002 for ; Tue, 7 Sep 2021 14:32:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7F2F63858002 Received: by mail-qv1-xf31.google.com with SMTP id a12so1143071qvz.4 for ; Tue, 07 Sep 2021 07:32:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=5vQvBIeaNF9GLApzF6g45JkfDF1w2pS1zyWDuo5HgPs=; b=POO9UqkkQgLFBZh798YU/PRA1x5TpZdfUchzw2u/hyYJsYoWsWrX39AvzT8jsE1BlC mYrt9UzTZlw23t9GyNUSyGiK3KEnnobdGBONlqS6nBrRKAGonm7xqR+L4RHe3lV3Up1r Y4AGnx+RkI7/fYQa7pcTjDx25vOrdQiCK5u1pXF+PyTYrTbn7/c9MElip/Z1YQI9hQq0 02tJ9LJWEdKtr0cSumy0uGh7uhsWCYhEefXnMOzlzpYOjwJWWTojwcZzCiRpSwqyHP0o U2TMNj/EE3gIaIW76U8W2o5oYGgFSsOqRZHzIAl6maCdnlhL86xPkwzD9l8AkDp8I4Yh 2CSA== X-Gm-Message-State: AOAM5319jL0aDYnEQcqgRs5DC7bogi7bY12EuBPjHd1TRQ96Jw1Hn09E lht7ojIBuCRC4pJotucUkQqFHpBwRWocNA== X-Google-Smtp-Source: ABdhPJzl8W7dlyvUjLBL1F/cxg1EkVu9PdcC4mW3oRgch0K0p2zHytuqGDK8fWCv6Ih7vmZLIHC0Sg== X-Received: by 2002:a0c:a992:: with SMTP id a18mr17217841qvb.10.1631025153827; Tue, 07 Sep 2021 07:32:33 -0700 (PDT) Received: from ?IPv6:2804:431:c7cb:733d:f360:5bf3:9813:d8f5? ([2804:431:c7cb:733d:f360:5bf3:9813:d8f5]) by smtp.gmail.com with ESMTPSA id h12sm7406098qth.1.2021.09.07.07.32.32 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 07 Sep 2021 07:32:33 -0700 (PDT) Subject: Re: [PATCH v3 0/7] Use introsort for qsort To: Paul Eggert , Carlos O'Donell References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> <3d2c0890-7a7e-518e-d39e-e3b0df85dd94@cs.ucla.edu> <54a8f384-dfe4-db41-a09e-2209e0e62478@cs.ucla.edu> Message-ID: <6e9bbf08-4a76-6d33-1319-abda26a19692@linaro.org> Date: Tue, 7 Sep 2021 11:32:31 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <54a8f384-dfe4-db41-a09e-2209e0e62478@cs.ucla.edu> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Adhemerval Zanella via Libc-alpha Reply-To: Adhemerval Zanella Cc: libc-alpha@sourceware.org Errors-To: libc-alpha-bounces+e=80x24.org@sourceware.org Sender: "Libc-alpha" On 06/09/2021 21:14, Paul Eggert wrote: > On 9/6/21 7:13 AM, Carlos O'Donell wrote: > >> All existing ruby implementations I can see use qsort in an >> AS-safe-requiring context (process.c) and this would help those scenarios. > > I don't see how those scenarios are practical. Assuming glibc and the standard Ruby implementation, any async signal problems could occur only if Ruby's spawn method has more than 32 redirects, as this is the minimum number that would cause qsort to call malloc on practical platforms. In practice Ruby spawns don't have that many redirects, so Ruby programs won't run into a problem unless they're contrived to illustrate this portability bug in the Ruby implementation. I consider this a QoI to move toward provide memory bounded and async-safe interfaces even though the interface contract does not really specify such constraint. I give you that it might lead to non-portable code such the ruby case, but I prefer to actually move the interface to be async-safe instead of either trying break the caller expectations (that only works due an optimization) or changing the implementation and let the users handle any potential breakage. Eventually what might happen when some breakage occurs is users like ruby to roll out their own qsort() implementation to fulfill its need. We already have some concession about symbols that are not support to work in some scenarios (such as malloc() after fork()). > >> If we want better performance we need to: >> >> - Add new APIs mirroring what BSD is doing. > > BSD's APIs are not good, partly because they rely on nonportable C extensions (blocks) that I doubt whether glibc can insist on, partly for other reasons discussed below. An API like what I suggested in my previous email would be better, if we want to add new APIs. The blocks usage is only for *_b symbols, the BSD do provide default C compliant mergesort and heapsort. Also, FreeBSD really specifies the worst-case space complexity and its quicksort has constant-size (although it uses recursion). > >> - New APIs explicitly call out the *type* of sort algorithm. > > "Type" should not mean "heapsort" vs "quicksort" vs "mergesort" as in BSD. Users need performance, async-signal-safety, and stability; the BSD "type" approach is merely a means to that end and any new API should focus on their needs rather than on "type". > >> we should aim for a good implementation that is as safe as we >> can make it (limit memory usage, avoid malloc, avoid quadratic performance >> behaviour). > > If such an approach doesn't significantly hurt typical performance, I'm all for it. I'm skeptical that it's doable, though. As I put, it is always a tradeoff. Making async-safe means that we need to either use a different algorithm with constant worst-case space, use a hybrid approach, or use a different mechanism to allocate memory. > > Let's not underestimate ordinary users' desires for performance here. qsort is used in other core apps (e.g., Emacs) without running afoul of async-signal-safety bugs. > > Also, this is not merely about core apps like Emacs and Ruby; it's also about user one-off apps.  I often see Glibc qsort in benchmarks, and if the benchmarks results become significantly worse we'll be more likely to scare users away from Glibc and toward other platforms. Unless we can fix the significant performance issues in the proposed patch, the benefit of working around an only-theoretical bug in Ruby's implementation doesn't seem worth this cost. I really don't think providing a async-safe with constant worst-case space and be *clear* about it will 'scare away' uses, specially because we are making the interface behaving in a more constraint way which imho is good way forward. Users will roll out their implementation due different needs and tradeoff, such as gcc or python. What I think really upsets users it finding out that if they use a different element size or a large number its qsort now starts to call mallocs, or starts to opens system files, or when memory is low it fallbacks to a completely different implementation (which might have quadratic performance in some cases). One possibility is to keep the mergesort using the static array if it fits and fallback to another sorting (possible heapsort or at least the intrasort I am suggesting). I still prefer to keep it simple and with a predictable performance.