unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org>
To: Paul Eggert <eggert@cs.ucla.edu>, Carlos O'Donell <carlos@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v3 0/7] Use introsort for qsort
Date: Tue, 7 Sep 2021 11:32:31 -0300	[thread overview]
Message-ID: <6e9bbf08-4a76-6d33-1319-abda26a19692@linaro.org> (raw)
In-Reply-To: <54a8f384-dfe4-db41-a09e-2209e0e62478@cs.ucla.edu>



On 06/09/2021 21:14, Paul Eggert wrote:
> On 9/6/21 7:13 AM, Carlos O'Donell wrote:
> 
>> All existing ruby implementations I can see use qsort in an
>> AS-safe-requiring context (process.c) and this would help those scenarios.
> 
> I don't see how those scenarios are practical. Assuming glibc and the standard Ruby implementation, any async signal problems could occur only if Ruby's spawn method has more than 32 redirects, as this is the minimum number that would cause qsort to call malloc on practical platforms. In practice Ruby spawns don't have that many redirects, so Ruby programs won't run into a problem unless they're contrived to illustrate this portability bug in the Ruby implementation.

I consider this a QoI to move toward provide memory bounded and async-safe
interfaces even though the interface contract does not really specify such
constraint.  I give you that it might lead to non-portable code such the
ruby case, but I prefer to actually move the interface to be async-safe
instead of either trying break the caller expectations (that only works due
an optimization) or changing the implementation and let the users handle 
any potential breakage.

Eventually what might happen when some breakage occurs is users like ruby 
to roll out their own qsort() implementation to fulfill its need.

We already have some concession about symbols that are not support to work
in some scenarios (such as malloc() after fork()).

> 
>> If we want better performance we need to:
>>
>> - Add new APIs mirroring what BSD is doing.
> 
> BSD's APIs are not good, partly because they rely on nonportable C extensions (blocks) that I doubt whether glibc can insist on, partly for other reasons discussed below. An API like what I suggested in my previous email would be better, if we want to add new APIs.

The blocks usage is only for *_b symbols, the BSD do provide default
C compliant mergesort and heapsort.  Also, FreeBSD really specifies
the worst-case space complexity and its quicksort has constant-size
(although it uses recursion).

> 
>> - New APIs explicitly call out the *type* of sort algorithm.
> 
> "Type" should not mean "heapsort" vs "quicksort" vs "mergesort" as in BSD. Users need performance, async-signal-safety, and stability; the BSD "type" approach is merely a means to that end and any new API should focus on their needs rather than on "type".
> 
>> we should aim for a good implementation that is as safe as we
>> can make it (limit memory usage, avoid malloc, avoid quadratic performance
>> behaviour).
> 
> If such an approach doesn't significantly hurt typical performance, I'm all for it. I'm skeptical that it's doable, though.

As I put, it is always a tradeoff.  Making async-safe means that we need to
either use a different algorithm with constant worst-case space, use a
hybrid approach, or use a different mechanism to allocate memory.

> 
> Let's not underestimate ordinary users' desires for performance here. qsort is used in other core apps (e.g., Emacs) without running afoul of async-signal-safety bugs.
> 
> Also, this is not merely about core apps like Emacs and Ruby; it's also about user one-off apps.  I often see Glibc qsort in benchmarks, and if the benchmarks results become significantly worse we'll be more likely to scare users away from Glibc and toward other platforms. Unless we can fix the significant performance issues in the proposed patch, the benefit of working around an only-theoretical bug in Ruby's implementation doesn't seem worth this cost.

I really don't think providing a async-safe with constant worst-case space
and be *clear* about it will 'scare away' uses, specially because we are
making the interface behaving in a more constraint way which imho is good
way forward.  Users will roll out their implementation due different needs
and tradeoff, such as gcc or python.

What I think really upsets users it finding out that if they use a different
element size or a large number its qsort now starts to call mallocs, or starts
to opens system files, or when memory is low it fallbacks to a completely
different implementation (which might have quadratic performance in some cases).

One possibility is to keep the mergesort using the static array if it
fits and fallback to another sorting (possible heapsort or at least
the intrasort I am suggesting).  I still prefer to keep it simple and
with a predictable performance.

  reply	other threads:[~2021-09-07 14:32 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 17:11 [PATCH v3 0/7] Use introsort for qsort Adhemerval Zanella via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 1/7] benchtests: Add bench-qsort Adhemerval Zanella via Libc-alpha
2021-09-04  9:09   ` Alexander Monakov via Libc-alpha
2021-09-06 18:30     ` Adhemerval Zanella via Libc-alpha
2021-10-13  3:19   ` Noah Goldstein via Libc-alpha
2021-10-15 12:52     ` Adhemerval Zanella via Libc-alpha
2021-10-15 16:39       ` Noah Goldstein via Libc-alpha
2021-10-15 17:19         ` Adhemerval Zanella via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 2/7] support: Fix getopt_long with CMDLINE_OPTIONS Adhemerval Zanella via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 3/7] stdlib: Optimization qsort{_r} swap implementation (BZ #19305) Adhemerval Zanella via Libc-alpha
2021-10-13  3:29   ` Noah Goldstein via Libc-alpha
2021-10-13  3:39     ` Noah Goldstein via Libc-alpha
2021-10-15 13:29       ` Adhemerval Zanella via Libc-alpha
2021-10-15 17:17         ` Noah Goldstein via Libc-alpha
2021-10-15 17:34           ` Adhemerval Zanella via Libc-alpha
2021-10-15 17:45             ` Noah Goldstein via Libc-alpha
2021-10-15 17:56               ` Adhemerval Zanella via Libc-alpha
2021-10-15 13:12     ` Adhemerval Zanella via Libc-alpha
2021-10-15 16:45       ` Noah Goldstein via Libc-alpha
2021-10-15 17:21         ` Adhemerval Zanella via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 4/7] stdlib: Move insertion sort out qsort Adhemerval Zanella via Libc-alpha
2021-09-06 20:35   ` Fangrui Song via Libc-alpha
2021-09-06 20:48     ` Fangrui Song via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 5/7] stdlib: qsort: Move some macros to inline function Adhemerval Zanella via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 6/7] stdlib: Implement introsort with qsort Adhemerval Zanella via Libc-alpha
2021-09-04  9:17   ` Alexander Monakov via Libc-alpha
2021-09-06 18:43     ` Adhemerval Zanella via Libc-alpha
2021-09-06 20:23   ` Fangrui Song via Libc-alpha
2021-10-13  3:53   ` Noah Goldstein via Libc-alpha
2021-09-03 17:11 ` [PATCH v3 7/7] stdlib: Remove use of mergesort on qsort (BZ #21719) Adhemerval Zanella via Libc-alpha
2021-09-03 19:18 ` [PATCH v3 0/7] Use introsort for qsort Paul Eggert
2021-09-06 14:13   ` Carlos O'Donell via Libc-alpha
2021-09-06 17:03     ` Zack Weinberg via Libc-alpha
2021-09-06 18:19       ` Adhemerval Zanella via Libc-alpha
2021-09-07  0:14     ` Paul Eggert
2021-09-07 14:32       ` Adhemerval Zanella via Libc-alpha [this message]
2021-09-07 17:39         ` Paul Eggert
2021-09-07 18:07           ` Adhemerval Zanella via Libc-alpha
2021-09-07 19:28             ` Paul Eggert
2021-09-08 11:56               ` Adhemerval Zanella via Libc-alpha
2021-09-09  0:39                 ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6e9bbf08-4a76-6d33-1319-abda26a19692@linaro.org \
    --to=libc-alpha@sourceware.org \
    --cc=adhemerval.zanella@linaro.org \
    --cc=carlos@redhat.com \
    --cc=eggert@cs.ucla.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).