unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>, Zack Weinberg <zack@owlfolio.org>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>,
	GNU libc development <libc-alpha@sourceware.org>,
	goldstein.w.n@gmail.com
Subject: Re: [PATCH] stdlib: Optimize number of calls to comparison function
Date: Wed, 6 Dec 2023 09:51:26 -0300	[thread overview]
Message-ID: <4b7d0325-7c04-458a-8bec-37ca97b62fd7@linaro.org> (raw)
In-Reply-To: <87plzjd42s.fsf@oldenburg.str.redhat.com>



On 06/12/23 07:21, Florian Weimer wrote:
> * Zack Weinberg:
> 
>> On Mon, Dec 4, 2023, at 1:31 PM, Kuan-Wei Chiu wrote:
>>> On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote:
>>>> I think the factor in stdlib/tst-qsort5.c needs to be adjusted:
>>>>
>>>>   /* This is an arbitrary factor which is true for the current
>>>>      implementation across a wide range of sizes.  */
>>>>   TEST_VERIFY (factor <= 4.5);
>>>
>>> It seems that the factor can be adjusted to around 3.5. I can send
>>> another patch for this adjustment or resend it as a patch series.
>>
>> Before you go any further with this patch series I think we need to
>> decide whether our backward compatibility constraints mean our qsort
>> has to be a stable sort.  If so, we should make it *always* be a
>> stable sort and write that into the documentation, and that would
>> mean junking the entire heapsort implementation.
> 
> That makes sense, although there might not exist an in-place sorting
> algorithm that takes constant extra space regardless of element count
> and element size and has reasonable performance.  Maybe we could say,
> “stable if element size is less than 1024 bytes” or something like that.
> 
> What I'm concerned about is that with the current implementation, we
> take a performance hit *and* have compatibility problems.  The
> compatibility problems would be easier to justify if we actually made
> things faster.  Not calling malloc internally is unlikely to be
> compelling to programmers.

My initial intention was to remove internal multiple sorting strategies
that has different semantics and that might eventually trigger corner
cases issues (the stable default mergesort versus the fallback quicksort),
and fix the quicksort fallback that had the long standing issue of O(n^2) 
on adversarial inputs.

However it seems that glibc qsort now become an instance on Hyrum's law,
where it does not really matter that neither POSIX nor C standard promised 
sorting stability.

So I presume it would be better to keep with mergesort for all input sizes,
along with the limited stack buffer (so there is no failure point up a 
certain input number/size); and fallback to heapsort on memory allocation
failure.  I will prepare a patch to restore it.

Other system does provide mergesort as a different symbol [1], which also
returns a failure is case memory can not be allocated.  But I don't think
it would be an improvement in this situation (no one will really adapt
their code to use mergesort if a stable allocation is required).  I also
though about adding a tunable to enable mergesort on qsort, but it would
ending up only add more complexity and required more testing.

[1] https://man.freebsd.org/cgi/man.cgi?query=mergesort&sektion=3

 

  reply	other threads:[~2023-12-06 12:51 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-02 21:48 [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu
2023-12-04  8:20 ` Florian Weimer
2023-12-04 18:31   ` Kuan-Wei Chiu
2023-12-05 10:44     ` Florian Weimer
2023-12-05 20:00       ` Kuan-Wei Chiu
2023-12-05 20:35     ` Zack Weinberg
2023-12-06 10:21       ` Florian Weimer
2023-12-06 12:51         ` Adhemerval Zanella Netto [this message]
2023-12-17 15:42           ` Zack Weinberg
2023-12-17 15:55             ` Florian Weimer
2023-12-17 16:47             ` Kuan-Wei Chiu
2023-12-17 18:02               ` Florian Weimer
2023-12-05  3:22   ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu
2023-12-05  3:22     ` [PATCH v2 1/2] " Kuan-Wei Chiu
2023-12-05  3:22     ` [PATCH v2 2/2] stdlib: Adjust the factor in tst-qsort5 Kuan-Wei Chiu
2024-02-16  7:08 ` [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu
2024-03-27 19:45   ` Adhemerval Zanella Netto
2024-03-27 19:59     ` Kuan-Wei Chiu
2024-03-27 20:42       ` Adhemerval Zanella Netto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b7d0325-7c04-458a-8bec-37ca97b62fd7@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=goldstein.w.n@gmail.com \
    --cc=libc-alpha@sourceware.org \
    --cc=visitorckw@gmail.com \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).