git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>
Subject: Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
Date: Fri, 8 Oct 2021 19:30:20 +0200	[thread overview]
Message-ID: <c43ce422-0eba-0c7e-af01-a85cd948b0bb@web.de> (raw)
In-Reply-To: <87tuhsez93.fsf@evledraar.gmail.com>

Am 08.10.21 um 09:23 schrieb Ævar Arnfjörð Bjarmason:
>
> Just to your upthread:
>
>     "Right, so we'd need to ship our own random number generator."
>
> I don't really think this matters in either case here, and if anything a
> flaky failure in this test would quickly point us in the right
> direction, as opposed to say having the N test_expect_success being run
> in rand() order or whatever.
>
> If we'd like results we can compare across platforms we're surely better
> of here running this in a loop with different per-platform srand()
> values N times for some high value of N, than we are in picking one
> "golden" distribution.

A mergesort bug that only causes invalid results for certain RNG seeds
is not impossible, but unlikely.  Portability of results is more useful
for comparing the number of operations needed for different types of
input, i.e. for performance work, not so much for correctness checking.
(And those results need to be taken with enough salt to avoid micro-
optimizing for specific distributions.)

Adding more rand and shuffle distributions, parameterized with different
seeds, is certainly possible.  Not sure what it would prove, though.  We
would visit a bigger part of the permutation space, but that thing is so
huge (N!) that any reasonable sample is still small.  That's why I added
the unriffle modes, to find maxima.

> But just on srand() and rand() use more generally in the test suite: I
> think it's fine to just assume that we can call srand()/rand() and get
> "predictable" results, because what we're really after in most cases is
> to avoid hard-to-diagnose flakyness. If as a result of random
> distribution we'll get a consistent failure on one OS (or the flakyness
> is just OpenBSD...).

I can't find any current use of rand() in t/, except perhaps
t/helper/test-genrandom.c, which open-codes it to get reproducible
results.  I don't see how calling rand() instead would improve it.

> Also generally: If you'd like "portable" rand() for a test just shell
> out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
> Linux, OSX, Solaris & OpenBSD, all returned the same number for both:
>
>     ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'
>
> Whereas a C program doing the same:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>
>     int main(void)
>     {
>             srand(1);
>             printf("rand = %d\n", rand());
>             return 0;
>     }
>
> Returns different numbers an all, and on OpenBSD the number is different
> each time, per their well-known non-standard srand()/rand() behavior.

For test shell code that needs only a few random numbers this would
be fine.

For test-genrandom it would also work, but I don't see any benefit in
converting it to a scripting language.

Shelling out to a script to avoid a multiplication and a modulo in
test-mergesort is not interesting, to put it mildly.  A mode that sorts
input from stdin like the sort subcommand, but returns the operation
counts, might be useful if you want to test distributions generated by
a Perl script or other data source of your choice.

René

  reply	other threads:[~2021-10-08 17:30 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
2021-10-02 16:56     ` René Scharfe
2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` René Scharfe
2021-10-03 17:33         ` Junio C Hamano
2021-10-07 20:00           ` René Scharfe
2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
2021-10-08  4:17               ` Jeff King
2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
2021-10-08 17:30                 ` René Scharfe [this message]
2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
2022-01-17 18:22     ` René Scharfe
2022-01-18  5:07       ` René Scharfe
2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
2022-01-18 12:27           ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c43ce422-0eba-0c7e-af01-a85cd948b0bb@web.de \
    --to=l.s.r@web.de \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).