git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>
Subject: Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
Date: Fri, 08 Oct 2021 09:23:21 +0200	[thread overview]
Message-ID: <87tuhsez93.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <b73452fa-f5b0-0ab4-25e8-7494637c49f5@web.de>


On Fri, Oct 08 2021, René Scharfe wrote:

> Use MINSTD to generate pseudo-random numbers consistently instead of
> using rand(3), whose output can vary from system to system, and reset
> its seed before filling in the test values.  This gives repeatable
> results across versions and systems, which simplifies sharing and
> comparing of results between developers.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Change: Use uint32_t to avoid relying on unsigned int being exactly
> 4 bytes wide.  D'oh!
>
>  t/helper/test-mergesort.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
> index 29758cf89b..c6fa816be3 100644
> --- a/t/helper/test-mergesort.c
> +++ b/t/helper/test-mergesort.c
> @@ -2,6 +2,12 @@
>  #include "cache.h"
>  #include "mergesort.h"
>
> +static uint32_t minstd_rand(uint32_t *state)
> +{
> +	*state = (uint64_t)*state * 48271 % 2147483647;
> +	return *state;
> +}
> +
>  struct line {
>  	char *text;
>  	struct line *next;
> @@ -60,8 +66,9 @@ static void dist_sawtooth(int *arr, int n, int m)
>  static void dist_rand(int *arr, int n, int m)
>  {
>  	int i;
> +	uint32_t seed = 1;
>  	for (i = 0; i < n; i++)
> -		arr[i] = rand() % m;
> +		arr[i] = minstd_rand(&seed) % m;
>  }
>
>  static void dist_stagger(int *arr, int n, int m)
> @@ -81,8 +88,9 @@ static void dist_plateau(int *arr, int n, int m)
>  static void dist_shuffle(int *arr, int n, int m)
>  {
>  	int i, j, k;
> +	uint32_t seed = 1;
>  	for (i = j = 0, k = 1; i < n; i++)
> -		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
> +		arr[i] = minstd_rand(&seed) % m ? (j += 2) : (k += 2);
>  }
>
>  #define DIST(name) { #name, dist_##name }

Just to your upthread:

    "Right, so we'd need to ship our own random number generator."

I don't really think this matters in either case here, and if anything a
flaky failure in this test would quickly point us in the right
direction, as opposed to say having the N test_expect_success being run
in rand() order or whatever.

If we'd like results we can compare across platforms we're surely better
of here running this in a loop with different per-platform srand()
values N times for some high value of N, than we are in picking one
"golden" distribution.

But just on srand() and rand() use more generally in the test suite: I
think it's fine to just assume that we can call srand()/rand() and get
"predictable" results, because what we're really after in most cases is
to avoid hard-to-diagnose flakyness. If as a result of random
distribution we'll get a consistent failure on one OS (or the flakyness
is just OpenBSD...).

Also generally: If you'd like "portable" rand() for a test just shell
out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
Linux, OSX, Solaris & OpenBSD, all returned the same number for both:

    ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'

Whereas a C program doing the same:
    
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void)
    {
            srand(1);
            printf("rand = %d\n", rand());
            return 0;
    }

Returns different numbers an all, and on OpenBSD the number is different
each time, per their well-known non-standard srand()/rand() behavior.

  parent reply	other threads:[~2021-10-08  7:45 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
2021-10-02 16:56     ` René Scharfe
2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` René Scharfe
2021-10-03 17:33         ` Junio C Hamano
2021-10-07 20:00           ` René Scharfe
2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
2021-10-08  4:17               ` Jeff King
2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason [this message]
2021-10-08 17:30                 ` René Scharfe
2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
2022-01-17 18:22     ` René Scharfe
2022-01-18  5:07       ` René Scharfe
2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
2022-01-18 12:27           ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tuhsez93.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).