git@vger.kernel.org mailing list mirror (one of many)
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>, "SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load
Date: Tue,  4 Dec 2018 17:34:57 +0100
Message-ID: <20181204163457.15717-4-szeder.dev@gmail.com>
In-Reply-To: <20181204163457.15717-1-szeder.dev@gmail.com>

Unfortunately, we have a few flaky tests, whose failures tend to be
hard to reproduce.  We've found that the best we can do to reproduce
such a failure is to run the test repeatedly while the machine is
under load, and wait in the hope that the load creates enough variance
in the timing of the test's commands that a failure is eventually
triggered.  I have a command to do that, and I noticed that two other
contributors have rolled their own scripts to do the same, all
choosing slightly different approaches.

To help reproduce failures in flaky tests, introduce the '--stress'
option to run a test script repeatedly in multiple parallel
invocations until one of them fails, thereby using the test script
itself to increase the load on the machine.

The number of parallel invocations is determined by, in order of
precedence: the number specified as '--stress=<N>', or the value of
the GIT_TEST_STRESS_LOAD environment variable, or twice the number of
available processors in '/proc/cpuinfo', or 8.
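
As a rough sketch of that precedence, assuming a hypothetical helper
named 'stress_job_count' (the patch itself inlines this logic in
't/test-lib.sh' rather than defining a function), whose only argument
is the <N> from '--stress=<N>', or empty for a plain '--stress':

```shell
# Hypothetical helper, for illustration only; not part of the patch.
stress_job_count () {
	# $1: the <N> given as '--stress=<N>', empty for plain '--stress'
	if test -n "$1"
	then
		echo "$1"
	elif test -n "$GIT_TEST_STRESS_LOAD"
	then
		echo "$GIT_TEST_STRESS_LOAD"
	elif test -r /proc/cpuinfo
	then
		# Twice the number of 'processor' entries.
		echo $((2 * $(grep -c ^processor /proc/cpuinfo)))
	else
		echo 8
	fi
}
```

With GIT_TEST_STRESS_LOAD=5 in the environment, 'stress_job_count 3'
prints 3 and 'stress_job_count ""' prints 5.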

To prevent the several parallel invocations of the same test from
interfering with each other:

  - Include the parallel job's number in the name of the trash
    directory and the various output files under 't/test-results/' as
    a '.stress-<Nr>' suffix.

  - Add the parallel job's number to the port number specified by the
    user or to the test number, so even tests involving daemons
    listening on a TCP socket can be stressed.

  - Make '--stress' imply '--verbose-log' and discard the test's
    standard output and error; dumping the output of several parallel
    tests to the terminal would create a big ugly mess.
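
To illustrate the first two points, here is a minimal sketch.  The
helper names 'stress_suffix' and 'stress_port' and the test name
't1234-example' are made up for this example; the patch computes the
same values inline via GIT_TEST_STRESS_JOB_NR:

```shell
# Empty outside of a stress run, '.stress-<Nr>' inside one.
stress_suffix () {
	echo "${GIT_TEST_STRESS_JOB_NR:+.stress-$GIT_TEST_STRESS_JOB_NR}"
}

# Hypothetical helper: offset a base port by the job number, so that
# each parallel job gets a port of its own.
stress_port () {
	echo $(($1 + ${GIT_TEST_STRESS_JOB_NR:-0}))
}

GIT_TEST_STRESS_JOB_NR=3
echo "trash directory.t1234-example$(stress_suffix)"
echo "port: $(stress_port 5570)"
```

For job number 3 this names the trash directory
'trash directory.t1234-example.stress-3' and turns a base port of
5570 into 5573; with GIT_TEST_STRESS_JOB_NR unset, both the suffix
and the port offset disappear.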

'wait' for all parallel jobs before exiting (either because a failure
was found or because the user lost patience and aborted the stress
test), allowing the still running tests to finish.  Otherwise the "OK
X.Y" progress output from the last iteration would likely arrive after
the user got back the shell prompt, interfering with typing in the
next command.  OTOH, this waiting might induce a considerable delay
between hitting ctrl-C and the test actually exiting; I'm not sure
this is the right tradeoff.
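
The overall shape of the loop can be sketched with a toy stand-in.
Here '$flag' plays the role of the patch's '$stressfail' file, no
real test is run, and job 1 merely pretends to fail on its third
iteration; all names are illustrative assumptions:

```shell
# Toy model of the stress loop; no real test script is involved.
flag="${TMPDIR:-/tmp}/stress-demo.$$"
rm -f "$flag"

job_nr=0
while test $job_nr -lt 3
do
	(
		cnt=0
		# Every job keeps iterating until some job records a failure.
		while ! test -e "$flag"
		do
			# Pretend that job 1 fails on its third iteration.
			if test $job_nr -eq 1 && test $cnt -eq 2
			then
				echo $job_nr >>"$flag"
			fi
			cnt=$(($cnt + 1))
		done
	) &
	job_nr=$(($job_nr + 1))
done

# Let all background jobs finish before reporting anything.
wait
failed=$(cat "$flag")
rm -f "$flag"
echo "failed job: $failed"
```

The parent reports the failed job only after 'wait' has returned, so
no job can print anything after the shell prompt is back.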

Based on Jeff King's 'stress' script.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/README                | 13 ++++++-
 t/test-lib-functions.sh |  7 +++-
 t/test-lib.sh           | 82 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 96 insertions(+), 6 deletions(-)

diff --git a/t/README b/t/README
index 28711cc508..9851de25c2 100644
--- a/t/README
+++ b/t/README
@@ -186,6 +186,16 @@ appropriately before running "make".
 	this feature by setting the GIT_TEST_CHAIN_LINT environment
 	variable to "1" or "0", respectively.
 
+--stress::
+--stress=<N>::
+	Run the test script repeatedly in multiple parallel
+	invocations until one of them fails.  Useful for reproducing
+	rare failures in flaky tests.  The number of parallel
+	invocations is, in order of precedence: <N>, or the value of
+	the GIT_TEST_STRESS_LOAD environment variable, or twice the
+	number of available processors in '/proc/cpuinfo', or 8.
+	Implies `--verbose-log`.
+
 You can also set the GIT_TEST_INSTALLED environment variable to
 the bindir of an existing git installation to test that installation.
 You still need to have built this git sandbox, from which various
@@ -425,7 +435,8 @@ This test harness library does the following things:
  - Creates an empty test directory with an empty .git/objects database
    and chdir(2) into it.  This directory is 't/trash
    directory.$test_name_without_dotsh', with t/ subject to change by
-   the --root option documented above.
+   the --root option documented above, and a '.stress-<N>' suffix
+   appended by the --stress option.
 
  - Defines standard test helper functions for your scripts to
    use.  These functions are designed to make all scripts behave
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index d9a602cd0f..9af11e3eed 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1288,8 +1288,6 @@ test_set_port () {
 			# root-only port, use a larger one instead.
 			port=$(($port + 10000))
 		fi
-
-		eval $var=$port
 		;;
 	*[^0-9]*)
 		error >&7 "invalid port number: $port"
@@ -1298,4 +1296,9 @@ test_set_port () {
 		# The user has specified the port.
 		;;
 	esac
+
+	# Make sure that parallel '--stress' test jobs get different
+	# ports.
+	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
+	eval $var=$port
 }
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 49e4563405..9b7f687396 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -71,8 +71,81 @@ then
 	exit 1
 fi
 
+TEST_STRESS_SFX="${GIT_TEST_STRESS_JOB_NR:+.stress-$GIT_TEST_STRESS_JOB_NR}"
 TEST_NAME="$(basename "$0" .sh)"
-TEST_RESULTS_BASE="$TEST_OUTPUT_DIRECTORY/test-results/$TEST_NAME"
+TEST_RESULTS_BASE="$TEST_OUTPUT_DIRECTORY/test-results/$TEST_NAME$TEST_STRESS_SFX"
+
+# If --stress was passed, run this test repeatedly in several parallel loops.
+case "$GIT_TEST_STRESS_STARTED, $* " in
+done,*)
+	# Don't stress test again.
+	;;
+*' --stress '*|*' '--stress=*' '*)
+	job_count=${*##*--stress=}
+	if test "$job_count" != "$*"
+	then
+		job_count=${job_count%% *}
+	elif test -n "$GIT_TEST_STRESS_LOAD"
+	then
+		job_count="$GIT_TEST_STRESS_LOAD"
+	elif test -r /proc/cpuinfo
+	then
+		job_count=$((2 * $(grep -c ^processor /proc/cpuinfo)))
+	else
+		job_count=8
+	fi
+
+	mkdir -p "$(dirname "$TEST_RESULTS_BASE")"
+	stressfail="$TEST_RESULTS_BASE.stress-failed"
+	rm -f "$stressfail"
+	trap 'echo aborted >"$stressfail"' TERM INT HUP
+
+	job_nr=0
+	while test $job_nr -lt "$job_count"
+	do
+		(
+			GIT_TEST_STRESS_STARTED=done
+			GIT_TEST_STRESS_JOB_NR=$job_nr
+			export GIT_TEST_STRESS_STARTED GIT_TEST_STRESS_JOB_NR
+
+			cnt=0
+			while ! test -e "$stressfail"
+			do
+				if $TEST_SHELL_PATH "$0" "$@" >/dev/null 2>&1
+				then
+					printf >&2 "OK   %2d.%d\n" $GIT_TEST_STRESS_JOB_NR $cnt
+				elif test -f "$stressfail" &&
+				     test "$(cat "$stressfail")" = "aborted"
+				then
+					printf >&2 "ABORTED %2d.%d\n" $GIT_TEST_STRESS_JOB_NR $cnt
+				else
+					printf >&2 "FAIL %2d.%d\n" $GIT_TEST_STRESS_JOB_NR $cnt
+					echo $GIT_TEST_STRESS_JOB_NR >>"$stressfail"
+				fi
+				cnt=$(($cnt + 1))
+			done
+		) &
+		job_nr=$(($job_nr + 1))
+	done
+
+	job_nr=0
+	while test $job_nr -lt "$job_count"
+	do
+		wait
+		job_nr=$(($job_nr + 1))
+	done
+
+	if test -f "$stressfail" && test "$(cat "$stressfail")" != "aborted"
+	then
+		echo "Log(s) of failed test run(s) can be found in:"
+		for f in $(cat "$stressfail")
+		do
+			echo "  $TEST_RESULTS_BASE.stress-$f.out"
+		done
+	fi
+	exit
+	;;
+esac
 
 # if --tee was passed, write the output not only to the terminal, but
 # additionally to the file test-results/$BASENAME.out, too.
@@ -80,7 +153,7 @@ case "$GIT_TEST_TEE_STARTED, $* " in
 done,*)
 	# do not redirect again
 	;;
-*' --tee '*|*' --va'*|*' -V '*|*' --verbose-log '*)
+*' --tee '*|*' --va'*|*' -V '*|*' --verbose-log '*|*' --stress '*|*' '--stress=*' '*)
 	mkdir -p "$(dirname "$TEST_RESULTS_BASE")"
 
 	# Make this filename available to the sub-process in case it is using
@@ -341,6 +414,9 @@ do
 	-V|--verbose-log)
 		verbose_log=t
 		shift ;;
+	--stress|--stress=*)
+		verbose_log=t
+		shift ;;
 	*)
 		echo "error: unknown test option '$1'" >&2; exit 1 ;;
 	esac
@@ -1028,7 +1104,7 @@ then
 fi
 
 # Test repository
-TRASH_DIRECTORY="trash directory.$TEST_NAME"
+TRASH_DIRECTORY="trash directory.$TEST_NAME$TEST_STRESS_SFX"
 test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
 case "$TRASH_DIRECTORY" in
 /*) ;; # absolute path is good
-- 
2.20.0.rc2.156.g5a9fd2ce9c

