user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 6/6] t/run.perl: to avoid repeated process spawning for *.t
Date: Wed, 18 Dec 2019 03:36:45 +0000	[thread overview]
Message-ID: <20191218033645.5037-7-e@80x24.org> (raw)
In-Reply-To: <20191218033645.5037-1-e@80x24.org>

Spawning a new Perl interpreter for every test case
means Perl has to reparse and recompile every single file
it needs, costing us performance and development time.

Now that we've modified our code to avoid global state,
we can preload everything we need.

The new "check-run" test target is now 20-30% faster
than the original "check" target.
---
 .gitignore                    |   1 +
 MANIFEST                      |   1 +
 Makefile.PL                   |  12 ++-
 lib/PublicInbox/TestCommon.pm |  12 ++-
 t/nntpd.t                     |   3 +
 t/run.perl                    | 182 ++++++++++++++++++++++++++++++++++
 6 files changed, 205 insertions(+), 6 deletions(-)
 create mode 100755 t/run.perl

diff --git a/.gitignore b/.gitignore
index bdb8cf15..7f4142ba 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,3 +19,4 @@
 /NEWS.html
 /NEWS.atom
 /NEWS
+*.log
diff --git a/MANIFEST b/MANIFEST
index 6bff79ad..997b6e88 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -265,6 +265,7 @@ t/purge.t
 t/qspawn.t
 t/replace.t
 t/reply.t
+t/run.perl
 t/search-thr-index.t
 t/search.t
 t/sigfd.t
diff --git a/Makefile.PL b/Makefile.PL
index 33688095..70eb94c1 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -168,8 +168,16 @@ dsyn :: \$(addsuffix .syntax, \$(filter \$(changed), \$(syn_files)))
 check-manifest :: MANIFEST
 	if git ls-files >\$?.gen 2>&1; then diff -u \$? \$?.gen; fi
 
-check:: pure_all check-manifest
-	\$(EATMYDATA) \$(PROVE) -bvw -j\$(N)
+# the traditional way running per-*.t processes:
+check-each :: pure_all check-manifest
+	\$(EATMYDATA) \$(PROVE) --state=save -bvw -j\$(N)
+
+# lightly-tested way to runn tests, relies "--state=save" in check-each
+# for best performance
+check-run :: pure_all check-manifest
+	\$(EATMYDATA) \$(PROVE) -bvw t/run.perl :: -j\$(N)
+
+check :: check-each
 
 lib/PublicInbox/UserContent.pm :: contrib/css/216dark.css
 	\$(PERL) -I lib \$@ \$?
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 2804b5b6..d6ed0876 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -60,7 +60,7 @@ sub require_git ($;$) {
 
 sub key2script ($) {
 	my ($key) = @_;
-	return $key if $key =~ m!\A/!;
+	return $key if (index($key, '/') >= 0);
 	# n.b. we may have scripts which don't start with "public-inbox" in
 	# the future:
 	$key =~ s/\A([-\.])/public-inbox$1/;
@@ -101,9 +101,11 @@ sub key2sub ($) {
 		my $f = key2script($key);
 		open my $fh, '<', $f or die "open $f: $!";
 		my $str = do { local $/; <$fh> };
-		my ($fc, $rest) = ($key =~ m/([a-z])([a-z0-9]+)\z/);
-		$fc = uc($fc);
-		my $pkg = "PublicInbox::TestScript::$fc$rest";
+		my $pkg = (split(m!/!, $f))[-1];
+		$pkg =~ s/([a-z])([a-z0-9]+)(\.t)?\z/\U$1\E$2/;
+		$pkg .= "_T" if $3;
+		$pkg =~ tr/-.//d;
+		$pkg = "PublicInbox::TestScript::$pkg";
 		eval <<EOF;
 package $pkg;
 use strict;
@@ -111,6 +113,8 @@ use subs qw(exit);
 
 *exit = *PublicInbox::TestCommon::run_script_exit;
 sub main {
+# the below "line" directive is a magic comment, see perlsyn(1) manpage
+# line 1 "$f"
 $str
 	0;
 }
diff --git a/t/nntpd.t b/t/nntpd.t
index 30f3fb9c..c3712b67 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -302,6 +302,9 @@ Date: Fri, 02 Oct 1993 00:00:00 +0000
 		is($? >> 8, 0, 'no errors');
 	}
 	SKIP: {
+		if ($INC{'Search/Xapian.pm'} && ($ENV{TEST_RUN_MODE}//1)) {
+			skip 'Search/Xapian.pm pre-loaded (by t/run.perl?)', 1;
+		}
 		my @of = `lsof -p $td->{pid} 2>/dev/null`;
 		skip('lsof broken', 1) if (!scalar(@of) || $?);
 		my @xap = grep m!Search/Xapian!, @of;
diff --git a/t/run.perl b/t/run.perl
new file mode 100755
index 00000000..15d1f7e0
--- /dev/null
+++ b/t/run.perl
@@ -0,0 +1,182 @@
+#!/usr/bin/perl -w
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+#
+# Parallel test runner which preloads code and reuses worker processes
+# to give a nice speedup over prove(1).  It also generates per-test
+# .log files (similar to automake tests).
+#
+# *.t files run by this shouldnot rely on global state.
+#
+# Usage: $PERL -I lib -w t/run.perl -j4
+# Or via prove(1): prove -lvw t/run.perl :: -j4
+use strict;
+use PublicInbox::TestCommon;
+use Cwd qw(getcwd);
+use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
+use Errno qw(EINTR);
+use POSIX qw(_POSIX_PIPE_BUF WNOHANG);
+my $jobs = 1;
+my $repeat = 1;
+$| = 1;
+our $log_suffix = '.log';
+my ($shuffle, %pids, @err);
+GetOptions('j|jobs=i' => \$jobs,
+	'repeat=i' => \$repeat,
+	'log=s' => \$log_suffix,
+	's|shuffle' => \$shuffle,
+) or die "Usage: $0 [-j JOBS] [--log=SUFFIX] [--repeat RUNS]";
+if (($ENV{TEST_RUN_MODE} // 1) == 0) {
+	die "$0 is not compatible with TEST_RUN_MODE=0\n";
+}
+my @tests = scalar(@ARGV) ? @ARGV : glob('t/*.t');
+my $cwd = getcwd();
+open OLDOUT, '>&STDOUT' or die "dup STDOUT: $!";
+open OLDERR, '>&STDERR' or die "dup STDERR: $!";
+OLDOUT->autoflush(1);
+OLDERR->autoflush(1);
+
+key2sub($_) for @tests; # precache
+
+if ($shuffle) {
+	require List::Util;
+} elsif (open(my $prove_state, '<', '.prove') && eval { require YAML::XS }) {
+	# reuse "prove --state=save" data to start slowest tests, first
+	my $state = YAML::XS::Load(do { local $/; <$prove_state> });
+	my $t = $state->{tests};
+	@tests = sort {
+		($t->{$b}->{elapsed} // 0) <=> ($t->{$a}->{elapsed} // 0)
+	} @tests;
+}
+
+our $tb = Test::More->builder;
+
+sub DIE (;$) {
+	print OLDERR @_;
+	exit(1);
+}
+
+our ($worker, $worker_test);
+
+sub test_status () {
+	$? = 255 if $? == 0 && !$tb->is_passing;
+	my $status = $? ? 'not ok' : 'ok';
+	print OLDOUT "$status $worker_test\n" if $log_suffix ne '';
+}
+
+# Test::Builder or Test2::Hub may call exit() from plan(skip_all => ...)
+END { test_status() if (defined($worker_test) && $worker == $$) }
+
+sub run_test ($) {
+	my ($test) = @_;
+	my $log_fh;
+	if ($log_suffix ne '') {
+		my $log = $test;
+		$log =~ s/\.[^\.]+\z/$log_suffix/ or DIE "can't log for $test";
+		open $log_fh, '>', $log or DIE "open $log: $!";
+		$log_fh->autoflush(1);
+		$tb->output($log_fh);
+		$tb->failure_output($log_fh);
+		$tb->todo_output($log_fh);
+		open STDOUT, '>&', $log_fh or DIE "1>$log: $!";
+		open STDERR, '>&', $log_fh or DIE "2>$log: $!";
+	}
+	$worker_test = $test;
+	run_script([$test]);
+	test_status();
+	$worker_test = undef;
+	push @err, "$test ($?)" if $?;
+}
+
+sub UINT_SIZE () { 4 }
+
+# worker processes will SIGUSR1 the producer process when it
+# sees EOF on the pipe.  On FreeBSD 11.2 and Perl 5.30.0,
+# sys/ioctl.ph gives the wrong value for FIONREAD().
+my $producer = $$;
+my $eof; # we stop respawning if true
+
+my $start_worker = sub {
+	my ($i, $j, $rd, $todo) = @_;
+	defined(my $pid = fork) or DIE "fork: $!";
+	if ($pid == 0) {
+		$worker = $$;
+		while (1) {
+			my $r = sysread($rd, my $buf, UINT_SIZE);
+			if (!defined($r)) {
+				next if $! == EINTR;
+				DIE "sysread: $!";
+			}
+			last if $r == 0;
+			DIE "short read $r" if $r != UINT_SIZE;
+			my $t = unpack('I', $buf);
+			run_test($todo->[$t]);
+			$tb->reset;
+			chdir($cwd) or DIE "chdir: $!";
+		}
+		kill 'USR1', $producer if !$eof; # sets $eof in $producer
+		DIE join('', map { "E: $_\n" } @err) if @err;
+		exit(0);
+	} else {
+		$pids{$pid} = $j;
+	}
+};
+
+# negative $repeat means loop forever:
+for (my $i = $repeat; $i != 0; $i--) {
+	my @todo = $shuffle ? List::Util::shuffle(@tests) : @tests;
+
+	# single-producer, multi-consumer queue relying on POSIX semantics
+	pipe(my ($rd, $wr)) or DIE "pipe: $!";
+
+	# fill the queue before forking so children can start earlier
+	my $n = (_POSIX_PIPE_BUF / UINT_SIZE);
+	if ($n >= $#todo) {
+		print $wr join('', map { pack('I', $_) } (0..$#todo)) or DIE;
+		close $wr or die;
+		$wr = undef;
+	} else { # write what we can...
+		$wr->autoflush(1);
+		print $wr join('', map { pack('I', $_) } (0..$n)) or DIE;
+		$n += 1; # and send more ($n..$#todo), later
+	}
+	$eof = undef;
+	local $SIG{USR1} = sub { $eof = 1 };
+	my $sigchld = sub {
+		my ($sig) = @_;
+		my $flags = $sig ? WNOHANG : 0;
+		while (1) {
+			my $pid = waitpid(-1, $flags) or return;
+			return if $pid < 0;
+			my $j = delete $pids{$pid};
+			if (!defined($j)) {
+				push @err, "reaped unknown $pid ($?)";
+				next;
+			}
+			push @err, "job[$j] ($?)" if $?;
+			# skip_all can exit(0), respawn if needed:
+			if (!$eof) {
+				print OLDERR "# respawning job[$j]\n";
+				$start_worker->($i, $j, $rd, \@todo);
+			}
+		}
+	};
+
+	# start the workers to consume the queue
+	for (my $j = 0; $j < $jobs; $j++) {
+		$start_worker->($i, $j, $rd, \@todo);
+	}
+
+	if ($wr) {
+		local $SIG{CHLD} = $sigchld;
+		# too many tests to fit in the pipe before starting workers,
+		# send the rest now the workers are running
+		print $wr join('', map { pack('I', $_) } ($n..$#todo)) or DIE;
+		close $wr or die;
+	}
+
+	$sigchld->(0) while scalar(keys(%pids));
+	DIE join('', map { "E: $_\n" } @err) if @err;
+}
+
+print OLDOUT "1..".($repeat * scalar(@tests))."\n" if $repeat >= 0;

  parent reply	other threads:[~2019-12-18  3:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-18  3:36 [PATCH 0/6] test updates and speedups Eric Wong
2019-12-18  3:36 ` [PATCH 1/6] TODO: add UUCP address item Eric Wong
2019-12-18  3:36 ` [PATCH 2/6] viewvcs: flesh out some functionality and test Eric Wong
2019-12-18  3:36 ` [PATCH 3/6] Makefile.PL: sort target and var lists Eric Wong
2019-12-18  9:05   ` Eric Wong
2019-12-18  3:36 ` [PATCH 4/6] t/*.t: avoid sharing "my" variables in subs Eric Wong
2019-12-18  3:36 ` [PATCH 5/6] tests: move t/common.perl to PublicInbox::TestCommon Eric Wong
2019-12-19  4:02   ` Eric Wong
2019-12-18  3:36 ` Eric Wong [this message]
2019-12-19  4:03   ` [PATCH 6/6] t/run.perl: to avoid repeated process spawning for *.t Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191218033645.5037-7-e@80x24.org \
    --to=e@80x24.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).