git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Eric Sunshine via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>, "Elijah Newren" <newren@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Fabian Stelzer" <fs@gigacodes.de>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>
Subject: [PATCH 01/18] t: add skeleton chainlint.pl
Date: Thu, 01 Sep 2022 00:29:39 +0000	[thread overview]
Message-ID: <3423df94bd6035640828a2508968cf8e1f5b4dda.1661992197.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1322.git.git.1661992197.gitgitgadget@gmail.com>

From: Eric Sunshine <sunshine@sunshineco.com>

Although chainlint.sed usefully identifies broken &&-chains in tests, it
has several shortcomings which include:

  * only detects &&-chain breakage in subshells (one-level deep)

  * does not check for broken top-level &&-chains; that task is left to
    the "magic exit code 117" checker built into test-lib.sh, however,
    that detection does not extend to `{...}` blocks, `$(...)`
    expressions, or compound statements such as `if...fi`,
    `while...done`, `case...esac`

  * uses heuristics, which makes it (potentially) fallible and difficult
    to tweak to handle additional real-world cases

  * written in `sed` and employs advanced `sed` operators which are
    probably not well-known to many programmers, thus the pool of people
    who can maintain it is likely small

  * manually simulates recursion into subshells which makes it much more
    difficult to reason about than, say, a traditional top-down parser

  * checks each test as the test is run, which can get expensive for
    tests which are run repeatedly by functions or loops since their
    bodies will be checked over and over (tens or hundreds of times)
    unnecessarily

To address these shortcomings, begin implementing a more functional and
precise test linter which understands shell syntax and semantics rather
than employing heuristics, thus is able to recognize structural problems
with tests beyond broken &&-chains.

The new linter is written in Perl, thus should be more accessible to a
wider audience, and is structured as a traditional top-down parser which
makes it much easier to reason about, and allows it to inspect compound
statements within test bodies to any depth.

Furthermore, it can check all test definitions in the entire project in
a single invocation rather than having to be invoked once per test, and
each test definition is checked only once no matter how many times the
test is actually run.

At this stage, the new linter is just a skeleton containing boilerplate
which handles command-line options, collects and reports statistics, and
feeds its arguments -- paths of test scripts -- to a (presently)
do-nothing script parser for validation. Subsequent changes will flesh
out the functionality.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 115 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100755 t/chainlint.pl

diff --git a/t/chainlint.pl b/t/chainlint.pl
new file mode 100755
index 00000000000..e8ab95c7858
--- /dev/null
+++ b/t/chainlint.pl
@@ -0,0 +1,115 @@
+#!/usr/bin/env perl
+#
+# Copyright (c) 2021-2022 Eric Sunshine <sunshine@sunshineco.com>
+#
+# This tool scans shell scripts for test definitions and checks those tests for
+# problems, such as broken &&-chains, which might hide bugs in the tests
+# themselves or in behaviors being exercised by the tests.
+#
+# Input arguments are pathnames of shell scripts containing test definitions,
+# or globs referencing a collection of scripts. For each problem discovered,
+# the pathname of the script containing the test is printed along with the test
+# name and the test body with a `?!FOO?!` annotation at the location of each
+# detected problem, where "FOO" is a tag such as "AMP" which indicates a broken
+# &&-chain. Returns zero if no problems are discovered, otherwise non-zero.
+
+use warnings;
+use strict;
+use File::Glob;
+use Getopt::Long;
+
+my $show_stats;
+my $emit_all;
+
+package ScriptParser;
+
+sub new {
+	my $class = shift @_;
+	my $self = bless {} => $class;
+	$self->{output} = [];
+	$self->{ntests} = 0;
+	return $self;
+}
+
+sub parse_cmd {
+	return undef;
+}
+
+# main contains high-level functionality for processing command-line switches,
+# feeding input test scripts to ScriptParser, and reporting results.
+package main;
+
+my $getnow = sub { return time(); };
+my $interval = sub { return time() - shift; };
+if (eval {require Time::HiRes; Time::HiRes->import(); 1;}) {
+	$getnow = sub { return [Time::HiRes::gettimeofday()]; };
+	$interval = sub { return Time::HiRes::tv_interval(shift); };
+}
+
+sub show_stats {
+	my ($start_time, $stats) = @_;
+	my $walltime = $interval->($start_time);
+	my ($usertime) = times();
+	my ($total_workers, $total_scripts, $total_tests, $total_errs) = (0, 0, 0, 0);
+	for (@$stats) {
+		my ($worker, $nscripts, $ntests, $nerrs) = @$_;
+		print(STDERR "worker $worker: $nscripts scripts, $ntests tests, $nerrs errors\n");
+		$total_workers++;
+		$total_scripts += $nscripts;
+		$total_tests += $ntests;
+		$total_errs += $nerrs;
+	}
+	printf(STDERR "total: %d workers, %d scripts, %d tests, %d errors, %.2fs/%.2fs (wall/user)\n", $total_workers, $total_scripts, $total_tests, $total_errs, $walltime, $usertime);
+}
+
+sub check_script {
+	my ($id, $next_script, $emit) = @_;
+	my ($nscripts, $ntests, $nerrs) = (0, 0, 0);
+	while (my $path = $next_script->()) {
+		$nscripts++;
+		my $fh;
+		unless (open($fh, "<", $path)) {
+			$emit->("?!ERR?! $path: $!\n");
+			next;
+		}
+		my $s = do { local $/; <$fh> };
+		close($fh);
+		my $parser = ScriptParser->new(\$s);
+		1 while $parser->parse_cmd();
+		if (@{$parser->{output}}) {
+			my $s = join('', @{$parser->{output}});
+			$emit->("# chainlint: $path\n" . $s);
+			$nerrs += () = $s =~ /\?![^?]+\?!/g;
+		}
+		$ntests += $parser->{ntests};
+	}
+	return [$id, $nscripts, $ntests, $nerrs];
+}
+
+sub exit_code {
+	my $stats = shift @_;
+	for (@$stats) {
+		my ($worker, $nscripts, $ntests, $nerrs) = @$_;
+		return 1 if $nerrs;
+	}
+	return 0;
+}
+
+Getopt::Long::Configure(qw{bundling});
+GetOptions(
+	"emit-all!" => \$emit_all,
+	"stats|show-stats!" => \$show_stats) or die("option error\n");
+
+my $start_time = $getnow->();
+my @stats;
+
+my @scripts;
+push(@scripts, File::Glob::bsd_glob($_)) for (@ARGV);
+unless (@scripts) {
+	show_stats($start_time, \@stats) if $show_stats;
+	exit;
+}
+
+push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
+show_stats($start_time, \@stats) if $show_stats;
+exit(exit_code(\@stats));
-- 
gitgitgadget


  reply	other threads:[~2022-09-01  0:30 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` Eric Sunshine via GitGitGadget [this message]
2022-09-01 12:27   ` [PATCH 01/18] t: add skeleton chainlint.pl Ævar Arnfjörð Bjarmason
2022-09-02 18:53     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer Eric Sunshine via GitGitGadget
2022-09-01 12:32   ` Ævar Arnfjörð Bjarmason
2022-09-03  6:00     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 03/18] chainlint.pl: add POSIX shell parser Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 04/18] chainlint.pl: add parser to validate tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 05/18] chainlint.pl: add parser to identify test definitions Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
2022-09-03  7:51     ` Eric Sunshine
2022-09-06 22:35   ` Eric Wong
2022-09-06 22:52     ` Eric Sunshine
2022-09-06 23:26       ` Jeff King
2022-11-21  4:02         ` Eric Sunshine
2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
2022-11-21 14:07             ` Eric Sunshine
2022-11-21 14:18               ` Ævar Arnfjörð Bjarmason
2022-11-21 14:48                 ` Eric Sunshine
2022-11-21 18:04           ` Jeff King
2022-11-21 18:47             ` Eric Sunshine
2022-11-21 18:50               ` Eric Sunshine
2022-11-21 18:52               ` Jeff King
2022-11-21 19:00                 ` Eric Sunshine
2022-11-21 19:28                   ` Jeff King
2022-11-22  0:11                   ` Ævar Arnfjörð Bjarmason
2022-09-01  0:29 ` [PATCH 07/18] chainlint.pl: don't require `return|exit|continue` to end with `&&` Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 08/18] t/Makefile: apply chainlint.pl to existing self-tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 09/18] chainlint.pl: don't require `&` background command to end with `&&` Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 10/18] chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 11/18] chainlint.pl: don't flag broken &&-chain if failure indicated explicitly Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 12/18] chainlint.pl: complain about loops lacking explicit failure handling Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 13/18] chainlint.pl: allow `|| echo` to signal failure upstream of a pipe Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 14/18] t/chainlint: add more chainlint.pl self-tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 15/18] test-lib: retire "lint harder" optimization hack Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl Eric Sunshine via GitGitGadget
2022-09-03  5:07   ` Elijah Newren
2022-09-03  5:24     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 17/18] t/Makefile: teach `make test` and `make prove` to run chainlint.pl Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 18/18] t: retire unused chainlint.sed Eric Sunshine via GitGitGadget
2022-09-02 12:42   ` several messages Johannes Schindelin
2022-09-02 18:16     ` Eric Sunshine
2022-09-02 18:34       ` Jeff King
2022-09-02 18:44         ` Junio C Hamano
2022-09-11  5:28 ` [PATCH 00/18] make test "linting" more comprehensive Jeff King
2022-09-11  7:01   ` Eric Sunshine
2022-09-11 18:31     ` Jeff King
2022-09-12 23:17       ` Eric Sunshine
2022-09-13  0:04         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3423df94bd6035640828a2508968cf8e1f5b4dda.1661992197.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=fs@gigacodes.de \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).